TromboneDavies / PolarOps


Bigrams????? Are we using them???? #79

Open divilian opened 2 years ago

divilian commented 2 years ago

Turns out that we are, in two different places, doing versions of the same thing:

if we have a CountVectorizer, call:

vectorizer.fit_transform(all_threads).toarray()

if we have a Tokenizer, on which we have called .fit_on_texts(), call:

tokenizer.texts_to_matrix(threads, mode=METHOD)

Only the first of these actually cares about useBigrams. (!!)
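
One possible way to close the gap, sketched under assumptions: build the bigrams ourselves before the text reaches the Tokenizer, so both paths see the same kind of features. The helper name `with_bigrams` and its `use_bigrams` parameter are hypothetical (not from the repo), and the whitespace split is only a stand-in for whatever tokenization we actually use. Bigrams are joined with a space, which is how CountVectorizer names its own n-gram features. Whether the Tokenizer will take these pre-tokenized lists as-is should be checked against the Keras version we are on.

```python
def with_bigrams(tokens, use_bigrams=True):
    """Return the unigrams, plus adjacent-word bigrams when use_bigrams is set.

    Hypothetical helper: the name and signature are illustrative only.
    """
    unigrams = list(tokens)
    if not use_bigrams:
        return unigrams
    # Pair each token with its successor: (t0, t1), (t1, t2), ...
    bigrams = [f"{a} {b}" for a, b in zip(unigrams, unigrams[1:])]
    return unigrams + bigrams


# Made-up example threads, split naively on whitespace.
threads = ["the cat sat", "the dog sat down"]
tokenized_threads = [with_bigrams(t.lower().split()) for t in threads]
```

With `use_bigrams=False` the helper returns the unigrams unchanged, so the same call site could serve both settings.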