TutteInstitute / vectorizers

Vectorizers for a range of different data types
BSD 3-Clause "New" or "Revised" License

EMTCV skip_grams #54

Closed by cjweir 3 years ago

cjweir commented 3 years ago

Added skip_grams to EMTCV and moved str_to_bytes to utils. Also, for some reason my run of Black undid some of the changes that Leland had made to a couple of files; not many, but a few.

lmcinnes commented 3 years ago

I'll just merge this for now. I had some questions for Colin about the directionality of the n-grams in skip_grams, but we can deal with those later.
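
For readers unfamiliar with the directionality question, here is a minimal sketch of what it might refer to: whether a skip-gram pair is only emitted in reading order (token, then later context) or in both orders. This is not the code from this PR or the vectorizers API; the function name `skip_gram_pairs` and the parameters `window` and `directed` are hypothetical and chosen only for illustration.

```python
def skip_gram_pairs(tokens, window=2, directed=True):
    """Return (token, context) pairs for tokens at most `window` positions apart.

    Hypothetical illustration only, not the vectorizers implementation.
    directed=True keeps only pairs in reading order (context follows token);
    directed=False additionally emits each pair in reversed order.
    """
    pairs = []
    for i, tok in enumerate(tokens):
        # Look ahead up to `window` positions, stopping at the end of the sequence.
        for j in range(i + 1, min(i + 1 + window, len(tokens))):
            pairs.append((tok, tokens[j]))
            if not directed:
                pairs.append((tokens[j], tok))
    return pairs


forward = skip_gram_pairs(["a", "b", "c"], window=2, directed=True)
both = skip_gram_pairs(["a", "b", "c"], window=2, directed=False)
print(forward)  # [('a', 'b'), ('a', 'c'), ('b', 'c')]
print(len(both))  # 6
```

With `directed=True` the pair `("a", "b")` is distinct from `("b", "a")` and only the former is produced, so co-occurrence counts built from these pairs are asymmetric. With `directed=False` both orders are counted.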