GlobalMaksimum / sadedegel

A General Purpose NLP library for Turkish
http://sadedegel.ai
MIT License

Make Pre-processing options work for PreTrainedVectorizer #307

Open dafajon opened 2 years ago

dafajon commented 2 years ago

Currently, get_pretrained_embeddings and get_bert_embeddings operate on the raw text of the document. As a result, the preprocessing settings are not applied to the text that is passed to the transformer-based vectorizers.
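
To make the gap concrete, here is a minimal, self-contained sketch of the behaviour described above and one possible fix. It does not use sadedegel's API: PreprocessConfig, preprocess, toy_encoder, embed_raw and embed_preprocessed are all hypothetical names made up for illustration, and the "encoder" is a trivial stand-in for a transformer model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PreprocessConfig:
    """Hypothetical preprocessing switch; not sadedegel's actual options."""
    strip_punct: bool = False


def preprocess(text: str, cfg: PreprocessConfig) -> str:
    """Apply the configured preprocessing to a piece of text."""
    if cfg.strip_punct:
        # Keep only letters, digits and whitespace.
        text = "".join(ch for ch in text if ch.isalnum() or ch.isspace())
    return text


def toy_encoder(text: str) -> List[int]:
    """Stand-in for a transformer encoder: one number per whitespace token."""
    return [len(token) for token in text.split()]


def embed_raw(raw: str, cfg: PreprocessConfig,
              encoder: Callable[[str], List[int]] = toy_encoder) -> List[int]:
    """Behaviour described in the issue: cfg is ignored, raw text is encoded."""
    return encoder(raw)


def embed_preprocessed(raw: str, cfg: PreprocessConfig,
                       encoder: Callable[[str], List[int]] = toy_encoder) -> List[int]:
    """Possible fix: run the configured preprocessing before encoding."""
    return encoder(preprocess(raw, cfg))


if __name__ == "__main__":
    cfg = PreprocessConfig(strip_punct=True)
    text = "Merhaba, dünya!"
    print(embed_raw(text, cfg))           # punctuation still reaches the encoder
    print(embed_preprocessed(text, cfg))  # punctuation removed first
```

With strip_punct enabled, the two functions hand different strings to the encoder, which is the mismatch this issue asks to remove.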