MIND-Lab / OCTIS

OCTIS: Comparing Topic Models is Simple! A python package to optimize and evaluate topic models (accepted at EACL2021 demo track)
MIT License

Feature request: Add a preprocessing option to choose a custom text preprocessor. #16

Open espoirMur opened 3 years ago

espoirMur commented 3 years ago

The current preprocessing step is already a good starting point, but it could be improved by adding an option that lets users define their own custom preprocessing pipeline, so they can handle cases the built-in preprocessing does not yet cover.
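To make the request concrete, here is a minimal sketch of what a pluggable pipeline could look like. It is plain Python and does not use the OCTIS API; `TextStep`, `run_pipeline`, and the individual step functions are hypothetical names chosen for illustration only.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical type: a step takes one document string and returns a cleaned string.
TextStep = Callable[[str], str]


def strip_urls(text: str) -> str:
    # Example of a user-defined step: replace http(s) URLs with a space.
    return re.sub(r"https?://\S+", " ", text)


def lowercase(text: str) -> str:
    return text.lower()


def collapse_whitespace(text: str) -> str:
    # Split on any whitespace and re-join with single spaces.
    return " ".join(text.split())


def run_pipeline(docs: Iterable[str], steps: List[TextStep]) -> List[str]:
    # Apply every step, in order, to every document.
    cleaned = []
    for doc in docs:
        for step in steps:
            doc = step(doc)
        cleaned.append(doc)
    return cleaned


docs = ["Check https://example.com for   MORE info", "Topic Models are FUN"]
result = run_pipeline(docs, [strip_urls, lowercase, collapse_whitespace])
print(result)
```

The idea is that the existing preprocessing could accept a list of such callables (or a single callable) and run them alongside, or instead of, its built-in steps.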

drob-xx commented 2 years ago

Do you have more specifics on what you would like here and what the goals are? I took a quick look, and it seems that you provide a pretty good preprocessing module, and that anyone who wanted to do it differently could simply skip your preprocessor entirely and provide an already-processed dataset. Am I missing something?
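As a rough illustration of that workaround, the sketch below runs a stand-in preprocessing function and writes the result to a folder containing a tab-separated corpus and a vocabulary file. The file names (`corpus.tsv`, `vocabulary.txt`), the column layout (document text, then partition), and the helper names `my_preprocess` and `write_dataset` are assumptions made for this example; check the OCTIS documentation for the layout it actually expects.

```python
import csv
import tempfile
from pathlib import Path


def my_preprocess(text):
    # Stand-in for any custom preprocessing: lowercase and keep alphabetic tokens.
    return [tok for tok in text.lower().split() if tok.isalpha()]


def write_dataset(docs, folder):
    # Write preprocessed documents and their vocabulary to `folder`.
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    tokenized = [my_preprocess(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})

    # Assumed layout: one document per row, followed by its partition name.
    with open(folder / "corpus.tsv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        for tokens in tokenized:
            writer.writerow([" ".join(tokens), "train"])

    # Assumed layout: one vocabulary word per line.
    (folder / "vocabulary.txt").write_text("\n".join(vocab) + "\n", encoding="utf-8")
    return vocab


out_dir = Path(tempfile.mkdtemp()) / "my_dataset"
vocab = write_dataset(["Topic models are fun", "Models need clean text 123"], out_dir)
print(vocab)
```

The resulting folder could then be loaded as a custom dataset, bypassing the built-in preprocessing entirely.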