Open PrimozGodec opened 3 years ago
Should we disable saving a corpus to CSV, TAB, ... and allow only .pkl, as is done for sparse data? Users are confused when they save a corpus to CSV and then discover that the preprocessing is not stored together with the corpus.
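To illustrate the confusion, here is a minimal sketch using only the Python standard library. The `toy_corpus` dict and both helper functions are hypothetical stand-ins, not the Orange `Corpus` class or its writers. The assumption is that a CSV export carries only the tabular part (the documents), while a pickle keeps the whole object, tokens included.

```python
import csv
import io
import pickle

# Hypothetical stand-in for a preprocessed corpus (NOT the Orange Corpus class):
# the documents are the tabular part, the tokens are the preprocessing result.
toy_corpus = {
    "documents": ["Hello World", "Save me as CSV"],
    "tokens": [["hello", "world"], ["save", "me", "as", "csv"]],
}


def roundtrip_csv(corpus):
    """Write only the documents to CSV and read them back."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    for doc in corpus["documents"]:
        writer.writerow([doc])
    buffer.seek(0)
    documents = [row[0] for row in csv.reader(buffer)]
    # Nothing in the CSV describes the tokens, so they cannot be restored.
    return {"documents": documents, "tokens": None}


def roundtrip_pickle(corpus):
    """Serialize the whole object with pickle and load it back."""
    return pickle.loads(pickle.dumps(corpus))


from_csv = roundtrip_csv(toy_corpus)
from_pickle = roundtrip_pickle(toy_corpus)
print(from_csv["tokens"])
print(from_pickle["tokens"])
```

In this sketch the documents survive both round trips, but only the pickled copy still has its tokens.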
This would disable saving the downloaded corpus from Twitter, Wikipedia and other similar widgets to csv. Not in favour of removing.
While I agree it is slightly confusing, I think it is common practice (in NLTK for example) to have a separate tokens object. I'd rather give a warning or describe this better in the docs.
Agree with you @ajdapretnar, I would definitely add a warning to the widget.
An idea: if a Corpus is on the input of Save Data, the widget raises a warning saying "To keep preprocessing save as pickle (.pckls)". This should be implemented in orange3.
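A rough sketch of the check behind such a warning, in plain Python. Everything here is an assumption for illustration: `preprocessing_warning`, the `is_corpus` flag, and the `PICKLE_EXTENSIONS` set are hypothetical names, not the Save Data widget's actual code, and the set of extensions that count as pickle would have to match what orange3 really supports.

```python
import os

# Hypothetical: extensions assumed to be written as pickle.
PICKLE_EXTENSIONS = {".pkl", ".pickle"}


def preprocessing_warning(is_corpus, filename):
    """Return a warning message if saving would drop preprocessing, else None.

    is_corpus: whether the data on the input is a Corpus (assumed flag).
    filename: the target file name chosen by the user.
    """
    extension = os.path.splitext(filename)[1].lower()
    if is_corpus and extension not in PICKLE_EXTENSIONS:
        return "To keep preprocessing, save as pickle."
    return None


# A corpus saved to CSV triggers the warning; the other cases do not.
print(preprocessing_warning(True, "tweets.csv"))
print(preprocessing_warning(True, "tweets.pkl"))
print(preprocessing_warning(False, "iris.csv"))
```

Keeping the check as a pure function of the input type and file name would make it easy to call whenever either one changes in the widget.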
@Rezabagheriloye reported in https://github.com/biolab/orange3/issues/5035:
I wanted to suggest the solution with pickling the Corpus and discovered two new issues: