scikit-learn / communication

Issues and content related to communicating about scikit-learn
BSD 3-Clause "New" or "Revised" License

Promote series of tutorials on text analytics #25

Open ArturoAmorQ opened 2 years ago

ArturoAmorQ commented 2 years ago

For Twitter:

Did you know you can do NLP with scikit-learn? Learn the basics of text vectorizers with this mini tutorial (1/3)

https://scikit-learn.org/dev/auto_examples/text/plot_hashing_vs_dict_vectorizer.html


Learn the basics of text document classification with this mini tutorial (2/3)

https://scikit-learn.org/dev/auto_examples/text/plot_document_classification_20newsgroups.html


Learn the basics of text document clustering with this mini tutorial (3/3)

https://scikit-learn.org/dev/auto_examples/text/plot_document_clustering.html

ogrisel commented 2 years ago

Maybe we should not use the term NLP too much. Natural Language Processing typically involves more advanced operations than just document classification, clustering and topic models. In particular, the NLP community is typically more interested in sentence-level analysis (e.g. named entity detection, syntactic parsing, logical entailment...).

To communicate about those examples, I would rather use a term such as "document classification and clustering", which is more specific.

ArturoAmorQ commented 2 years ago

To avoid pointing users to the dev version of the docs, we agreed in a small committee to wait until the next release.