Closed dvsrepo closed 3 years ago
Hi @dvsrepo ,
In order to participate in hacktoberfest21, can I take this one? Thanks
Of course, feel free to ask questions! I've just assigned it to you
Hi @sakares ,
In this issue, we can focus on TextClassification. The main idea would be:
import rubrix as rb
# 1. Load the dataset from Rubrix (for testing purposes you can log a dataset from Hugging Face see example below)
train_dataset = rb.load("my_dataset")
# we might do some train-test splitting first to create validation and test sets
# 2. Transform dataset(s) and save as csv
# train dataset is a Pandas dataframe: we might need to transform it into something readable by flair's CSVClassificationCorpus and then save it to csv
# transformed_train_dataset = apply some post-processing
transformed_train_dataset.to_csv('train.csv')
# 3. Read the csv with CSVClassificationCorpus
from flair.datasets import CSVClassificationCorpus
corpus = CSVClassificationCorpus  # to be instantiated with the data folder and a column name map
# 4. from here you should be able to follow https://github.com/flairNLP/flair/blob/master/resources/docs/TUTORIAL_7_TRAINING_A_MODEL.md#training-a-text-classification-model
For testing purposes, you can log a text classification dataset from Hugging Face, see some examples here: https://rubrix.readthedocs.io/en/stable/tutorials/01-huggingface.html#Text-classification-with-the-tweet_eval-dataset-(Emoji-classification)
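Step 2 above (the post-processing before saving to csv) could look roughly like the sketch below. It is only an illustration under some assumptions: that each record keeps its text under inputs["text"] and a single label string under annotation, and that the records are available as plain dictionaries (for example via train_dataset.to_dict("records")). The helper names to_rows, split_rows and write_csv are hypothetical, not part of Rubrix or flair.

```python
import csv
import io
import random


def to_rows(records, text_key="text"):
    """Turn Rubrix-like records into (text, label) rows, skipping unannotated ones."""
    rows = []
    for record in records:
        label = record.get("annotation")
        if label is None:  # no annotation yet, nothing to train on
            continue
        rows.append((record["inputs"][text_key], label))
    return rows


def split_rows(rows, dev_fraction=0.25, seed=42):
    """Shuffle reproducibly and hold out a fraction of the rows as a dev set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_fraction)
    return shuffled[n_dev:], shuffled[:n_dev]


def write_csv(rows, handle):
    """Write rows as a header-less csv: text in column 0, label in column 1."""
    csv.writer(handle).writerows(rows)


# Hypothetical records, shaped the way the assumptions above describe.
records = [
    {"inputs": {"text": "I love this"}, "annotation": "positive"},
    {"inputs": {"text": "Not sure yet"}, "annotation": None},
    {"inputs": {"text": "Terrible service"}, "annotation": "negative"},
    {"inputs": {"text": "Works as expected"}, "annotation": "positive"},
    {"inputs": {"text": "Would not recommend"}, "annotation": "negative"},
]

rows = to_rows(records)
train_rows, dev_rows = split_rows(rows)

# An in-memory buffer keeps the sketch side-effect free;
# swap it for open("train.csv", "w", newline="") to write a real file.
buffer = io.StringIO()
write_csv(train_rows, buffer)
```

With this layout (no header, no index column), a column name map along the lines of {0: "text", 1: "label"} should describe the file to CSVClassificationCorpus, but please check that against the flair version you are using.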
Hi @dvsrepo
Thanks for pointing me to what I was looking for.
I have just successfully reproduced the flair tutorial at https://github.com/flairNLP/flair/blob/master/resources/docs/TUTORIAL_7_TRAINING_A_MODEL.md#training-a-text-classification-model with GloVe DocumentPoolEmbeddings (just for a quick experiment on my local machine without a GPU), and I am about to look into how to convert a rubrix dataset into the flair.datasets format.
Also, I can already run docker-compose up for the rubrix annotation system and play around with it.
I hope to open a PR in a few days.
Thanks so much @sakares, don't hesitate to ask here or open issues to report what you might find along the way.
About the conversion to flair.datasets format, what I mentioned above (export to csv) is kind of a hack. Ideally it would be cool to create a Flair Dataset directly from Python (pandas, or dictionaries), but I could not find anything after a quick exploration of flair's code.
Also, I forgot to mention that if you are working on a Jupyter notebook and would like to share the results as a tutorial that's also cool and I could open a new issue so you could contribute that as a by-product too. We are going to include the authors for each tutorial so we could include your name, links, etc.
Hi @dvsrepo
I am new to Sphinx, and after running the command cd docs; make html
it threw the following error:
Exception occurred:
File "/opt/homebrew/lib/python3.9/site-packages/nbconvert/exporters/templateexporter.py", line 607, in get_template_names
raise ValueError('No template sub-directory with name %r found in the following paths:\n\t%s' % (base_template, paths))
ValueError: No template sub-directory with name 'rst' found in the following paths:
/Users/sakares/Library/Jupyter
/opt/homebrew/opt/python@3.9/Frameworks/Python.framework/Versions/3.9/share/jupyter
/usr/local/share/jupyter
/usr/share/jupyter
The full traceback has been saved in /var/folders/hq/4yhdf6b93rq908q9tt9w07fh0000gn/T/sphinx-err-67_4f1bz.log, if you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error message can be provided next time.
A bug report can be filed in the tracker at <https://github.com/sphinx-doc/sphinx/issues>. Thanks!
make: *** [html] Error 2
Do you have any ideas about this? Thanks!
Hi @sakares ,
Are you running this from the terminal inside the rubrix folder?
Yes, in the rubrix/docs path
Maybe @dcfidalgo can help
Hm, maybe a long shot, but could you try running pip install nbconvert==5.6.1 to see if this fixes your issue? This follows https://stackoverflow.com/questions/62431121/nbconvert-valueerror-no-template-sub-directory-with-name-rst-found-in-the-fo
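If it helps with debugging, the error message lists the directories that were searched for an 'rst' template. A small sketch like the one below can check which of them actually contain one. It assumes that newer nbconvert releases look for templates under nbconvert/templates/<name> inside each listed directory; that layout is my understanding and not something confirmed in this thread, and find_template_dirs is a hypothetical helper.

```python
from pathlib import Path
import tempfile


def find_template_dirs(search_paths, template_name="rst"):
    """Return the template directories found under the given search paths."""
    found = []
    for base in search_paths:
        # Assumed layout: <search path>/nbconvert/templates/<template name>
        candidate = Path(base) / "nbconvert" / "templates" / template_name
        if candidate.is_dir():
            found.append(candidate)
    return found


# Self-contained demo on throwaway directories instead of real Jupyter paths.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good"
    bad = Path(tmp) / "bad"
    (good / "nbconvert" / "templates" / "rst").mkdir(parents=True)
    bad.mkdir()
    hits = find_template_dirs([bad, good])
    hit_names = [hit.relative_to(tmp).parts[0] for hit in hits]
```

On a real machine you would pass the paths printed in the traceback; an empty result would be consistent with the ValueError above.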
Thanks, it works! 🙂 I think I can probably open a PR by this weekend
Current works:
- rb.log and rb.load from rubrix datasource
- TextClassifier in Flair from rb.load and saved csv file
To do:
- text-classification docs
- token-classification docs
Ongoing PR: #442
This is awesome @sakares !! Thank you! Have a nice weekend
Just finished #442, please feel free to review/comment if I missed something, @dvsrepo. Thanks!
Implemented in #442
Similar to the training example for Hugging Face: https://rubrix.readthedocs.io/en/stable/guides/cookbook.html#Training