flairNLP / flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)
https://flairnlp.github.io/flair/

Fine-tuning t5-base model raises an error #1661

Closed krzysztoffiok closed 1 year ago

krzysztoffiok commented 4 years ago

Hi,

I tried to fine-tune the T5-base model on Google Colab and got this error:

ValueError("You have to specify either decoder_input_ids or decoder_inputs_embeds")

More specifically, the error occurs at the very moment training should start:

2020-06-03 12:07:25,877 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,877 Corpus: "Corpus: 4800 train + 1200 dev + 20630 test sentences"
2020-06-03 12:07:25,878 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,878 Parameters:
2020-06-03 12:07:25,878 - learning_rate: "3e-06"
2020-06-03 12:07:25,879 - mini_batch_size: "8"
2020-06-03 12:07:25,879 - patience: "3"
2020-06-03 12:07:25,879 - anneal_factor: "0.5"
2020-06-03 12:07:25,880 - max_epochs: "4"
2020-06-03 12:07:25,880 - shuffle: "True"
2020-06-03 12:07:25,880 - train_with_dev: "False"
2020-06-03 12:07:25,880 - batch_growth_annealing: "False"
2020-06-03 12:07:25,880 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,880 Model training base path: "semeval_data/model_sentiment_0"
2020-06-03 12:07:25,880 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,880 Device: cuda:0
2020-06-03 12:07:25,881 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,881 Embeddings storage mode: cpu
2020-06-03 12:07:25,883 ----------------------------------------------------------------------------------------------------
Traceback (most recent call last):
  File "./model_train.py", line 138, in <module>
    shuffle=True,
  File "/usr/local/lib/python3.6/dist-packages/flair/trainers/trainer.py", line 349, in train
    loss = self.model.forward_loss(batch_step)
  File "/usr/local/lib/python3.6/dist-packages/flair/models/text_classification_model.py", line 142, in forward_loss
    scores = self.forward(data_points)
  File "/usr/local/lib/python3.6/dist-packages/flair/models/text_classification_model.py", line 98, in forward
    self.document_embeddings.embed(sentences)
  File "/usr/local/lib/python3.6/dist-packages/flair/embeddings/base.py", line 59, in embed
    self._add_embeddings_internal(sentences)
  File "/usr/local/lib/python3.6/dist-packages/flair/embeddings/document.py", line 91, in _add_embeddings_internal
    self._add_embeddings_to_sentences(batch)
  File "/usr/local/lib/python3.6/dist-packages/flair/embeddings/document.py", line 136, in _add_embeddings_to_sentences
    else self.model(input_ids)[-1]
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_t5.py", line 955, in forward
    use_cache=use_cache,
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_t5.py", line 674, in forward
    raise ValueError("You have to specify either decoder_input_ids or decoder_inputs_embeds")
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds
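To see where the call chain leaves Flair and enters transformers, the frame headers can be pulled out of a pasted traceback. The following is only a minimal sketch using a regular expression; the helper names `parse_frames` and `top_level_package` are hypothetical, and the `dist-packages/` marker is an assumption that matches the Colab paths in this log.

```python
import re

# One frame header of a CPython traceback: File "<path>", line <n>, in <name>
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def parse_frames(traceback_text):
    """Return (path, line_number, function_name) per frame, outermost first."""
    return [
        (m.group("path"), int(m.group("line")), m.group("func"))
        for m in FRAME_RE.finditer(traceback_text)
    ]


def top_level_package(path):
    """Best-effort guess of the installed package a frame belongs to.

    Assumes the 'dist-packages/' layout seen in this log; returns None otherwise.
    """
    marker = "dist-packages/"
    if marker in path:
        return path.split(marker, 1)[1].split("/", 1)[0]
    return None


# Two frames copied from the traceback in this issue.
sample = (
    'File "/usr/local/lib/python3.6/dist-packages/flair/embeddings/document.py", '
    'line 136, in _add_embeddings_to_sentences\n'
    'File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_t5.py", '
    'line 674, in forward\n'
)

frames = parse_frames(sample)
innermost = frames[-1]
print(top_level_package(innermost[0]), innermost[1], innermost[2])
# prints: transformers 674 forward
```

The innermost frame is the one that raised, so here the ValueError comes from transformers' T5 code after Flair called the model with `input_ids` only.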

To Reproduce
Go to Google Colab, create a new project with a GPU runtime, and run the following:

!git clone https://github.com/krzysztoffiok/twitter_sentiment
!pip3 install flair
!pip3 install datatable

cd twitter_sentiment

!python3 ./semeval_data_splitter.py
!python3 ./model_train.py --dataset=semeval --k_folds=5 --test_run=t5-base --fine_tune

Expected behavior
The script should start training (fine-tuning) a list of models, the first of which is t5-base.

Environment (please complete the following information):
Google Colab GPU runtime

!nvidia-smi
Wed Jun 3 12:11:08 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P8    26W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

!pip3 freeze returns:

absl-py==0.9.0 alabaster==0.7.12 albumentations==0.1.12 altair==4.1.0 asgiref==3.2.7 astor==0.8.1 astropy==4.0.1.post1 astunparse==1.6.3 atari-py==0.2.6 atomicwrites==1.4.0 attrs==19.3.0 audioread==2.1.8 autograd==1.3 Babel==2.8.0 backcall==0.1.0 beautifulsoup4==4.6.3 bleach==3.1.5 blessed==1.17.6 blis==0.4.1 bokeh==1.4.0 boto==2.49.0 boto3==1.13.19 botocore==1.16.19 Bottleneck==1.3.2 bpemb==0.3.0 branca==0.4.1 bs4==0.0.1 CacheControl==0.12.6 cachetools==3.1.1 catalogue==1.0.0 certifi==2020.4.5.1 cffi==1.14.0 chainer==6.5.0 chardet==3.0.4 click==7.1.2 cloudpickle==1.3.0 cmake==3.12.0 cmdstanpy==0.4.0 colorama==0.4.3 colorlover==0.3.0 community==1.0.0b1 contextlib2==0.5.5 convertdate==2.2.1 coverage==3.7.1 coveralls==0.5 crcmod==1.7 cufflinks==0.17.3 cupy-cuda101==6.5.0 cvxopt==1.2.5 cvxpy==1.0.31 cycler==0.10.0 cymem==2.0.3 Cython==0.29.19 daft==0.0.4 dask==2.12.0 dataclasses==0.7 datascience==0.10.6 datatable==0.10.1 decorator==4.4.2 defusedxml==0.6.0 Deprecated==1.2.10 descartes==1.1.0 dill==0.3.1.1 distributed==1.25.3 Django==3.0.6 dlib==19.18.0 docopt==0.6.2 docutils==0.15.2 dopamine-rl==1.0.5 earthengine-api==0.1.223 easydict==1.9 ecos==2.0.7.post1 editdistance==0.5.3 en-core-web-sm==2.2.5 entrypoints==0.3 ephem==3.7.7.1 et-xmlfile==1.0.1 fa2==0.3.5 fancyimpute==0.4.3 fastai==1.0.61 fastdtw==0.3.4 fastprogress==0.2.3 fastrlock==0.4 fbprophet==0.6 feather-format==0.4.1 featuretools==0.4.1 filelock==3.0.12 firebase-admin==4.1.0 fix-yahoo-finance==0.0.22 flair==0.5 Flask==1.1.2 folium==0.8.3 fsspec==0.7.4 future==0.16.0 gast==0.3.3 GDAL==2.2.2 gdown==3.6.4 gensim==3.6.0 geographiclib==1.50 geopy==1.17.0 gin-config==0.3.0 glob2==0.7 google==2.0.3 google-api-core==1.16.0 google-api-python-client==1.7.12 google-auth==1.7.2 google-auth-httplib2==0.0.3 google-auth-oauthlib==0.4.1 google-cloud-bigquery==1.21.0 google-cloud-core==1.0.3 google-cloud-datastore==1.8.0 google-cloud-firestore==1.7.0 google-cloud-language==1.2.0 google-cloud-storage==1.18.1 
google-cloud-translate==1.5.0 google-colab==1.0.0 google-pasta==0.2.0 google-resumable-media==0.4.1 googleapis-common-protos==1.51.0 googledrivedownloader==0.4 graphviz==0.10.1 grpcio==1.29.0 gspread==3.0.1 gspread-dataframe==3.0.7 gym==0.17.2 h5py==2.10.0 HeapDict==1.0.1 holidays==0.9.12 html5lib==1.0.1 httpimport==0.5.18 httplib2==0.17.4 httplib2shim==0.0.3 humanize==0.5.1 hyperopt==0.1.2 ideep4py==2.0.0.post3 idna==2.9 image==1.5.32 imageio==2.4.1 imagesize==1.2.0 imbalanced-learn==0.4.3 imblearn==0.0 imgaug==0.2.9 importlib-metadata==1.6.0 imutils==0.5.3 inflect==2.1.0 intel-openmp==2020.0.133 intervaltree==2.1.0 ipykernel==4.10.1 ipython==5.5.0 ipython-genutils==0.2.0 ipython-sql==0.3.9 ipywidgets==7.5.1 itsdangerous==1.1.0 jax==0.1.68 jaxlib==0.1.47 jdcal==1.4.1 jedi==0.17.0 jieba==0.42.1 Jinja2==2.11.2 jmespath==0.10.0 joblib==0.15.1 jpeg4py==0.1.4 jsonschema==2.6.0 jupyter==1.0.0 jupyter-client==5.3.4 jupyter-console==5.2.0 jupyter-core==4.6.3 kaggle==1.5.6 kapre==0.1.3.1 Keras==2.3.1 Keras-Applications==1.0.8 Keras-Preprocessing==1.1.2 keras-vis==0.4.1 kiwisolver==1.2.0 knnimpute==0.1.0 langdetect==1.0.8 librosa==0.6.3 lightgbm==2.2.3 llvmlite==0.31.0 lmdb==0.98 lucid==0.3.8 LunarCalendar==0.0.9 lxml==4.2.6 Markdown==3.2.2 MarkupSafe==1.1.1 matplotlib==3.2.1 matplotlib-venn==0.11.5 missingno==0.4.2 mistune==0.8.4 mizani==0.6.0 mkl==2019.0 mlxtend==0.14.0 more-itertools==8.3.0 moviepy==0.2.3.5 mpld3==0.3 mpmath==1.1.0 msgpack==1.0.0 multiprocess==0.70.9 multitasking==0.0.9 murmurhash==1.0.2 music21==5.5.0 natsort==5.5.0 nbconvert==5.6.1 nbformat==5.0.6 networkx==2.4 nibabel==3.0.2 nltk==3.2.5 notebook==5.2.2 np-utils==0.5.12.1 numba==0.48.0 numexpr==2.7.1 numpy==1.18.4 nvidia-ml-py3==7.352.0 oauth2client==4.1.3 oauthlib==3.1.0 okgrade==0.4.3 opencv-contrib-python==4.1.2.30 opencv-python==4.1.2.30 openpyxl==2.5.9 opt-einsum==3.2.1 osqp==0.6.1 packaging==20.4 palettable==3.3.0 pandas==1.0.4 pandas-datareader==0.8.1 pandas-gbq==0.11.0 pandas-profiling==1.4.1 
pandocfilters==1.4.2 parso==0.7.0 pathlib==1.0.1 patsy==0.5.1 pexpect==4.8.0 pickleshare==0.7.5 Pillow==7.0.0 pip-tools==4.5.1 plac==1.1.3 plotly==4.4.1 plotnine==0.6.0 pluggy==0.13.1 portpicker==1.3.1 prefetch-generator==1.0.1 preshed==3.0.2 prettytable==0.7.2 progressbar2==3.38.0 prometheus-client==0.8.0 promise==2.3 prompt-toolkit==1.0.18 protobuf==3.10.0 psutil==5.4.8 psycopg2==2.7.6.1 ptvsd==5.0.0a12 ptyprocess==0.6.0 py==1.8.1 pyarrow==0.14.1 pyasn1==0.4.8 pyasn1-modules==0.2.8 pycocotools==2.0.0 pycparser==2.20 pydata-google-auth==1.1.0 pydot==1.3.0 pydot-ng==2.0.0 pydotplus==2.0.2 PyDrive==1.3.1 pyemd==0.5.1 pyglet==1.5.0 Pygments==2.1.3 pygobject==3.26.1 pymc3==3.7 PyMeeus==0.3.7 pymongo==3.10.1 pymystem3==0.2.0 PyOpenGL==3.1.5 pyparsing==2.4.7 pyrsistent==0.16.0 pysndfile==1.3.8 PySocks==1.7.1 pystan==2.19.1.1 pytest==5.4.3 python-apt==1.6.5+ubuntu0.2 python-chess==0.23.11 python-dateutil==2.8.1 python-louvain==0.14 python-slugify==4.0.0 python-utils==2.4.0 pytz==2018.9 PyWavelets==1.1.1 PyYAML==3.13 pyzmq==19.0.1 qtconsole==4.7.4 QtPy==1.9.0 regex==2019.12.20 requests==2.23.0 requests-oauthlib==1.3.0 resampy==0.2.2 retrying==1.3.3 rpy2==3.2.7 rsa==4.0 s3fs==0.4.2 s3transfer==0.3.3 sacremoses==0.0.43 scikit-image==0.16.2 scikit-learn==0.22.2.post1 scipy==1.4.1 screen-resolution-extra==0.0.0 scs==2.1.2 seaborn==0.10.1 segtok==1.5.10 Send2Trash==1.5.0 sentencepiece==0.1.91 setuptools-git==1.2 Shapely==1.7.0 simplegeneric==0.8.1 six==1.12.0 sklearn==0.0 sklearn-pandas==1.8.0 smart-open==2.0.0 snowballstemmer==2.0.0 sortedcontainers==2.1.0 spacy==2.2.4 Sphinx==1.8.5 sphinxcontrib-websupport==1.2.2 SQLAlchemy==1.3.17 sqlitedict==1.6.0 sqlparse==0.3.1 srsly==1.0.2 statsmodels==0.10.2 sympy==1.1.1 tables==3.4.4 tabulate==0.8.7 tbb==2020.0.133 tblib==1.6.0 tensorboard==2.2.2 tensorboard-plugin-wit==1.6.0.post3 tensorboardcolab==0.0.22 tensorflow==2.2.0 tensorflow-addons==0.8.3 tensorflow-datasets==2.1.0 tensorflow-estimator==2.2.0 tensorflow-gcs-config==2.1.8 
tensorflow-hub==0.8.0 tensorflow-metadata==0.22.1 tensorflow-privacy==0.2.2 tensorflow-probability==0.10.0 termcolor==1.1.0 terminado==0.8.3 testpath==0.4.4 text-unidecode==1.3 textblob==0.15.3 textgenrnn==1.4.1 Theano==1.0.4 thinc==7.4.0 tifffile==2020.5.30 tokenizers==0.7.0 toolz==0.10.0 torch==1.5.0+cu101 torchsummary==1.5.1 torchtext==0.3.1 torchvision==0.6.0+cu101 tornado==4.5.3 tqdm==4.41.1 traitlets==4.3.3 transformers==2.11.0 tweepy==3.6.0 typeguard==2.7.1 typesentry==0.2.7 typing==3.6.6 typing-extensions==3.6.6 tzlocal==1.5.1 umap-learn==0.4.3 uritemplate==3.0.1 urllib3==1.24.3 vega-datasets==0.8.0 wasabi==0.6.0 wcwidth==0.1.9 webencodings==0.5.1 Werkzeug==1.0.1 widgetsnbextension==3.5.1 wordcloud==1.5.0 wrapt==1.12.1 xarray==0.15.1 xgboost==0.90 xkit==0.0.0 xlrd==1.1.0 xlwt==1.3.0 yellowbrick==0.9.1 zict==2.0.0 zipp==3.1.0

nightlessbaron commented 4 years ago

Did you solve the error? I am also facing the same bug.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stefan-it commented 2 years ago

I have a working solution for it, will prepare a PR for that soon, so re-opening it!

ataniz commented 2 years ago

> I have a working solution for it, will prepare a PR for that soon, so re-opening it!

Hi @stefan-it, are there any updates on the PR? The error still persists.

Madhu000 commented 2 years ago

I am facing the same issue with mt5-small. Can anyone fix this? Any guidance is welcome. Thanks in advance.

stefan-it commented 2 years ago

Hi @ataniz and @Madhu000 ,

sorry for the late reply! I pushed a working version of encoder-only fine-tuning for T5 models:

https://github.com/flairNLP/flair/pull/2896

Feel free to test it 🤗
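Why an encoder-only path sidesteps the error can be pictured with a toy. The sketch below uses made-up stand-in classes (`ToyEncoder`, `ToyEncoderDecoder`); it is not transformers or Flair code, and it only mimics the input check reported in the traceback: a full sequence-to-sequence forward pass needs decoder inputs, while the encoder alone can produce token representations from `input_ids`.

```python
class ToyEncoder:
    """Hypothetical stand-in: maps each input id to a one-number 'hidden state'."""

    def forward(self, input_ids):
        return [[float(i)] for i in input_ids]


class ToyEncoderDecoder:
    """Hypothetical stand-in for a seq2seq model whose decoder needs its own inputs."""

    def __init__(self):
        self.encoder = ToyEncoder()

    def forward(self, input_ids, decoder_input_ids=None):
        encoder_states = self.encoder.forward(input_ids)
        if decoder_input_ids is None:
            # Mimics the check that produced the error in this issue.
            raise ValueError(
                "You have to specify either decoder_input_ids or decoder_inputs_embeds"
            )
        # A real decoder would attend over encoder_states; the toy just pairs them.
        return [(d, encoder_states) for d in decoder_input_ids]


model = ToyEncoderDecoder()

try:
    model.forward([3, 1, 4])  # full model called with encoder inputs only
except ValueError as err:
    print("full model:", err)

# Encoder-only use: representations for the input tokens, no decoder involved.
print("encoder only:", model.encoder.forward([3, 1, 4]))
```

For embedding-based classifiers and taggers only the input token representations are needed, which is presumably why the fix is described as encoder-only.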

Madhu000 commented 2 years ago

When I test with this branch, the same error occurs. Please help me out. Thanks in advance. The log follows:

2022-08-08 20:25:36,116
2022-08-08 20:25:36,117 Corpus: "MultiCorpus: 644 train + 92 dev + 186 test sentences
 - ColumnCorpus Corpus: 644 train + 92 dev + 186 test sentences - /root/.flair/datasets/ner_masakhane/luo"
2022-08-08 20:25:36,117
2022-08-08 20:25:36,117 Parameters:
2022-08-08 20:25:36,117 - learning_rate: "0.000050"
2022-08-08 20:25:36,117 - mini_batch_size: "4"
2022-08-08 20:25:36,117 - patience: "3"
2022-08-08 20:25:36,117 - anneal_factor: "0.5"
2022-08-08 20:25:36,117 - max_epochs: "10"
2022-08-08 20:25:36,117 - shuffle: "True"
2022-08-08 20:25:36,117 - train_with_dev: "False"
2022-08-08 20:25:36,118 - batch_growth_annealing: "False"
2022-08-08 20:25:36,118
2022-08-08 20:25:36,118 Model training base path: "conll-03-t5-base"
2022-08-08 20:25:36,118
2022-08-08 20:25:36,118 Device: cuda:0
2022-08-08 20:25:36,118
2022-08-08 20:25:36,118 Embeddings storage mode: none
2022-08-08 20:25:36,118
Traceback (most recent call last):
  File "run_ner.py", line 158, in <module>
    main()
  File "run_ner.py", line 147, in main
    weight_decay=training_args.weight_decay,
  File "/usr/local/lib/python3.7/dist-packages/flair/trainers/trainer.py", line 909, in fine_tune
    **trainer_args,
  File "/usr/local/lib/python3.7/dist-packages/flair/trainers/trainer.py", line 500, in train
    loss = self.model.forward_loss(batch_step)
  File "/usr/local/lib/python3.7/dist-packages/flair/models/sequence_tagger_model.py", line 270, in forward_loss
    scores, gold_labels = self.forward(sentences)  # type: ignore
  File "/usr/local/lib/python3.7/dist-packages/flair/models/sequence_tagger_model.py", line 282, in forward
    self.embeddings.embed(sentences)
  File "/usr/local/lib/python3.7/dist-packages/flair/embeddings/base.py", line 62, in embed
    self._add_embeddings_internal(data_points)
  File "/usr/local/lib/python3.7/dist-packages/flair/embeddings/base.py", line 766, in _add_embeddings_internal
    self._add_embeddings_to_sentences(expanded_sentences)
  File "/usr/local/lib/python3.7/dist-packages/flair/embeddings/base.py", line 692, in _add_embeddings_to_sentences
    hidden_states = self.model(input_ids, **model_kwargs)[-1]
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/models/t5/modeling_t5.py", line 1438, in forward
    return_dict=return_dict,
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/models/t5/modeling_t5.py", line 932, in forward
    raise ValueError(f"You have to specify either {err_msg_prefix}input_ids or {err_msg_prefix}inputs_embeds")
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds


stefan-it commented 2 years ago

Hi @Madhu000 ,

it seems that Flair in your virtual environment is the installed 0.11 version (this can be seen in the logs, because flair/embeddings/base.py does not have a line 692 in the latest master due to a recent refactoring). Here's a short snippet showing how to use the T5 encoder fix branch:

pip3 uninstall flair

git clone https://github.com/flairNLP/flair.git
cd flair
git checkout add-t5-encoder-support
pip3 install -e .

Then you can try using it again :)
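To confirm which Flair installation Python actually picks up after reinstalling, the version and import location can be printed. This is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`); `describe_install` is a hypothetical helper name, not part of Flair.

```python
import importlib.util
from importlib import metadata


def describe_install(package_name):
    """Return (version, location) of the package Python would import.

    Either element is None when it cannot be determined.
    """
    try:
        version = metadata.version(package_name)
    except metadata.PackageNotFoundError:
        version = None
    spec = importlib.util.find_spec(package_name)
    location = spec.origin if spec is not None else None
    return version, location


version, location = describe_install("flair")
print("flair version:", version)
print("imported from:", location)
```

With an editable install (`pip3 install -e .`), the location should point into the cloned repository rather than into `dist-packages`; if it still points at `dist-packages`, the old release is the one being imported.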

Madhu000 commented 2 years ago

Thanks, I'll check it out.


stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.