huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

generic text classification with TensorFlow error (AttributeError: 'TFTrainingArguments' object has no attribute 'args') #7351

Closed c-col closed 3 years ago

c-col commented 3 years ago

Environment info

Who can help

@jplu

Information

Model I am using (Bert, XLNet ...): bert-base-multilingual-uncased

The problem arises when using:

The tasks I am working on is:

To reproduce

Steps to reproduce the behavior:

  1. Call run_tf_text_classification.py with flags from the example in the "Run generic text classification script in TensorFlow" section of examples/text-classification:
    python run_tf_text_classification.py \
    --train_file train.csv \
    --dev_file dev.csv \
    --test_file test.csv \
    --label_column_id 0 \
    --model_name_or_path bert-base-multilingual-uncased \
    --output_dir model \
    --num_train_epochs 4 \
    --per_device_train_batch_size 16 \
    --per_device_eval_batch_size 32 \
    --do_train \
    --do_eval \
    --do_predict \
    --logging_steps 10 \
    --evaluate_during_training \
    --save_steps 10 \
    --overwrite_output_dir \
    --max_seq_length 128
  2. Error is encountered:
Traceback (most recent call last):
  File "run_tf_text_classification.py", line 283, in <module>
    main()
  File "run_tf_text_classification.py", line 199, in main
    training_args.n_replicas,
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/transformers/file_utils.py", line 936, in wrapper
    return func(*args, **kwargs)
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/transformers/training_args_tf.py", line 180, in n_replicas
    return self._setup_strategy.num_replicas_in_sync
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/transformers/file_utils.py", line 914, in __get__
    cached = self.fget(obj)
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/transformers/file_utils.py", line 936, in wrapper
    return func(*args, **kwargs)
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/transformers/training_args_tf.py", line 122, in _setup_strategy
    if self.args.xla:
AttributeError: 'TFTrainingArguments' object has no attribute 'args'
  3. If the logger.info call (lines 197-202) is commented out, the error above no longer occurs, but a different error is encountered:
Traceback (most recent call last):
  File "run_tf_text_classification.py", line 282, in <module>
    main()
  File "run_tf_text_classification.py", line 221, in main
    max_seq_length=data_args.max_seq_length,
  File "run_tf_text_classification.py", line 42, in get_tfds
    ds = datasets.load_dataset("csv", data_files=files)
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/datasets/load.py", line 604, in load_dataset
    **config_kwargs,
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/datasets/builder.py", line 158, in __init__
    **config_kwargs,
  File "/home/qd_team/qdmr_gpu/smart_env/lib/python3.6/site-packages/datasets/builder.py", line 269, in _create_builder_config
    for key in sorted(data_files.keys()):
TypeError: '<' not supported between instances of 'NamedSplit' and 'NamedSplit'
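
The first traceback in this report points at an attribute lookup rather than at the command-line flags: `_setup_strategy` reads `self.args.xla`, but `self` is already the arguments object, so there is no `args` attribute to go through. Below is a minimal sketch of that pattern. `BrokenArgs`, `FixedArgs`, and `try_setup` are hypothetical stand-ins, not the real `TFTrainingArguments`, and reading `self.xla` directly is an assumption about what the fix looks like.

```python
from dataclasses import dataclass


@dataclass
class BrokenArgs:
    """Hypothetical stand-in for an arguments dataclass (not the real class)."""

    xla: bool = False

    def setup_strategy(self) -> str:
        # Bug pattern from the traceback: `self` is the arguments object,
        # so it has no `args` attribute and this line raises AttributeError.
        if self.args.xla:
            return "xla"
        return "default"


@dataclass
class FixedArgs:
    """Same hypothetical class with the assumed fix applied."""

    xla: bool = False

    def setup_strategy(self) -> str:
        # Assumed fix: read the field directly from `self`.
        if self.xla:
            return "xla"
        return "default"


def try_setup(args) -> str:
    """Run setup_strategy and report an AttributeError instead of crashing."""
    try:
        return args.setup_strategy()
    except AttributeError as exc:
        return f"AttributeError: {exc}"


if __name__ == "__main__":
    print(try_setup(BrokenArgs()))
    print(try_setup(FixedArgs(xla=True)))
```

Because the failing lookup sits inside a cached property, commenting out the `logger.info` call only postpones the error until something else touches the strategy.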

Here is a pip freeze:

absl-py==0.10.0
astunparse==1.6.3
cachetools==4.1.1
certifi==2020.6.20
chardet==3.0.4
click==7.1.2
dataclasses==0.7
datasets==1.0.2
dill==0.3.2
filelock==3.0.12
gast==0.3.3
google-auth==1.21.3
google-auth-oauthlib==0.4.1
google-pasta==0.2.0
grpcio==1.32.0
h5py==2.10.0
idna==2.10
importlib-metadata==2.0.0
joblib==0.16.0
Keras-Preprocessing==1.1.2
Markdown==3.2.2
numpy==1.18.5
oauthlib==3.1.0
opt-einsum==3.3.0
packaging==20.4
pandas==1.1.2
protobuf==3.13.0
pyarrow==1.0.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2020.1
regex==2020.7.14
requests==2.24.0
requests-oauthlib==1.3.0
rsa==4.6
sacremoses==0.0.43
scipy==1.4.1
sentencepiece==0.1.91
six==1.15.0
tensorboard==2.3.0
tensorboard-plugin-wit==1.7.0
tensorflow==2.3.0
tensorflow-estimator==2.3.0
termcolor==1.1.0
tokenizers==0.8.1rc2
tqdm==4.49.0
transformers==3.2.0
urllib3==1.25.10
Werkzeug==1.0.1
wrapt==1.12.1
xxhash==2.0.0
zipp==3.2.0

Expected behavior

Model begins to train on custom dataset.

jplu commented 3 years ago

Hello!

This is fixed in master.

sunnyville01 commented 3 years ago

@jplu Sorry, but I'm facing the same issue, and have version 3.2 installed. Can you please elaborate on how I might fix this? Thanks.

jplu commented 3 years ago

@sunnyville01 Just install the version on master with pip install git+https://github.com/huggingface/transformers.git

sunnyville01 commented 3 years ago

@jplu Thanks, that fixed it.

astromad commented 3 years ago

I am still facing this issue on Colab after running !pip install git+https://github.com/huggingface/transformers.git

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
in ()
     17     learning_rate=LEARNING_RATE
     18 )
---> 19 with training_argsTF.strategy.scope():
     20     modelTF = TFAutoModelForSequenceClassification.from_pretrained(
     21         model_args['model_name'],

4 frames
/usr/local/lib/python3.6/dist-packages/transformers/training_args_tf.py in _setup_strategy(self)
    120         logger.info("Tensorflow: setting up strategy")
    121
--> 122         if self.args.xla:
    123             tf.config.optimizer.set_jit(True)
    124

AttributeError: 'TFTrainingArguments' object has no attribute 'args'

jplu commented 3 years ago

Something must be wrong with your install process, because this bug is fixed in master.

astromad commented 3 years ago

My bad, I did not notice the "requirements already met" message. I updated the command to !pip install --upgrade git+https://github.com/huggingface/transformers.git

No more issue! Sorry.
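
One way to catch a stale install like this is to check which version is actually installed before rerunning the script. Below is a minimal sketch using only the standard library. It relies on `importlib.metadata`, which needs Python 3.8 or later (newer than the Python 3.6 shown in the tracebacks), and the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # 3.2.0 is the release listed in the pip freeze of the original report;
    # seeing it here after reinstalling suggests pip skipped the upgrade.
    print("transformers:", installed_version("transformers"))
```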

Santosh-Gupta commented 3 years ago

> Something must be wrong with your install process, because this bug is fixed in master.

The error seems to persist for me. I installed using !pip install git+https://github.com/huggingface/transformers.git and got the same error: TypeError: '<' not supported between instances of 'NamedSplit' and 'NamedSplit'

Here is a Colab notebook. You can do Runtime -> Run all and see the output of the last cell.

https://colab.research.google.com/drive/1r3XCKYA8RBtfYmU2jqHVJT-uTt1ii04S?usp=sharing
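
For context, the traceback in the original report fails inside `sorted(data_files.keys())`. `sorted` needs the keys to support `<`, and the error message says the `NamedSplit` keys do not. Below is a minimal sketch of the same failure using a hypothetical `SplitKey` class (not the real `NamedSplit`), together with one possible workaround that sorts by the string form. This thread does not settle whether the script or the library should make that change.

```python
class SplitKey:
    """Hypothetical key type with equality and hashing but no ordering."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __str__(self) -> str:
        return self.name

    def __eq__(self, other: object) -> bool:
        return isinstance(other, SplitKey) and self.name == other.name

    def __hash__(self) -> int:
        return hash(self.name)


# Illustrative mapping from split keys to the CSV files used in the report.
data_files = {
    SplitKey("train"): "train.csv",
    SplitKey("validation"): "dev.csv",
    SplitKey("test"): "test.csv",
}


def sort_keys_naively(files):
    # Mirrors `sorted(data_files.keys())`; raises TypeError for unordered keys.
    return sorted(files.keys())


def sort_keys_by_name(files):
    # Workaround sketch: hand `sorted` an orderable key (the string form).
    return sorted(files.keys(), key=str)


if __name__ == "__main__":
    try:
        sort_keys_naively(data_files)
    except TypeError as exc:
        print("TypeError:", exc)
    print([str(k) for k in sort_keys_by_name(data_files)])
```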

pvcastro commented 3 years ago

@jplu I'm also getting the same error, TypeError: '<' not supported between instances of 'NamedSplit' and 'NamedSplit'. I also ran the Colab notebook from @Santosh-Gupta and the error happened there too. My local environment is also based on the transformers master branch.

jplu commented 3 years ago

@pvcastro Can you please open a new issue with all the details so that we can reproduce it? This thread is closed and is about a different error.