huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Bug when trying to run examples in the directory tensorflow/text-classification module run_text-classification #29584

Closed Humbulani1234 closed 6 months ago

Humbulani1234 commented 6 months ago

System Info

- `transformers` version: 4.35.2
- Platform: Linux-6.5.0-25-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.19.4
- Safetensors version: 0.4.0
- Accelerate version: not installed
- Accelerate config: not found
- PyTorch version (GPU?): not installed (NA)
- Tensorflow version (GPU?): 2.15.0 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>

Who can help?

No Response

Reproduction

While trying out the examples in the transformers source tree, I hit an error running examples/tensorflow/text-classification/run_text_classification.py. Following the instructions in that directory's README.md, I ran: python run_text_classification.py --model_name_or_path distilbert-base-cased --train_file training_data.json --validation_file validation_data.json --output_dir output

Here is the error:

Traceback (most recent call last):
  File "/home/humbulani/transformers-main/examples/tensorflow/text-classification/run_text_classification.py", line 600, in <module>
    main()
  File "/home/humbulani/transformers-main/examples/tensorflow/text-classification/run_text_classification.py", line 527, in main
    model.compile(optimizer=optimizer, metrics=metrics) # Removed optimizer=optimizer positional argument
  File "/home/humbulani/django/env/lib/python3.10/site-packages/transformers/modeling_tf_utils.py", line 1531, in compile
    super().compile(
  File "/home/humbulani/django/env/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/home/humbulani/django/env/lib/python3.10/site-packages/keras/src/optimizers/__init__.py", line 329, in get
    raise ValueError(
ValueError: Could not interpret optimizer identifier: None

I then read through the source code, starting from main() in run_text_classification.py. This function calls model.compile(optimizer=optimizer, metrics=metrics), passing optimizer as a keyword argument. Depending on the following excerpt from main(), optimizer may be None:

...

        if training_args.do_train:
            num_train_steps = len(tf_data["train"]) * training_args.num_train_epochs
            if training_args.warmup_steps > 0:
                num_warmup_steps = training_args.warmup_steps
            elif training_args.warmup_ratio > 0:
                num_warmup_steps = int(num_train_steps * training_args.warmup_ratio)
            else:
                num_warmup_steps = 0

            optimizer, schedule = create_optimizer(
                init_lr=training_args.learning_rate,
                num_train_steps=num_train_steps,
                num_warmup_steps=num_warmup_steps,
                adam_beta1=training_args.adam_beta1,
                adam_beta2=training_args.adam_beta2,
                adam_epsilon=training_args.adam_epsilon,
                weight_decay_rate=training_args.weight_decay,
                adam_global_clipnorm=training_args.max_grad_norm,
            )
        else:
            optimizer = None
...

If the else branch runs, optimizer is set to None. Because it is then passed explicitly, it overrides the default value "rmsprop" of the optimizer argument of the compile method in keras.src.engine.training, whose signature is:

    @traceback_utils.filter_traceback
    def compile(
        self,
        optimizer="rmsprop",
        loss=None,
        metrics=None,
        loss_weights=None,
        weighted_metrics=None,
        run_eagerly=None,
        steps_per_execution=None,
        jit_compile=None,
        pss_evaluation_shards=0,
        **kwargs,
    ):
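
The effect of an explicit None on a default argument can be sketched in plain Python. This is only an illustration: compile_stub is a hypothetical stand-in, not the Keras implementation, and it reuses the error message from the traceback above.

```python
def compile_stub(optimizer="rmsprop", metrics=None):
    """Hypothetical stand-in for a compile() method; not the Keras code."""
    if optimizer is None:
        # Same message as in the traceback above.
        raise ValueError(f"Could not interpret optimizer identifier: {optimizer}")
    return optimizer


# Omitting the argument falls back to the default value.
print(compile_stub())

# Passing None explicitly replaces the default, so the stub raises.
try:
    compile_stub(optimizer=None)
except ValueError as err:
    print(err)
```

The default only applies when the argument is left out entirely; an explicit None is treated as a real value.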

Passing optimizer=None explicitly is what triggers the ValueError shown above: Keras cannot interpret None as an optimizer identifier. As a workaround, I changed the assignment in the else branch to optimizer = None or "rmsprop" (which evaluates to "rmsprop"), and the code then worked. I'm sure there is a better fix than this hack, which may be submitted as a PR should this prove to be a genuine bug.
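
A minimal sketch of why the workaround above evaluates to "rmsprop", plus one possible alternative. The helper build_compile_kwargs is hypothetical and not part of transformers or Keras; whether omitting the argument is preferable to the workaround is an assumption, not a confirmed fix.

```python
# `None or "rmsprop"` returns the right-hand operand because None is falsy.
optimizer = None or "rmsprop"
print(optimizer)


def build_compile_kwargs(optimizer=None, metrics=None):
    """Hypothetical helper: pass `optimizer` only when one was created."""
    kwargs = {"metrics": metrics}
    if optimizer is not None:
        kwargs["optimizer"] = optimizer
    return kwargs


# Without an optimizer, the key is omitted so compile() could use its default.
print(build_compile_kwargs(metrics=["accuracy"]))
```

With such a helper the call site would become model.compile(**kwargs), leaving the choice of default to compile() itself.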

The stack trace of the function calls:

/home/humbulani/transformers-main/examples/tensorflow/text-classification/run_text_classification.py(595)<module>()
-> main()
  /home/humbulani/transformers-main/examples/tensorflow/text-classification/run_text_classification.py(522)main()
-> model.compile(optimizer=optimizer, metrics=metrics)
  /home/humbulani/django/env/lib/python3.10/site-packages/transformers/modeling_tf_utils.py(1531)compile()
-> super().compile(
  /home/humbulani/django/env/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py(65)error_handler()
-> return fn(*args, **kwargs)
  /home/humbulani/django/env/lib/python3.10/site-packages/keras/src/engine/training.py(784)compile()
-> self.optimizer = self._get_optimizer(optimizer)
  /home/humbulani/django/env/lib/python3.10/site-packages/keras/src/engine/training.py(848)_get_optimizer()
-> return tf.nest.map_structure(_get_single_optimizer, optimizer)
  /home/humbulani/django/env/lib/python3.10/site-packages/tensorflow/python/util/nest.py(631)map_structure()
-> return nest_util.map_structure(
  /home/humbulani/django/env/lib/python3.10/site-packages/tensorflow/python/util/nest_util.py(1066)map_structure()
-> return _tf_core_map_structure(func, *structure, **kwargs)
  /home/humbulani/django/env/lib/python3.10/site-packages/tensorflow/python/util/nest_util.py(1106)_tf_core_map_structure()
-> [func(*x) for x in entries],
  /home/humbulani/django/env/lib/python3.10/site-packages/tensorflow/python/util/nest_util.py(1106)<listcomp>()
-> [func(*x) for x in entries],
  /home/humbulani/django/env/lib/python3.10/site-packages/keras/src/engine/training.py(839)_get_single_optimizer()
-> opt = optimizers.get(opt)
> /home/humbulani/django/env/lib/python3.10/site-packages/keras/src/optimizers/__init__.py(277)get()
-> """Retrieves a Keras Optimizer instance.

Environment

absl-py==1.4.0
aiohttp==3.9.1
aiosignal==1.3.1
alabaster==0.7.13
anyio==4.1.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
array-record==0.5.0
arrow==1.3.0
asgiref==3.7.2
asttokens==2.4.1
astunparse==1.6.3
async-lru==2.0.4
async-timeout==4.0.3
attrs==23.1.0
autokeras==1.1.0
Babel==2.13.1
beautifulsoup4==4.12.2
black==23.12.1
bleach==6.1.0
bottle==0.12.25
cachetools==5.3.2
certifi==2023.7.22
cffi==1.16.0
charset-normalizer==3.3.2
click==8.1.7
comm==0.2.0
contourpy==1.1.0
coverage==7.4.3
crispy-bootstrap5==0.7
cycler==0.11.0
datasets==2.15.0
debugpy==1.8.0
decorator==5.1.1
defusedxml==0.7.1
dill==0.3.7
Django==4.2.4
django-bootstrap-v5==1.0.11
django-clearcache==1.2.1
django-cors-headers==4.3.1
django-crispy-forms==2.0
django-pdb==0.6.2
django-widget-tweaks==1.5.0
dm-tree==0.1.8
docstring-to-markdown==0.13
docutils==0.17.1
etils==1.5.2
evaluate==0.4.1
exceptiongroup==1.2.0
executing==2.0.1
fastjsonschema==2.19.0
filelock==3.13.1
flatbuffers==23.5.26
fonttools==4.42.1
fqdn==1.5.1
frozenlist==1.4.0
fsspec==2023.10.0
gast==0.5.4
gevent==23.7.0
google-auth==2.23.4
google-auth-oauthlib==1.0.0
google-pasta==0.2.0
googleapis-common-protos==1.61.0
graphviz==0.20.1
greenlet==2.0.2
grpcio==1.59.2
gunicorn==21.2.0
h5py==3.10.0
huggingface==0.0.1
huggingface-hub==0.19.4
hypothesis==6.98.2
idna==3.4
imagesize==1.4.1
importlib-resources==6.1.1
iniconfig==2.0.0
instaviz==0.6.0
ipdb==0.13.13
ipykernel==6.27.1
ipython==8.18.1
isoduration==20.11.0
jedi==0.19.1
Jinja2==3.1.2
joblib==1.3.2
json5==0.9.14
jsonpointer==2.4
jsonschema==4.18.0
jsonschema-specifications==2023.11.2
jupyter-events==0.9.0
jupyter-lsp==2.2.1
jupyter_client==8.6.0
jupyter_core==5.5.0
jupyter_server==2.12.1
jupyter_server_terminals==0.4.4
jupyterlab==4.0.9
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.2
kagglehub==0.2.0
keras==2.15.0
keras-core==0.1.7
keras-nlp==0.8.2
keras-tuner==1.4.7
kiwisolver==1.4.5
kt-legacy==1.0.5
libclang==16.0.6
Markdown==3.5.1
markdown-it-py==3.0.0
MarkupSafe==2.1.3
matplotlib==3.7.2
matplotlib-inline==0.1.6
mdurl==0.1.2
mistune==3.0.2
ml-dtypes==0.2.0
mpmath==1.3.0
multidict==6.0.4
multiprocess==0.70.15
mypy-extensions==1.0.0
namex==0.0.7
nbclient==0.9.0
nbconvert==7.12.0
nbformat==5.9.2
nest-asyncio==1.5.8
netron==7.3.3
networkx==3.2.1
nltk==3.8.1
notebook==7.0.6
notebook_shim==0.2.3
numpy==1.25.2
nvidia-cublas-cu12==12.2.5.6
nvidia-cuda-cupti-cu12==12.2.142
nvidia-cuda-nvcc-cu12==12.2.140
nvidia-cuda-nvrtc-cu12==12.2.140
nvidia-cuda-runtime-cu12==12.2.140
nvidia-cudnn-cu12==8.9.4.25
nvidia-cufft-cu12==11.0.8.103
nvidia-curand-cu12==10.3.3.141
nvidia-cusolver-cu12==11.5.2.141
nvidia-cusparse-cu12==12.1.2.141
nvidia-nccl-cu12==2.16.5
nvidia-nvjitlink-cu12==12.2.140
oauthlib==3.2.2
opencv-python==4.8.1.78
opt-einsum==3.3.0
overrides==7.4.0
packaging==23.1
pandas==2.0.3
pandocfilters==1.5.0
parso==0.8.3
pathspec==0.12.1
patsy==0.5.3
pexpect==4.9.0
Pillow==10.0.0
platformdirs==4.1.0
pluggy==1.3.0
prometheus-client==0.19.0
promise==2.3
prompt-toolkit==3.0.41
protobuf==3.20.3
psutil==5.9.6
ptyprocess==0.7.0
pure-eval==0.2.2
pyarrow==14.0.1
pyarrow-hotfix==0.6
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycparser==2.21
Pygments==2.16.1
Pympler==1.0.1
pyparsing==3.0.9
PyQt6==6.6.1
PyQt6-Qt6==6.6.2
PyQt6-sip==13.6.0
pytest==8.0.0
pytest-cov==4.1.0
python-dateutil==2.8.2
python-json-logger==2.0.7
python-lsp-black==2.0.0
python-lsp-jsonrpc==1.1.2
python-lsp-server==1.9.0
python-version==0.0.2
pytidylib==0.2.3
pytz==2023.3
PyYAML==6.0.1
pyzmq==25.1.2
referencing==0.32.0
regex==2023.10.3
requests==2.31.0
requests-oauthlib==1.3.1
responses==0.18.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rich==13.7.0
rpds-py==0.13.2
rsa==4.9
safetensors==0.4.0
scikit-learn==1.3.0
scipy==1.11.2
seaborn==0.12.2
Send2Trash==1.8.2
sentencepiece==0.2.0
shiboken2==5.15.2.1
shiboken6==6.5.2
six==1.16.0
SMPy==1.0.3
sniffio==1.3.0
snowballstemmer==2.2.0
sortedcontainers==2.4.0
soupsieve==2.4.1
Sphinx==4.5.0
sphinxcontrib-applehelp==1.0.4
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==2.0.1
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.5
spin==0.8
sqlparse==0.4.4
stack-data==0.6.3
statsmodels==0.14.0
sympy==1.12
tabulate==0.9.0
tensorboard==2.15.2
tensorboard-data-server==0.7.2
tensorflow==2.15.0.post1
tensorflow-estimator==2.15.0
tensorflow-hub==0.16.1
tensorflow-io-gcs-filesystem==0.36.0
tensorflow-text==2.15.0
tensorrt==8.5.3.1
termcolor==2.3.0
terminado==0.18.0
tf-keras==2.15.0
threadpoolctl==3.2.0
tinycss2==1.2.1
tokenizers==0.15.0
toml==0.10.2
tomli==2.0.1
tornado==6.4
tqdm==4.66.1
traitlets==5.14.0
transformers==4.35.2
triton==2.1.0
types-python-dateutil==2.8.19.14
typing_extensions==4.7.1
tzdata==2023.3
ujson==5.9.0
uri-template==1.3.0
urllib3==2.0.7
wcwidth==0.2.12
webcolors==1.13
webencodings==0.5.1
websocket-client==1.7.0
Werkzeug==3.0.1
whitenoise==6.5.0
wrapt==1.14.1
xxhash==3.4.1
yarl==1.9.3
zipp==3.17.0
zope.event==5.0
zope.interface==6.0

Expected behavior

The run_text_classification script was expected to run without errors and save the model to the output directory, as per the README.md instructions.

amyeroberts commented 6 months ago

cc @Rocketknight1

Rocketknight1 commented 6 months ago

Hi @Humbulani1234, there are two issues here. The first is that TF is indeed throwing errors when optimizer=None, which used to work. I opened a PR to fix it at #29597.

You may also wish to add the arguments --do_train and --do_eval to your invocation of the example. I've updated the README to include them!
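
For reference, the original command with the two suggested flags added would look roughly like the sketch below. The model name and file names are the ones from the report above, and the snippet only builds and prints the command; it does not run it.

```python
import shlex

# Arguments from the original report, plus the two flags suggested above.
command = [
    "python", "run_text_classification.py",
    "--model_name_or_path", "distilbert-base-cased",
    "--train_file", "training_data.json",
    "--validation_file", "validation_data.json",
    "--output_dir", "output",
    "--do_train",
    "--do_eval",
]

# shlex.join quotes each argument safely for a POSIX shell.
print(shlex.join(command))
```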