TypeError: '>' not supported between instances of 'NoneType' and 'int'

prabhat-123 commented 3 years ago

Environment info

transformers version:
Platform:
Python version:
PyTorch version (GPU?):
Tensorflow version (GPU?):
Using GPU in script?:
Using distributed or parallel set-up in script?:

Who can help

Information

Model I am using (Bert, XLNet ...):

The problem arises when using:

[ ] the official example scripts: (give details below)
[ ] my own modified scripts: (give details below)

The tasks I am working on is:

[ ] an official GLUE/SQUaD task: (give the name)
[ ] my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

1. 2. 3.

Expected behavior

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

ChristophKnapp commented 8 months ago

I'm running into this problem when I run the English-to-Romanian translation example. As far as I know, I haven't modified anything in the script. It fits the model through the first epoch, then throws this error.

2023-11-13 15:47:58.542480: I tensorflow/core/util/port.cc:111] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2023-11-13 15:47:58.564058: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-13 15:47:58.564080: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-13 15:47:58.564097: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-11-13 15:47:58.568038: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
11/13/2023 15:47:59 - INFO - __main__ - Training/evaluation parameters TFTrainingArguments(
_n_gpu=-1,
adafactor=False,
adam_beta1=0.9,
adam_beta2=0.999,
adam_epsilon=1e-08,
auto_find_batch_size=False,
bf16=False,
bf16_full_eval=False,
data_seed=None,
dataloader_drop_last=False,
dataloader_num_workers=0,
dataloader_pin_memory=True,
ddp_backend=None,
ddp_broadcast_buffers=None,
ddp_bucket_cap_mb=None,
ddp_find_unused_parameters=None,
ddp_timeout=1800,
debug=[],
deepspeed=None,
disable_tqdm=False,
dispatch_batches=None,
do_eval=True,
do_predict=False,
do_train=True,
eval_accumulation_steps=None,
eval_delay=0,
eval_steps=None,
evaluation_strategy=no,
fp16=False,
fp16_backend=auto,
fp16_full_eval=False,
fp16_opt_level=O1,
fsdp=[],
fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False},
fsdp_min_num_params=0,
fsdp_transformer_layer_cls_to_wrap=None,
full_determinism=False,
gcp_project=None,
gradient_accumulation_steps=1,
gradient_checkpointing=False,
gradient_checkpointing_kwargs=None,
greater_is_better=None,
group_by_length=False,
half_precision_backend=auto,
hub_always_push=False,
hub_model_id=None,
hub_private_repo=False,
hub_strategy=every_save,
hub_token=,
ignore_data_skip=False,
include_inputs_for_metrics=False,
include_tokens_per_second=False,
jit_mode_eval=False,
label_names=None,
label_smoothing_factor=0.0,
learning_rate=5e-05,
length_column_name=length,
load_best_model_at_end=False,
local_rank=-1,
log_level=passive,
log_level_replica=warning,
log_on_each_node=True,
logging_dir=/workspace/transformer/results/runs/Nov13_15-47-59_workstation-bluechip-BUSINESSline-individu,
logging_first_step=False,
logging_nan_inf_filter=True,
logging_steps=500,
logging_strategy=steps,
lr_scheduler_kwargs={},
lr_scheduler_type=linear,
max_grad_norm=1.0,
max_steps=-1,
metric_for_best_model=None,
mp_parameters=,
neftune_noise_alpha=None,
no_cuda=False,
num_train_epochs=3.0,
optim=adamw_torch,
optim_args=None,
output_dir=/workspace/transformer/results,
overwrite_output_dir=True,
past_index=-1,
per_device_eval_batch_size=16,
per_device_train_batch_size=16,
poly_power=1.0,
prediction_loss_only=False,
push_to_hub=False,
push_to_hub_model_id=None,
push_to_hub_organization=None,
push_to_hub_token=,
ray_scope=last,
remove_unused_columns=True,
report_to=['tensorboard'],
resume_from_checkpoint=None,
run_name=/workspace/transformer/results,
save_on_each_node=False,
save_safetensors=True,
save_steps=500,
save_strategy=steps,
save_total_limit=None,
seed=42,
skip_memory_metrics=True,
split_batches=False,
tf32=None,
torch_compile=False,
torch_compile_backend=None,
torch_compile_mode=None,
torchdynamo=None,
tpu_metrics_debug=False,
tpu_name=None,
tpu_num_cores=None,
tpu_zone=None,
use_cpu=False,
use_ipex=False,
use_legacy_prediction_loop=False,
use_mps_device=False,
warmup_ratio=0.0,
warmup_steps=0,
weight_decay=0.0,
xla=False,
)
Loading Dataset Infos from /.cache/huggingface/modules/datasets_modules/datasets/wmt16/746749a11d25c02058042da7502d973ff410e73457f3d305fc1177dc0e8c4227
Overwrite dataset info from restored data version if exists.
Loading Dataset info from /.cache/huggingface/datasets/wmt16/ro-en/1.0.0/746749a11d25c02058042da7502d973ff410e73457f3d305fc1177dc0e8c4227
11/13/2023 15:48:01 - INFO - datasets.info - Loading Dataset Infos from /.cache/huggingface/modules/datasets_modules/datasets/wmt16/746749a11d25c02058042da7502d973ff410e73457f3d305fc1177dc0e8c4227
11/13/2023 15:48:01 - INFO - datasets.builder - Overwrite dataset info from restored data version if exists.
11/13/2023 15:48:01 - INFO - datasets.info - Loading Dataset info from /.cache/huggingface/datasets/wmt16/ro-en/1.0.0/746749a11d25c02058042da7502d973ff410e73457f3d305fc1177dc0e8c4227
11/13/2023 15:48:01 - INFO - datasets.builder - Found cached dataset wmt16 (/.cache/huggingface/datasets/wmt16/ro-en/1.0.0/746749a11d25c02058042da7502d973ff410e73457f3d305fc1177dc0e8c4227)
11/13/2023 15:48:01 - INFO - datasets.info - Loading Dataset info from /.cache/huggingface/datasets/wmt16/ro-en/1.0.0/746749a11d25c02058042da7502d973ff410e73457f3d305fc1177dc0e8c4227
Found cached dataset wmt16 (/.cache/huggingface/datasets/wmt16/ro-en/1.0.0/746749a11d25c02058042da7502d973ff410e73457f3d305fc1177dc0e8c4227)
Loading Dataset info from /.cache/huggingface/datasets/wmt16/ro-en/1.0.0/746749a11d25c02058042da7502d973ff410e73457f3d305fc1177dc0e8c4227
loading configuration file config.json from cache at /.cache/huggingface/hub/models--t5-small/snapshots/df1b051c49625cf57a3d0d8d3863ed4d13564fe4/config.json
Model config T5Config {
  "_name_or_path": "t5-small",
  "architectures": [
    "T5ForConditionalGeneration"
  ],
  "classifier_dropout": 0.0,
  "d_ff": 2048,
  "d_kv": 64,
  "d_model": 512,
  "decoder_start_token_id": 0,
  "dense_act_fn": "relu",
  "dropout_rate": 0.1,
  "eos_token_id": 1,
  "feed_forward_proj": "relu",
  "initializer_factor": 1.0,
  "is_encoder_decoder": true,
  "is_gated_act": false,
  "layer_norm_epsilon": 1e-06,
  "model_type": "t5",
  "n_positions": 512,
  "num_decoder_layers": 6,
  "num_heads": 8,
  "num_layers": 6,
  "output_past": true,
  "pad_token_id": 0,
  "relative_attention_max_distance": 128,
  "relative_attention_num_buckets": 32,
  "task_specific_params": {
    "summarization": {
      "early_stopping": true,
      "length_penalty": 2.0,
      "max_length": 200,
      "min_length": 30,
      "no_repeat_ngram_size": 3,
      "num_beams": 4,
      "prefix": "summarize: "
    },
    "translation_en_to_de": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to German: "
    },
    "translation_en_to_fr": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to French: "
    },
    "translation_en_to_ro": {
      "early_stopping": true,
      "max_length": 300,
      "num_beams": 4,
      "prefix": "translate English to Romanian: "
    }
  },
  "transformers_version": "4.36.0.dev0",
  "use_cache": true,
  "vocab_size": 32128
}

loading file spiece.model from cache at /.cache/huggingface/hub/models--t5-small/snapshots/df1b051c49625cf57a3d0d8d3863ed4d13564fe4/spiece.model
loading file tokenizer.json from cache at /.cache/huggingface/hub/models--t5-small/snapshots/df1b051c49625cf57a3d0d8d3863ed4d13564fe4/tokenizer.json
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at None
loading file tokenizer_config.json from cache at /.cache/huggingface/hub/models--t5-small/snapshots/df1b051c49625cf57a3d0d8d3863ed4d13564fe4/tokenizer_config.json
Loading cached processed dataset at /.cache/huggingface/datasets/wmt16/ro-en/1.0.0/746749a11d25c02058042da7502d973ff410e73457f3d305fc1177dc0e8c4227/cache-164eb734af318539.arrow
Loading cached processed dataset at /.cache/huggingface/datasets/wmt16/ro-en/1.0.0/746749a11d25c02058042da7502d973ff410e73457f3d305fc1177dc0e8c4227/cache-442e2020e92ebe8e.arrow
Tensorflow: setting up strategy
11/13/2023 15:48:01 - INFO - datasets.arrow_dataset - Loading cached processed dataset at /.cache/huggingface/datasets/wmt16/ro-en/1.0.0/746749a11d25c02058042da7502d973ff410e73457f3d305fc1177dc0e8c4227/cache-164eb734af318539.arrow
11/13/2023 15:48:01 - INFO - datasets.arrow_dataset - Loading cached processed dataset at /.cache/huggingface/datasets/wmt16/ro-en/1.0.0/746749a11d25c02058042da7502d973ff410e73457f3d305fc1177dc0e8c4227/cache-442e2020e92ebe8e.arrow
2023-11-13 15:48:01.416190: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1886] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 8825 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:01:00.0, compute capability: 8.6
loading weights file model.safetensors from cache at /.cache/huggingface/hub/models--t5-small/snapshots/df1b051c49625cf57a3d0d8d3863ed4d13564fe4/model.safetensors
Generate config GenerationConfig {
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0
}

2023-11-13 15:48:01.656874: I tensorflow/tsl/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
Loaded 60,506,624 parameters in the TF 2.0 model.
All PyTorch model weights were used when initializing TFT5ForConditionalGeneration.

All the weights of TFT5ForConditionalGeneration were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFT5ForConditionalGeneration for predictions without further training.
You're using a T5TokenizerFast tokenizer. Please note that with a fast tokenizer, using the __call__ method is faster than using a method to encode the text followed by a call to the pad method to get a padded encoding.
No loss specified in compile() - the model's internal loss computation will be used as the loss. Don't panic - this is a common way to train TensorFlow models in Transformers! To disable this behaviour please pass a loss argument, or explicitly pass loss=None if you do not want your model to compute a loss. You can also specify loss='auto' to get the internal loss without printing this info string.
11/13/2023 15:48:04 - INFO - __main__ - Running training
11/13/2023 15:48:04 - INFO - __main__ - Num examples = 610320
11/13/2023 15:48:04 - INFO - __main__ - Num Epochs = 3.0
11/13/2023 15:48:04 - INFO - __main__ - Instantaneous batch size per device = 16
11/13/2023 15:48:04 - INFO - __main__ - Total train batch size = 16
11/13/2023 15:48:04 - INFO - __main__ - Total optimization steps = 114435
Epoch 1/3
2023-11-13 15:48:13.749879: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f01b9364620 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-11-13 15:48:13.749896: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce RTX 3060, Compute Capability 8.6
2023-11-13 15:48:13.752234: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var MLIR_CRASH_REPRODUCER_DIRECTORY to enable.
2023-11-13 15:48:13.759242: I tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:442] Loaded cuDNN version 8700
2023-11-13 15:48:13.802724: I ./tensorflow/compiler/jit/device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
38145/38145 [==============================] - ETA: 0s - loss: 0.6117
Generate config GenerationConfig {
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0
}

Traceback (most recent call last):
  File "/workspace/transformer/run_translation.py", line 733, in <module>
    main()
  File "/workspace/transformer/run_translation.py", line 693, in main
    history = model.fit(tf_train_dataset, epochs=int(training_args.num_train_epochs), callbacks=callbacks)
  File "/workspace/transformer/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/workspace/transformer/lib/python3.10/site-packages/transformers/keras_callbacks.py", line 223, in on_epoch_end
    predictions = self.generation_function(generation_inputs, attention_mask=attention_mask)
  File "/tmp/__autograph_generated_fileg5wrw6ci.py", line 13, in tf__generation_function
    retval_ = ag__.converted_call(ag__.ld(self).model.generate, (ag__.ld(inputs),), dict(attention_mask=ag__.ld(attention_mask), **ag__.ld(self).generate_kwargs), fscope)
  File "/tmp/__autograph_generated_fileqqh0lf7s.py", line 437, in tf__generate
    is_beam_gen_mode = ag__.and_(lambda : ag__.not_(ag__.ld(is_contrastive_search_gen_mode)), lambda : ag__.and_(lambda : ag__.ld(generation_config).num_beams > 1, lambda : ag__.ld(generation_config).do_sample is False))
  File "/tmp/__autograph_generated_fileqqh0lf7s.py", line 437, in <lambda>
    is_beam_gen_mode = ag__.and_(lambda : ag__.not_(ag__.ld(is_contrastive_search_gen_mode)), lambda : ag__.and_(lambda : ag__.ld(generation_config).num_beams > 1, lambda : ag__.ld(generation_config).do_sample is False))
  File "/tmp/__autograph_generated_fileqqh0lf7s.py", line 437, in <lambda>
    is_beam_gen_mode = ag__.and_(lambda : ag__.not_(ag__.ld(is_contrastive_search_gen_mode)), lambda : ag__.and_(lambda : ag__.ld(generation_config).num_beams > 1, lambda : ag__.ld(generation_config).do_sample is False))
TypeError: in user code:

File "/workspace/transformer/lib/python3.10/site-packages/transformers/keras_callbacks.py", line 202, in generation_function  *
    return self.model.generate(inputs, attention_mask=attention_mask, **self.generate_kwargs)
File "/workspace/transformer/lib/python3.10/site-packages/transformers/generation/tf_utils.py", line 884, in generate  *
    is_beam_gen_mode = (

TypeError: '>' not supported between instances of 'NoneType' and 'int'

Process finished with exit code 1
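
For reference, the only `>` in the failing expression from the traceback is `generation_config.num_beams > 1`, so the error suggests that `num_beams` is `None` when generation runs at the end of the epoch. Below is a minimal sketch that reproduces the comparison in isolation. `FakeGenerationConfig` and both helper functions are hypothetical stand-ins written for this illustration, not transformers APIs, and defaulting a missing `num_beams` to 1 is an assumed workaround rather than a confirmed fix.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeGenerationConfig:
    """Hypothetical stand-in for a generation config; not the transformers class."""
    num_beams: Optional[int] = None
    do_sample: Optional[bool] = False


def is_beam_gen_mode(config: FakeGenerationConfig) -> bool:
    # Mirrors the comparison visible in the traceback: num_beams > 1.
    # Raises TypeError when num_beams is None.
    return config.num_beams > 1 and config.do_sample is False


def is_beam_gen_mode_guarded(config: FakeGenerationConfig) -> bool:
    # Assumed workaround: treat a missing num_beams as 1 (no beam search).
    num_beams = 1 if config.num_beams is None else config.num_beams
    return num_beams > 1 and config.do_sample is False


if __name__ == "__main__":
    config = FakeGenerationConfig()  # num_beams left unset, i.e. None
    try:
        is_beam_gen_mode(config)
    except TypeError as err:
        print(f"TypeError: {err}")

    print(is_beam_gen_mode_guarded(config))
    print(is_beam_gen_mode_guarded(FakeGenerationConfig(num_beams=4)))
```

In the setting of this thread, the analogous step would be making sure `num_beams` is an integer before the callback calls `generate`. Where that value is set in `run_translation.py` is not shown in the log above.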

amyeroberts commented 8 months ago

@ChristophKnapp Thanks for opening a new issue. Linking here for reference #27505

huggingface / transformers