PiotrNawrot / nanoT5

Fast & Simple repository for pre-training and fine-tuning T5-style models
Apache License 2.0
970 stars, 74 forks

AttributeError: Can't pickle local object 'IterableDataset.map.<locals>.<lambda>' #20

Closed: turian closed this issue 1 year ago

turian commented 1 year ago

First of all, thank you for this rigorous work. As another low-budget researcher, I applaud you.

I was curious to see how this would perform on Apple Metal (MPS). However, even in pure CPU mode I can't get it to run on OSX.

Any tips?

(nanoT5) joseph@JosephsacStudio nanoT5 % python3 -m nanoT5.main device=cpu model.compile=False precision=no
[2023-07-27 22:23:37,823][Main][INFO] - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cpu

Mixed precision type: no

[2023-07-27 22:23:37,823][Main][INFO] - Working directory is /Users/joseph/dev/nanoT5/logs/2023-07-27/22-23-37-
loading configuration file config.json from cache at /Users/joseph/.cache/huggingface/hub/models--google--t5-v1_1-base/snapshots/b5fc947a416ea3cb079532cb3c2bbadeb7f800fc/config.json
Model config T5Config {
  "_name_or_path": "google/t5-v1_1-base",
  "architectures": [
    "T5ForConditionalGeneration"
  ],
  "d_ff": 2048,
  "d_kv": 64,
  "d_model": 768,
  "decoder_start_token_id": 0,
  "dense_act_fn": "gelu_new",
  "dropout_rate": 0.1,
  "eos_token_id": 1,
  "feed_forward_proj": "gated-gelu",
  "initializer_factor": 1.0,
  "is_encoder_decoder": true,
  "is_gated_act": true,
  "layer_norm_epsilon": 1e-06,
  "model_type": "t5",
  "num_decoder_layers": 12,
  "num_heads": 12,
  "num_layers": 12,
  "output_past": true,
  "pad_token_id": 0,
  "relative_attention_max_distance": 128,
  "relative_attention_num_buckets": 32,
  "tie_word_embeddings": false,
  "transformers_version": "4.31.0",
  "use_cache": true,
  "vocab_size": 32128
}

loading configuration file config.json from cache at /Users/joseph/.cache/huggingface/hub/models--google--t5-v1_1-base/snapshots/b5fc947a416ea3cb079532cb3c2bbadeb7f800fc/config.json
Model config T5Config (identical to the block above)

loading file spiece.model from cache at /Users/joseph/.cache/huggingface/hub/models--google--t5-v1_1-base/snapshots/b5fc947a416ea3cb079532cb3c2bbadeb7f800fc/spiece.model
loading file tokenizer.json from cache at None
loading file added_tokens.json from cache at None
loading file special_tokens_map.json from cache at /Users/joseph/.cache/huggingface/hub/models--google--t5-v1_1-base/snapshots/b5fc947a416ea3cb079532cb3c2bbadeb7f800fc/special_tokens_map.json
loading file tokenizer_config.json from cache at /Users/joseph/.cache/huggingface/hub/models--google--t5-v1_1-base/snapshots/b5fc947a416ea3cb079532cb3c2bbadeb7f800fc/tokenizer_config.json
loading configuration file config.json from cache at /Users/joseph/.cache/huggingface/hub/models--google--t5-v1_1-base/snapshots/b5fc947a416ea3cb079532cb3c2bbadeb7f800fc/config.json
Model config T5Config (identical to the block above)

You are using the legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This means that tokens that come after special tokens will not be properly handled. We recommend you to read the related pull request available at https://github.com/huggingface/transformers/pull/24565
loading configuration file config.json from cache at /Users/joseph/.cache/huggingface/hub/models--google--t5-v1_1-base/snapshots/b5fc947a416ea3cb079532cb3c2bbadeb7f800fc/config.json
Model config T5Config (identical to the block above)

Error executing job with overrides: ['device=cpu', 'model.compile=False', 'precision=no']
Traceback (most recent call last):
  File "/Users/joseph/dev/nanoT5/nanoT5/main.py", line 65, in main
    train(model, train_dataloader, test_dataloader, accelerator,
  File "/Users/joseph/dev/nanoT5/nanoT5/utils/train_utils.py", line 186, in train
    for batch_id, batch in enumerate(train_dataloader, start=1):
  File "/opt/homebrew/anaconda3/envs/nanoT5/lib/python3.8/site-packages/accelerate/data_loader.py", line 550, in __iter__
    main_iterator = super().__iter__()
  File "/opt/homebrew/anaconda3/envs/nanoT5/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 441, in __iter__
    return self._get_iterator()
  File "/opt/homebrew/anaconda3/envs/nanoT5/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 388, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "/opt/homebrew/anaconda3/envs/nanoT5/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1042, in __init__
    w.start()
  File "/opt/homebrew/anaconda3/envs/nanoT5/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/opt/homebrew/anaconda3/envs/nanoT5/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/opt/homebrew/anaconda3/envs/nanoT5/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/opt/homebrew/anaconda3/envs/nanoT5/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/opt/homebrew/anaconda3/envs/nanoT5/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/opt/homebrew/anaconda3/envs/nanoT5/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/opt/homebrew/anaconda3/envs/nanoT5/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'IterableDataset.map.<locals>.<lambda>'

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
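
If I'm reading the traceback right: since Python 3.8, macOS defaults to the spawn start method for multiprocessing, so DataLoader workers are created by pickling the dataset, and pickle refuses the local lambda that datasets' IterableDataset.map wraps around the mapping function. A minimal sketch of the same failure, independent of nanoT5 (make_mapper is a hypothetical stand-in for the library internals):

import pickle

def make_mapper():
    # Hypothetical stand-in for IterableDataset.map, which wraps the user's
    # function in a lambda defined locally inside the method body.
    return lambda x: x + 1

# Spawned DataLoader workers are built by pickling the dataset and everything
# attached to it, so this is essentially what fails:
try:
    pickle.dumps(make_mapper())
except AttributeError as e:
    print(e)  # Can't pickle local object 'make_mapper.<locals>.<lambda>'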
PiotrNawrot commented 1 year ago

I managed to reproduce it on OSX. I can't tell you what the reason behind this issue is, but my guess is that it lies somewhere between Huggingface and Python on OSX. Setting data.num_workers=0 solves the issue :). If you find a way to fix it, I would highly appreciate a comment or a merge request.
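
For the command above, that means:

python3 -m nanoT5.main device=cpu model.compile=False precision=no data.num_workers=0

If you want to keep the workers, here is a sketch of an alternative I haven't tested on this repo: force the fork start method for the DataLoader, so workers inherit the dataset instead of unpickling a spawned copy. (Fork on macOS can be fragile with some system frameworks, so treat it as an experiment.) All names below are placeholders, not nanoT5 code:

import torch
from torch.utils.data import DataLoader, IterableDataset

class TinyStream(IterableDataset):
    # Placeholder for the tokenized HF IterableDataset; the real one carries
    # the un-picklable lambda from .map().
    def __iter__(self):
        for i in range(8):
            yield torch.tensor([i])

if __name__ == "__main__":
    loader = DataLoader(
        TinyStream(),
        batch_size=2,
        num_workers=2,
        multiprocessing_context="fork",  # standard DataLoader argument
    )
    # Note: without worker sharding, each of the 2 workers yields the full
    # stream, so items appear twice here; this only demonstrates that the
    # workers start without the pickling error.
    for batch in loader:
        print(batch)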

Glad that you liked my repo. Good luck with your research!