jxmorris12 / vec2text

utilities for decoding deep representations (like sentence embeddings) back to text

PicklingError while running corrector training #18

Closed siebeniris closed 8 months ago

siebeniris commented 8 months ago

Hi, sorry for the long error output below. dataset_map_multi_worker seems to work fine in other places, so I am not sure what the issue is here.

Thanks for the help in advance!

train() called – resume-from_checkpoint = None
    [None] Saving hypotheses to path saves/.cache/inversion/ad2e50a2989171a5_hypotheses.cache
Precomputing hypotheses for data (num_proc=6):   0%|          | 0/392702 [00:11<?, ? examples/s]
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/xxx/xxx//vec2text/run.py", line 16, in <module>
    main()
  File "/home/xxx/xxx//vec2text/run.py", line 12, in main
    experiment.run()
  File "/home/xxx/xxx//vec2text/experiments.py", line 152, in run
    self.train()
  File "/home/xxx/xxx//vec2text/experiments.py", line 185, in train
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/xxx.local/lib/python3.10/site-packages/transformers/trainer.py", line 1555, in train
    return inner_training_loop(
  File "/home/xxx/xxx//vec2text/trainers/corrector.py", line 227, in _inner_training_loop
    self.precompute_hypotheses()
  File "/home/xxx/xxx//vec2text/trainers/corrector.py", line 212, in precompute_hypotheses
    self.train_dataset, train_cache_path = self._preprocess_dataset_hypotheses(
  File "/home/xxx/xxx//vec2text/trainers/corrector.py", line 166, in _preprocess_dataset_hypotheses
    dataset = dataset_map_multi_worker(
  File "/home/xxx/xxx//vec2text/utils/utils.py", line 119, in dataset_map_multi_worker
    return dataset.map(map_fn, *args, **kwargs)
  File "/home/xxx.local/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 591, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/xxx.local/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 556, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/home/xxx.local/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3181, in map
    for rank, done, content in iflatmap_unordered(
  File "/home/xxx.local/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 1417, in iflatmap_unordered
    [async_result.get(timeout=0.05) for async_result in async_results]
  File "/home/xxx.local/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 1417, in <listcomp>
    [async_result.get(timeout=0.05) for async_result in async_results]
  File "/home/xxx.local/lib/python3.10/site-packages/multiprocess/pool.py", line 774, in get
    raise self._value
  File "/home/xxx.local/lib/python3.10/site-packages/multiprocess/pool.py", line 540, in _handle_tasks
    put(task)
  File "/home/xxx.local/lib/python3.10/site-packages/multiprocess/connection.py", line 209, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/xxx.local/lib/python3.10/site-packages/multiprocess/reduction.py", line 54, in dumps
    cls(buf, protocol, *args, **kwds).dump(obj)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 418, in dump
    StockPickler.dump(self, obj)
  File "/usr/lib/python3.10/pickle.py", line 487, in dump
    self.save(obj)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/lib/python3.10/pickle.py", line 902, in save_tuple
    save(element)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/lib/python3.10/pickle.py", line 887, in save_tuple
    save(element)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1212, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.10/pickle.py", line 972, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.10/pickle.py", line 998, in _batch_setitems
    save(v)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.10/pickle.py", line 692, in save_reduce
    save(args)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/lib/python3.10/pickle.py", line 887, in save_tuple
    save(element)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1453, in save_instancemethod0
    pickler.save_reduce(MethodType, (obj.__func__, obj.__self__), obj=obj)
  File "/usr/lib/python3.10/pickle.py", line 692, in save_reduce
    save(args)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/lib/python3.10/pickle.py", line 887, in save_tuple
    save(element)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.10/pickle.py", line 717, in save_reduce
    save(state)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1212, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.10/pickle.py", line 972, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.10/pickle.py", line 998, in _batch_setitems
    save(v)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.10/pickle.py", line 717, in save_reduce
    save(state)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1212, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.10/pickle.py", line 972, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.10/pickle.py", line 998, in _batch_setitems
    save(v)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.10/pickle.py", line 717, in save_reduce
    save(state)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1212, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.10/pickle.py", line 972, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.10/pickle.py", line 998, in _batch_setitems
    save(v)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.10/pickle.py", line 713, in save_reduce
    self._batch_setitems(dictitems)
  File "/usr/lib/python3.10/pickle.py", line 998, in _batch_setitems
    save(v)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python3.10/pickle.py", line 687, in save_reduce
    save(cls)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1812, in save_type
    _save_with_postproc(pickler, (_create_type, (
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1093, in _save_with_postproc
    pickler.save_reduce(*reduction, obj=obj)
  File "/usr/lib/python3.10/pickle.py", line 692, in save_reduce
    save(args)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/lib/python3.10/pickle.py", line 902, in save_tuple
    save(element)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1212, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.10/pickle.py", line 972, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.10/pickle.py", line 998, in _batch_setitems
    save(v)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1965, in save_function
    _save_with_postproc(pickler, (_create_function, (
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1093, in _save_with_postproc
    pickler.save_reduce(*reduction, obj=obj)
  File "/usr/lib/python3.10/pickle.py", line 692, in save_reduce
    save(args)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/lib/python3.10/pickle.py", line 902, in save_tuple
    save(element)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/lib/python3.10/pickle.py", line 902, in save_tuple
    save(element)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1812, in save_type
    _save_with_postproc(pickler, (_create_type, (
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1093, in _save_with_postproc
    pickler.save_reduce(*reduction, obj=obj)
  File "/usr/lib/python3.10/pickle.py", line 692, in save_reduce
    save(args)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/usr/lib/python3.10/pickle.py", line 902, in save_tuple
    save(element)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1212, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python3.10/pickle.py", line 972, in save_dict
    self._batch_setitems(obj.items())
  File "/usr/lib/python3.10/pickle.py", line 998, in _batch_setitems
    save(v)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1965, in save_function
    _save_with_postproc(pickler, (_create_function, (
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 1107, in _save_with_postproc
    pickler._batch_setitems(iter(source.items()))
  File "/usr/lib/python3.10/pickle.py", line 998, in _batch_setitems
    save(v)
  File "/home/xxx.local/lib/python3.10/site-packages/dill/_dill.py", line 412, in save
    StockPickler.save(self, obj, save_persistent_id)
  File "/usr/lib/python3.10/pickle.py", line 589, in save
    self.save_global(obj, rv)
  File "/usr/lib/python3.10/pickle.py", line 1071, in save_global
    raise PicklingError(
_pickle.PicklingError: Can't pickle <built-in function clear_profiler_hooks>: it's not found as torch._C._dynamo.eval_frame.clear_profiler_hooks

siebeniris commented 8 months ago

It is dataset_map_multi_worker after all: it works when num_proc=1, but then precomputing takes an hour even with a much smaller dataset. Is this an incompatibility with transformers? It looks similar to this issue: https://discuss.huggingface.co/t/map-multiprocessing-issue/4085/2
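
For anyone hitting the same traceback: with num_proc > 1 the map function is serialized and sent to worker processes, so anything it drags along (here a bound method and the state of the object it belongs to) must be picklable too. Below is a minimal sketch of a defensive check that falls back to a single process when serialization fails. The helper name safe_num_proc is hypothetical and not part of vec2text or datasets, and it uses the standard-library pickle as a stand-in: datasets serializes with dill, which accepts more objects, so this check is stricter than what the library actually does.

```python
import functools
import pickle
import threading


def safe_num_proc(map_fn, requested_num_proc):
    """Return requested_num_proc if map_fn can be pickled, otherwise 1.

    Hypothetical helper: stdlib pickle is used as an approximation of the
    dill-based serialization that multi-process map relies on.
    """
    if requested_num_proc <= 1:
        return 1
    try:
        pickle.dumps(map_fn)
    except (pickle.PicklingError, TypeError, AttributeError) as exc:
        # Serialization failed, so worker processes could not receive map_fn.
        print(f"map_fn is not picklable ({exc!r}); falling back to num_proc=1")
        return 1
    return requested_num_proc


# A built-in function is pickled by reference, so multiple workers are fine.
print(safe_num_proc(len, 6))

# A callable that carries unpicklable state (a lock here) triggers the fallback.
print(safe_num_proc(functools.partial(len, threading.Lock()), 6))
```

The fallback trades speed for robustness, which matches the observation above that num_proc=1 works but is slow. It does not remove the underlying incompatibility.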

Ariya12138 commented 8 months ago

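> It is dataset_map_multi_worker after all. It works when num_proc=1. Then precomputing takes an hour with a much smaller dataset. Is this some incompatibility issue with transformers? Similar issue as this one: https://discuss.huggingface.co/t/map-multiprocessing-issue/4085/2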

I ran into the same problem, and I also set num_proc=1.

jxmorris12 commented 8 months ago

Are you using Windows or Linux? Have you updated all the relevant packages? (try pip install accelerate dill datasets optimum transformers --upgrade)
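
To answer both questions in one go, a short script along these lines can report the operating system and the installed versions of the relevant packages. This is only a sketch: the package list is the one from the pip command above plus torch and multiprocess, which are added here because they appear in the traceback, and the function name installed_versions is hypothetical.

```python
import platform
from importlib import metadata

# Packages from the upgrade command, plus two seen in the traceback (assumption).
PACKAGES = [
    "accelerate", "dill", "datasets", "optimum", "transformers",
    "torch", "multiprocess",
]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print("OS:", platform.system(), platform.release())
print("Python:", platform.python_version())
for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

Running it before and after the upgrade shows which versions actually changed.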

siebeniris commented 8 months ago

@jxmorris12 thanks! That works!