On Windows, when using run_finetuning it fails with cannot pickle 'builtins.CoreBPE' object

flesnuk commented 1 year ago

Running on Windows fails with this error: TypeError: cannot pickle 'builtins.CoreBPE' object

I only found this relevant thread while googling. https://discuss.pytorch.org/t/pytorch-windows-eoferror-ran-out-of-input-when-num-workers-0/25918

Setting num_workers to 0 in the DataLoader seems to work. I don't know the implications of this, or whether there is a proper fix for the error, but it may be useful for someone in the same situation.
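On the implications: with num_workers=0, PyTorch's DataLoader loads batches in the main process instead of in worker processes, so loading can be slower but the batches themselves are the same. A minimal sketch of choosing the value from the multiprocessing start method is below. The helper name pick_num_workers is hypothetical and not part of this repository; the idea is only that worker arguments get pickled under any start method other than "fork".

```python
import multiprocessing as mp


def pick_num_workers(requested, start_method=None):
    """Return `requested` only when workers are forked; otherwise fall back to 0.

    Hypothetical helper, not part of whisper-finetuning. Under "spawn"
    (the default on Windows) and "forkserver", the worker arguments,
    including the dataset, are pickled and sent to the child process,
    so an unpicklable attribute makes worker start-up fail.
    """
    if start_method is None:
        # Note: this call also fixes the default start method for the process.
        start_method = mp.get_start_method()
    return requested if start_method == "fork" else 0


# Usage sketch (needs torch, so it is left as a comment here):
# loader = DataLoader(dataset, batch_size=16, num_workers=pick_num_workers(4))
```

This only avoids the error; it does not make the dataset picklable.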

Traceback (most recent call last):
  File "D:\ai\whisper-finetuning\run_finetuning.py", line 282, in <module>
    main()
  File "D:\ai\whisper-finetuning\run_finetuning.py", line 271, in main
    main_loop(
  File "D:\ai\whisper-finetuning\run_finetuning.py", line 201, in main_loop
    min_loss = evaluate(model, dev_loader)
  File "D:\ai\whisper-finetuning\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\ai\whisper-finetuning\run_finetuning.py", line 160, in evaluate
    for x, y_in, y_out in tqdm(dev_loader):
  File "D:\ai\whisper-finetuning\venv\lib\site-packages\tqdm\std.py", line 1178, in __iter__
    for obj in iterable:
  File "D:\ai\whisper-finetuning\venv\lib\site-packages\torch\utils\data\dataloader.py", line 444, in __iter__
    return self._get_iterator()
  File "D:\ai\whisper-finetuning\venv\lib\site-packages\torch\utils\data\dataloader.py", line 390, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "D:\ai\whisper-finetuning\venv\lib\site-packages\torch\utils\data\dataloader.py", line 1077, in __init__
    w.start()
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\multiprocessing\context.py", line 336, in _Popen
    return Popen(process_obj)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: cannot pickle 'builtins.CoreBPE' object

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2800.0_x64__qbz5n2kfra8p0\lib\multiprocessing\spawn.py", line 126, in _main
EOFError: Ran out of input
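
The TypeError is raised in the main process while multiprocessing pickles the DataLoader worker (popen_spawn_win32.py, reduction.dump), and the EOFError looks like the spawned child failing to read the incomplete pickle. A possible fix, sketched below with the standard library only, is to keep the unpicklable object out of the pickled state and rebuild it lazily in each process. FakeCoreBPE, build_tokenizer, EagerDataset and LazyDataset are hypothetical stand-ins, and it is an assumption that the CoreBPE object is reached through a tokenizer attribute of the dataset.

```python
import pickle


class FakeCoreBPE:
    """Hypothetical stand-in for a native object that refuses to be pickled."""

    def __reduce__(self):
        raise TypeError("cannot pickle 'FakeCoreBPE' object")


def build_tokenizer():
    # Hypothetical factory; the real code would construct its tokenizer here.
    return FakeCoreBPE()


class EagerDataset:
    """Stores the tokenizer as a plain attribute, so pickling the dataset fails."""

    def __init__(self):
        self.tokenizer = build_tokenizer()


class LazyDataset:
    """Creates the tokenizer on first use and never includes it in the pickle."""

    def __init__(self):
        self._tokenizer = None

    @property
    def tokenizer(self):
        if self._tokenizer is None:
            self._tokenizer = build_tokenizer()
        return self._tokenizer

    def __getstate__(self):
        # Drop the unpicklable object; each process rebuilds it on demand.
        state = self.__dict__.copy()
        state["_tokenizer"] = None
        return state


def can_pickle(obj):
    """Return True if pickle.dumps succeeds, False if it raises TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False
```

With this pattern, a spawned worker would receive a dataset whose tokenizer is None and rebuild it the first time it is needed, at the cost of one tokenizer construction per worker.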

flesnuk commented 1 year ago

Since I only have 8GB of VRAM, I can't run it either way, even with the small model and --dev-batch-size 1. 😅

jumon commented 1 year ago

Thanks! I'm gonna look into it, but it may take a bit because I don't have a Windows machine. For training on small VRAM, using the tiny model or specifying --train-only-decoder might help.

flesnuk commented 1 year ago

Hi @jumon, no problem. I don't fully understand the issue, but using num_workers=0 is working as far as I can see.

For the VRAM, I see an easy solution: using bitsandbytes, for which I intend to open a PR today. Using the 8-bit Adam optimizer, I can train the small model with 8GB of VRAM on Windows. I will add a flag to make the 8-bit optimizer optional.
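
A rough sketch of what such an optional flag could look like follows. The flag name --use-adam-8bit and the helper make_optimizer are assumptions made for illustration and may not match the eventual PR; the fallback to torch.optim.AdamW is also an assumption about the existing optimizer. bitsandbytes is imported only when the flag is set, so it stays an optional dependency.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="optimizer selection sketch")
    # Hypothetical flag name; the actual PR may choose a different one.
    parser.add_argument(
        "--use-adam-8bit",
        action="store_true",
        help="use the 8-bit Adam optimizer from bitsandbytes to reduce VRAM usage",
    )
    return parser


def make_optimizer(params, lr, use_adam_8bit):
    """Return an 8-bit Adam optimizer if requested, else a regular one."""
    if use_adam_8bit:
        # Imported lazily so bitsandbytes is needed only with the flag.
        import bitsandbytes as bnb

        return bnb.optim.Adam8bit(params, lr=lr)
    # Assumed default; the repository's actual optimizer may differ.
    import torch

    return torch.optim.AdamW(params, lr=lr)


# Usage sketch:
# args = build_parser().parse_args()
# optimizer = make_optimizer(model.parameters(), lr=1e-5, use_adam_8bit=args.use_adam_8bit)
```

The 8-bit optimizer stores its state in reduced precision, which is where the VRAM saving comes from.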

jumon commented 1 year ago

> For the VRAM, I see an easy solution: using bitsandbytes, for which I intend to open a PR today.

That would be nice! Thanks!

jumon / whisper-finetuning

On Windows, when using run_finetuning it fails with cannot pickle 'builtins.CoreBPE' object #3