PelicanPlatform / pelicanfs

An fsspec implementation that uses the pelican client
https://pelicanplatform.org/
Apache License 2.0

RuntimeError: This class is not fork-safe #59

Open Kristy-an opened 5 months ago

Kristy-an commented 5 months ago
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "<ipython-input-17-cd8dc175aedc>", line 41, in __getitem__
    sample = self.default_loader(path)
  File "<ipython-input-17-cd8dc175aedc>", line 33, in default_loader
    with self.fs.open(path, 'rb') as f:
  File "/usr/local/lib/python3.10/dist-packages/pelicanfs/core.py", line 552, in open
    data_url = sync(self.loop, self.get_origin_cache if self.directReads else self.get_working_cache, path)
  File "/usr/local/lib/python3.10/dist-packages/fsspec/asyn.py", line 328, in loop
    raise RuntimeError("This class is not fork-safe")
RuntimeError: This class is not fork-safe

From ChatGPT: The error RuntimeError: This class is not fork-safe indicates that the PelicanFileSystem instance cannot be used inside worker processes that PyTorch's DataLoader creates with the fork start method, which is the default on Linux for this Python version. The filesystem object is created in the parent process and then used from a forked worker. As the last frames of the traceback show, the error is raised by fsspec (fsspec/asyn.py, in loop) when pelicanfs's open accesses self.loop.
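
To make the mechanism concrete, here is a simplified sketch of the kind of check that produces this error. It is modeled on the `loop` property that the traceback points at in fsspec/asyn.py, but the class name `PidGuardedResource` is hypothetical and the real class does considerably more.

```python
import os


class PidGuardedResource:
    """Hypothetical stand-in for an object that refuses to be used after fork."""

    def __init__(self):
        # Remember which process created this object.
        self._pid = os.getpid()
        # Placeholder for the event loop the real class would hold.
        self._loop = object()

    @property
    def loop(self):
        # A forked child inherits this object but has a different PID,
        # so the first access in the child raises.
        if self._pid != os.getpid():
            raise RuntimeError("This class is not fork-safe")
        return self._loop
```

With the fork start method, a DataLoader worker inherits a copy of the dataset, including its filesystem object, so a check like this fails the first time the worker touches `loop`.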

Kristy-an commented 5 months ago

Adding this code at the beginning of the script, before the DataLoader is created, solves the issue.

import torch.multiprocessing as mp

# Set multiprocessing start method to 'spawn'
mp.set_start_method('spawn', force=True)
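
If changing the global start method is not desirable, another pattern sometimes used for this kind of error is to create the resource lazily in each process instead of sharing one instance across a fork. The sketch below is a generic, standard-library-only illustration. The `LazyPerProcess` helper and the `factory` argument are hypothetical names, and this has not been tested against pelicanfs; whether it works there may depend on the installed fsspec and pelicanfs versions.

```python
import os


class LazyPerProcess:
    """Hypothetical helper: build a resource on first use in each process."""

    def __init__(self, factory):
        self._factory = factory  # zero-argument callable that builds the resource
        self._pid = None         # PID of the process that built the current resource
        self._resource = None

    def get(self):
        pid = os.getpid()
        # Rebuild when nothing exists yet or when running in a different
        # process than the one that built the cached resource.
        if self._resource is None or self._pid != pid:
            self._resource = self._factory()
            self._pid = pid
        return self._resource
```

In a dataset class, the idea would be to store something like `LazyPerProcess(make_fs)` in `__init__`, where `make_fs` is a hypothetical function that constructs the filesystem, and to call `.get()` inside `__getitem__`, so that each worker constructs its own instance after it starts.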