martinpacesa / BindCraft

User friendly and accurate binder design pipeline
MIT License
288 stars 59 forks

Pandas EmptyDataError While Fixing Interface Residues #99

Open ahmedselim2017 opened 15 hours ago

ahmedselim2017 commented 15 hours ago

Hi, I was running multiple runs in parallel on multiple GPUs with the same output folder, and one of the runs exited with the error below while fixing interface residues.

The runs continued for more than 3 days without an error, and this error has only occurred once. Given how infrequent it is, could it be caused by a race condition where one run tries to read a file while another run is writing to it?

Traceback (most recent call last):
  File "[BindCraft_path]/bindcraft.py", line 396, in <module>
    failure_df = pd.read_csv(failure_csv)
  File "[miniforge3_path]/envs/BindCraft_old/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1026, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "[miniforge3_path]/envs/BindCraft_old/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 620, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "[miniforge3_path]/envs/BindCraft_old/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1620, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "[miniforge3_path]/envs/BindCraft_old/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1898, in _make_engine
    return mapping[engine](f, **self.options)
  File "[miniforge3_path]/envs/BindCraft_old/lib/python3.10/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 93, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "parsers.pyx", line 581, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
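
As a quick workaround on my side, something like the retrying read below would tolerate a CSV that another run has just truncated. This is only a minimal sketch; the helper name and retry parameters are made up for illustration and are not part of bindcraft.py:

```python
import time
import pandas as pd

def read_csv_with_retry(path, retries=3, delay=1.0):
    """Read a CSV, retrying briefly if another process has left it empty mid-write."""
    for _ in range(retries):
        try:
            return pd.read_csv(path)
        except pd.errors.EmptyDataError:
            # Another run may be rewriting the file right now; wait and try again.
            time.sleep(delay)
    # Fall back to an empty frame so the pipeline can keep going.
    return pd.DataFrame()

# e.g. failure_df = read_csv_with_retry(failure_csv)
```

This only papers over the symptom, though, since a write can still land between retries.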
martinpacesa commented 14 hours ago

Yeah, that's probably it, I have seen it before but it only happened when I ran like 50 jobs at once :D

ahmedselim2017 commented 12 hours ago

If you like, I can implement filelock to lock the log files during reads and writes, since a crashed job can waste compute if it goes unnoticed.
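
Roughly something like the sketch below using the filelock package. The .lock file naming and timeout values are my assumptions, not existing BindCraft code:

```python
import pandas as pd
from filelock import FileLock

def locked_read_csv(path, timeout=60):
    """Read a CSV while holding a sibling .lock file,
    so a reader never sees a half-written or truncated file."""
    with FileLock(f"{path}.lock", timeout=timeout):
        return pd.read_csv(path)

def locked_append_row(path, row_df, timeout=60):
    """Append a row to the CSV under the same lock used for reads."""
    with FileLock(f"{path}.lock", timeout=timeout):
        row_df.to_csv(path, mode="a", header=False, index=False)
```

For this to help, every place that writes failure_csv (and the other shared CSVs) would need to take the same lock, otherwise readers can still catch a file mid-write.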