aai-institute / lakefs-spec

An fsspec implementation for the lakeFS project.
http://lakefs-spec.org/
Apache License 2.0

Failed `put_file` uploads cause exceptions during Python interpreter exit #218

Closed nicholasjng closed 6 months ago

nicholasjng commented 6 months ago

Describe the bug

As the logs below show, Python garbage collection of the file object triggers another upload attempt, which should not happen.

That attempt fails in the same way as the first one, but this time the failure happens during interpreter exit, which is really not good.

The culprit is the call to self.close() inside AbstractBufferedFile.__del__. I'm not sure that this is fixable when autocommit=True, so it might be best to discard the buffer in LakeFSFile.commit if sys.exc_info() is populated.
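A rough sketch of the "discard if `sys.exc_info()` is populated" idea, using a toy stand-in rather than the real LakeFSFile. The class `ToyUploadFile` and its `attempts` counter are made up for illustration. It only demonstrates the check itself; whether `sys.exc_info()` is still populated by the time `__del__` runs at interpreter exit is not settled here.

```python
import sys


class ToyUploadFile:
    """Hypothetical stand-in for a buffered file that uploads on commit."""

    def __init__(self):
        self.buffer = b"data"
        self.attempts = 0     # number of upload attempts actually made
        self.closed = False

    def commit(self):
        # Assumed mitigation: if an exception is currently being handled,
        # throw the buffer away instead of attempting an upload.
        if sys.exc_info()[0] is not None:
            self.buffer = None
            return
        self.attempts += 1    # stands in for the real upload call

    def close(self):
        if self.closed:
            return
        self.commit()
        self.closed = True

    def __del__(self):
        self.close()
```

Calling `close()` inside an `except` block takes the discard path, while calling it with no exception in flight performs the (simulated) upload.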

Steps to reproduce

from fsspec.callbacks import TqdmCallback
from lakefs_spec import LakeFSFileSystem

def upload():
    fs = LakeFSFileSystem()
    # All that matters is that the target repository ("asdfghjkl") does not exist.
    fs.put_file(".sandbox/mnist.py", "asdfghjkl/main/mnist.py", precheck=False)

if __name__ == "__main__":
    upload()
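The double attempt can also be shown without a lakeFS server. This is a minimal sketch, assuming a toy class whose `close()` fails before the file is marked as closed, so that `__del__` retries it. `ToyBufferedFile`, `ATTEMPTS`, and `demo` are made-up names, not fsspec or lakefs-spec code.

```python
import gc

ATTEMPTS = []  # module-level log, so it survives garbage collection


class ToyBufferedFile:
    """Hypothetical stand-in for a buffered file that uploads on close."""

    def __init__(self, path):
        self.path = path
        self.closed = False

    def flush(self):
        # Simulate an upload to a repository that does not exist.
        ATTEMPTS.append(self.path)
        raise FileNotFoundError(f"404 repository not found: {self.path!r}")

    def close(self):
        if self.closed:
            return
        self.flush()        # raises, so the next line never runs
        self.closed = True

    def __del__(self):
        # The file still looks open, so finalization retries the upload.
        self.close()


def demo():
    ATTEMPTS.clear()
    f = ToyBufferedFile("repo/main/file.py")
    try:
        f.close()           # first attempt: raises normally
    except FileNotFoundError:
        pass
    del f                   # finalizer runs and makes a second attempt
    gc.collect()            # for interpreters without reference counting
    return len(ATTEMPTS)
```

The exception from the second attempt is not propagated. Python reports it on stderr as "Exception ignored in: ...", which is the same shape as the second traceback in the logs.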

Expected behaviour

The exception should be raised once, and the interpreter should then exit without a second upload attempt.

Logs, screens, other evidence of a bug

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/nicholasjunge/Workspaces/python/lakefs-spec/src/lakefs_spec/spec.py", line 744, in put_file
    with self.wrapped_api_call(rpath=rpath):
  File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/nicholasjunge/Workspaces/python/lakefs-spec/src/lakefs_spec/spec.py", line 200, in wrapped_api_call
    raise translate_lakefs_error(e, rpath=rpath, message=message, set_cause=set_cause)
FileNotFoundError: 404 repository not found: 'quickstart/main/mnist.py'

vvvvvvvvvvvvvvvvvvvv # !!!
python-BaseException
Exception ignored in: <function AbstractBufferedFile.__del__ at 0x1061ebec0>
Traceback (most recent call last):
  File "/Users/nicholasjunge/Workspaces/python/lakefs-spec/venv/lib/python3.11/site-packages/fsspec/spec.py", line 1952, in __del__
    self.close()
  File "/Users/nicholasjunge/Workspaces/python/lakefs-spec/venv/lib/python3.11/site-packages/fsspec/spec.py", line 1930, in close
    self.flush(force=True)
  File "/Users/nicholasjunge/Workspaces/python/lakefs-spec/src/lakefs_spec/spec.py", line 949, in flush
    if self._upload_chunk(final=force) is not False:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/nicholasjunge/Workspaces/python/lakefs-spec/src/lakefs_spec/spec.py", line 877, in _upload_chunk
    self.commit()
  File "/Users/nicholasjunge/Workspaces/python/lakefs-spec/src/lakefs_spec/spec.py", line 888, in commit
    with self.fs.wrapped_api_call(rpath=self.path):
  File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/nicholasjunge/Workspaces/python/lakefs-spec/src/lakefs_spec/spec.py", line 200, in wrapped_api_call
    raise translate_lakefs_error(e, rpath=rpath, message=message, set_cause=set_cause)
FileNotFoundError: 404 repository not found: 'quickstart/main/mnist.py'
  0%|          | 0/8033 [00:04<?, ?it/s]

Your operating system

macOS

Python version

Python 3.11.6

lakeFS-spec version

main@HEAD

lakeFS server version

v1.3.1

lakeFS SDK version

v1.3.0