Open sugibuchi opened 1 week ago
I see your point. A context exit should certainly make the file closed, and calling it twice during del is pretty bad (I'll comment on that issue in a moment). However, I wonder whether there's space in the scheme you propose (with finally
) for what to do when the final upload fails. Should other things be cleaned up at the time (buffer/cache, etc)?
However, I wonder whether there's space in the scheme you propose (with finally) for what to do when the final upload fails. Should other things be cleaned up at the time (buffer/cache, etc)?
I think a better implementation is to mark a file object "closed" immediately after flush()
regardless of its result, as this flush()
is supposed to be the last operation for the file object to touch the file.
def close(self):
try:
...
else:
if not self.forced:
self.flush(force=True)
finally:
self.closed = True
self.buffer = None
self.cache = None
if self.fs is not None:
self.fs.invalidate_cache(self.path)
self.fs.invalidate_cache(self.fs._parent(self.path))
Then we should release resources and perform other operations, such as cleaning, etc. after marking the file object closed
.
With this modification, we can ensure that a concrete file object class inheriting AbstractBufferedFile
always releases its internal resources after closing a file (regardless of whether it succeeded or not) using the following idiom.
class MyFile(AbstractBufferedFile):
...
def close(self):
try:
super().close()
finally:
if not self.closed:
release_some_expensive_resources()
The current implementation of
close()
inAbstractBufferedFile
is like the following.https://github.com/fsspec/filesystem_spec/blob/2024.9.0/fsspec/spec.py#L2022-L2041
This method calls
flush()
to finalize file writing. However, the file will continue to be considered as "open" in case of exceptions inflush()
as the code does not reachself.closed = True
.If
close()
is called again, the file object will try to flush data again. This can lead to unexpected side effects, and it does not satisfy the convention defined byIOBase.close()
.As I reported in #1685, the garbage collection also calls this
close()
. Therefore, even if we explicitly close a file object once, itsclose()
is called at least twice ifflush()
throws an exception. In addition, the fileclose()
function's idempotence is generally an expected characteristic in many programming languages.To make the behaviour of file objects simpler and more predictable in case of errors, it would be better always to mark
self.closed = True
regardless the result offlush()
.