Open Octoslav opened 1 week ago
@Octoslav are you saying that os.replace() raised by error by mistake? Or was there a concrete reason for it?
@Octoslav are you saying that os.replace() raised by error by mistake? Or was there a concrete reason for it?
As far as I know the race condition can occur: os.close()
is executed, but Windows still keeps the file open for a short time in some cases. So os.replace(local_filename, temp_filename)
throws an error.
shutil.move is a high level function and can handle such a case.
@jkhenning Hi!
Do you have any thoughts? The error is quite annoying for windows users in my team.
The solution seems OK to me, I'm just not comfortable with hiding errors in general 😄 Does this only happen in Windows? If so, I would prefer to have this behavior specific to windows, while keeping the previous behavior for other platforms
@jkhenning
Does this only happen in Windows?
Didn't managed to reproduced it on linux machines
I'm just not comfortable with hiding errors in general
I didn't hide the error, I just used an another instrument that works better on Windows
Any idea on the implications of this change in linux or other platforms?
@jkhenning
shutil.move
is multi-platformand works well everywhere
Describe the Bug
When uploading a dataset using
clearml-data sync
on Windows, the following error intermittently occurs:PermissionError: [WinError 5] Access is denied
.Investigation
The error is triggered on the following line in the code:
https://github.com/allegroai/clearml/blob/master/clearml/binding/artifacts.py#L1141C18-L1141C19.
This issue is unstable and difficult to reproduce consistently. However, it becomes a significant problem when working with large datasets. For instance, my team faced this error while attempting to upload a dataset containing 700 000 items. The process took approximately 5 hours to compute all file hashes, only for the
WinError 5
to unexpectedly appear.Steps to Reproduce
The following script simulates the aforementioned line of code and demonstrates the issue:
When running this script on multiple Windows machines, I observed about 6
PermissionError
exceptions over 100 000 iterations.Running the script as administrator does not resolve the issue.
Possible Solution
I suggest replacing the use of
os.replace(local_filename, temp_filename)
withshutil.move(local_filename, temp_filename)
. Both methods achieve the same functionality, butshutil.move
does not raise thePermissionError
on Windows.