Closed lrebscher closed 4 years ago
I will provide a minimal example in the next days. However, as it is a race condition it won't be reproducible on each run.
I'm happy to contribute to this library by fixing this bug once it is accepted. Thank you!
Thanks for the report, sounds legit. I'm happy to review a PR!
Should be fixed by #359
Thanks!
Description
The creation of the dataset directory is not thread-safe and is subject to a race condition.
This can sometimes result in the following uncaught exception:
FileExistsError: [Errno 17] File exists: '/root/.surprise_data/'
The error is raised in get_dataset_dir (File "/usr/local/lib/python3.7/site-packages/surprise/builtin_datasets.py", line 23). That method checks whether the folder exists and, if it does not, creates the directory. This check-then-create sequence is subject to a race condition, because os.makedirs(folder) will fail if the directory already exists, for example when another thread created it between the check and the call.
This problem and two possible solutions for it are described in https://stackoverflow.com/a/42545343 .
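As a rough illustration of the two kinds of fix, the sketch below shows one variant that passes exist_ok=True to os.makedirs and one that catches FileExistsError. The function names ensure_dir, ensure_dir_catching and run_in_threads are hypothetical helpers made up for this example; they are not part of the library and not necessarily what the eventual fix looks like.

```python
import os
import tempfile
import threading


def ensure_dir(folder):
    """Create folder if needed; an already existing directory is not an error."""
    # exist_ok=True (Python 3.2+) makes the call tolerate an existing directory.
    os.makedirs(folder, exist_ok=True)
    return folder


def ensure_dir_catching(folder):
    """Same goal, by catching the error instead of checking first."""
    try:
        os.makedirs(folder)
    except FileExistsError:
        # Re-raise if the existing path is something other than a directory.
        if not os.path.isdir(folder):
            raise
    return folder


def run_in_threads(create, folder, n_threads=8):
    """Call create(folder) from several threads; return the exceptions raised."""
    errors = []
    start = threading.Barrier(n_threads)

    def worker():
        start.wait()  # release all threads together to maximize contention
        try:
            create(folder)
        except Exception as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        for create in (ensure_dir, ensure_dir_catching):
            target = os.path.join(base, create.__name__)
            print(create.__name__, run_in_threads(create, target))
```

Both variants drop the separate existence check, so there is no window between checking and creating in which another thread can interfere.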
This error has been observed when using the library in an application served by gunicorn with multiple gthreads.
Steps/Code to Reproduce
TODO: provide minimal code example.
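Until a real reproduction is available, here is a sketch that forces the problematic interleaving deterministically. It assumes the check-then-create pattern described above; racy_get_dir and reproduce are hypothetical stand-ins, not the library's code. The threading.Barrier is added only to make every thread pass the existence check before any of them creates the directory, and a temporary folder is used instead of the real dataset directory.

```python
import os
import tempfile
import threading


def racy_get_dir(folder, barrier):
    """Check-then-create, with a barrier that forces the bad interleaving."""
    if not os.path.exists(folder):
        # Artificial pause: wait until every other thread has passed the check too.
        barrier.wait()
        os.makedirs(folder)  # all but the first thread to get here raise FileExistsError
    return folder


def reproduce(n_threads=2):
    """Run racy_get_dir from n_threads threads; return the FileExistsErrors raised."""
    errors = []
    barrier = threading.Barrier(n_threads)
    with tempfile.TemporaryDirectory() as base:
        folder = os.path.join(base, ".surprise_data")

        def worker():
            try:
                racy_get_dir(folder, barrier)
            except FileExistsError as exc:
                errors.append(exc)

        threads = [threading.Thread(target=worker) for _ in range(n_threads)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    return errors


if __name__ == "__main__":
    for exc in reproduce():
        print(type(exc).__name__, exc)
```

In a real deployment the interleaving depends on scheduling, which is why the error only shows up occasionally.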
Expected Results
If the builtin dataset directory already exists, this is ignored, or the FileExistsError is caught and ignored.
Actual Results
In a setup with multiple threads, the library might throw a FileExistsError due to a race condition.
Versions