Open JWuerfel opened 1 week ago
Could you elaborate on what you mean by "It only fails for a specific dataset which [...] is not larger than other datasets that can be downloaded (and cached)."
The same code works perfectly with other datasets; only one dataset fails for me. We wondered whether the dataset might be too large, but the code works fine with bigger ones. This is the dataset I can't use:
Issue checklist
Description of the bug
Loading a certain dataset fails on Windows. It only fails for a specific dataset which [...] is not larger than other datasets that can be downloaded (and cached).
The path in the error message exists (it is created for the dataset in the cache folder) up to the spark folder, but the spark folder itself is empty. I have tried deleting the folder for the dataset as well as the entire cache, but it does not change anything. The error occurs whether the dataset path or the RID is specified as the input.
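To make the empty-folder observation easier to check, here is a minimal sketch that lists every empty directory under the cache. It uses only the Python standard library, not the foundry-dev-tools API. The helper name `find_empty_dirs` is hypothetical, and the cache location is an assumption taken from the path in the error message.

```python
from pathlib import Path


def find_empty_dirs(root):
    """Return every directory under root (root included) that has no entries."""
    root = Path(root)
    candidates = [root] + [p for p in root.rglob("*") if p.is_dir()]
    return sorted(p for p in candidates if not any(p.iterdir()))


if __name__ == "__main__":
    # Assumed cache location, copied from the error message; adjust if yours differs.
    cache_root = Path.home() / ".foundry-dev-tools" / ".cache" / "foundry-dev-tools"
    if cache_root.is_dir():
        for empty in find_empty_dirs(cache_root):
            print(empty)
    else:
        print(f"Cache directory not found: {cache_root}")
```

If the spark folder of the failing transaction shows up in the output while the spark folders of working datasets do not, that confirms the files were never written rather than being removed later.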
Another user has had the same error in the past (also on Windows), probably with a different dataset, although I have no more details.
Steps to reproduce this bug.
As it doesn't fail for all datasets and I don't know what makes the failing dataset different, it's difficult to describe, but here is my code that fails for a specific dataset.
Log output
FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\username\.foundry-dev-tools\.cache\foundry-dev-tools\dataset-RID\ri.foundry.main.transaction.1234.parquet\spark\part-1234.snappy.parquet'
Additional context
No response
Operating System
Windows
Your python version
3.10.15