Open lorenzoh opened 2 years ago
I've hit this issue on my Windows 10 PC when trying to download the imagenette2-160 dataset or any other fast.ai dataset.
julia> load(datarecipes()["imagenette2-160"])
ERROR: IOError: rm("C:\\Users\\OKON SAMUEL\\.julia\\datadeps\\fastai-imagenette2-160"): resource busy or locked (EBUSY)
Stacktrace:
[1] uv_error
@ .\libuv.jl:97 [inlined]
[2] rm(path::String; force::Bool, recursive::Bool)
@ Base.Filesystem .\file.jl:299
[3] checkfor_mv_cp_cptree(src::String, dst::String, txt::String; force::Bool)
@ Base.Filesystem .\file.jl:323
[4] #mv#17
@ .\file.jl:411 [inlined]
[5] (::FastAI.Datasets.var"#8#9")(f::String)
@ FastAI.Datasets C:\Users\OKON SAMUEL\.julia\packages\FastAI\sjHxr\src\datasets\fastaidatasets.jl:135
...
I think it's due to the use of mv(src, dst; force) with force=true here, which removes dst if it already exists. This would lead to an error on Windows because the dst folder is open by another process here. A similar scenario occurs here.
Is there any reason for making use of temp folders above?
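For reference, the behaviour of mv with force=true can be reproduced in a scratch directory. This is only a minimal sketch: the paths and file names are made up for illustration and nothing here touches the real datadeps folder. On Windows, the removal step is where an EBUSY error would surface if another process held dst open.

```julia
# Scratch directory; src, dst and the file names are illustrative only.
root = mktempdir()
src = joinpath(root, "src")
dst = joinpath(root, "dst")
mkpath(src)
mkpath(dst)
write(joinpath(src, "new.txt"), "new")
write(joinpath(dst, "old.txt"), "old")

# With force=true, an existing dst is removed first, then src is moved to dst.
mv(src, dst; force=true)

@show isfile(joinpath(dst, "new.txt"))  # the moved content
@show isfile(joinpath(dst, "old.txt"))  # whatever dst held before the move
@show ispath(src)                       # src itself after the move
```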
Thanks for the detailed report of the issue! It is not inherently necessary to use a temporary folder, but I didn't figure out a less clunky solution. What it does is copy the files that are unpacked into a subfolder in the datadep directory to the datadep directory itself. For example, instead of having .../datadeps/imagenette/imagenette/train/cls/*.jpg you have .../datadeps/imagenette/train/cls/*.jpg
If you happen to find a better solution that doesn't error on Windows, I'd be happy to review a PR!
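One possible direction, sketched under assumptions: lift each entry of the extracted subfolder up one level instead of replacing the datadep directory as a whole, so that mv never has to remove an existing dst. The helper name flatten_single_subdir is hypothetical and not part of FastAI.jl or DataDeps.jl, and I have not verified that this avoids the EBUSY error on Windows.

```julia
# Hypothetical helper (not part of FastAI.jl or DataDeps.jl): if `dir` contains
# exactly one subfolder, move that subfolder's entries up into `dir` and then
# delete the emptied subfolder.
function flatten_single_subdir(dir::AbstractString)
    subdirs = filter(name -> isdir(joinpath(dir, name)), readdir(dir))
    # Only act when extraction produced exactly one top-level folder.
    length(subdirs) == 1 || return dir
    inner = joinpath(dir, first(subdirs))
    for name in readdir(inner)
        # No force=true: an existing target raises an error instead of being
        # removed. This also means an entry named like the subfolder itself
        # is not handled by this sketch.
        mv(joinpath(inner, name), joinpath(dir, name))
    end
    rm(inner)  # the subfolder is empty at this point
    return dir
end

# Usage on a throwaway directory that mimics the unpacked layout.
root = mktempdir()
mkpath(joinpath(root, "imagenette", "train", "cls"))
write(joinpath(root, "imagenette", "train", "cls", "a.jpg"), "x")
flatten_single_subdir(root)
```

Inside post_fetch_method this would presumably be called on pwd() after DataDeps.unpack(f), but that wiring is an assumption on my side.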
I'm currently using the following code on my Windows PC and it all works.
function DataDeps.DataDep(d::FastAIDataset)
    return DataDep(
        "fastai-$(d.datadepname)",
        """
        "$(d.name)" from the fastai dataset repository (https://course.fast.ai/datasets)
        $(d.description)
        Download size: $(d.size)
        """,
        "$(ROOT_URL)$(d.subfolder)/$(d.name).$(d.extension)",
        d.checksum,
        post_fetch_method=function (f)
            DataDeps.unpack(f)
            extracted = readdir(pwd())[1]
            pwd()
        end,
    )
end
and it creates .../datadeps/imagenette/train/cls/*.jpg as required. See here. I'll open a PR as soon as I'm free.
That looks promising, thanks for taking the time!
CI fails on Windows at the moment due to file permission errors that occur when downloading datadeps. See for example this run: https://github.com/FluxML/FastAI.jl/runs/6899007003?check_suite_focus=true