FluxML / FastAI.jl

Repository of best practices for deep learning in Julia, inspired by fastai
https://fluxml.ai/FastAI.jl
MIT License

Windows CI failure #238

Open lorenzoh opened 2 years ago

lorenzoh commented 2 years ago

CI fails on Windows at the moment due to file permission errors that occur when downloading datadeps. See for example this run: https://github.com/FluxML/FastAI.jl/runs/6899007003?check_suite_focus=true

OkonSamuel commented 2 years ago

I've hit this issue on my Windows 10 PC when trying to download the imagenette2-160 dataset or any other fast.ai dataset.

julia> load(datarecipes()["imagenette2-160"])
ERROR: IOError: rm("C:\\Users\\OKON SAMUEL\\.julia\\datadeps\\fastai-imagenette2-160"): resource busy or locked (EBUSY)
Stacktrace:
  [1] uv_error
    @ .\libuv.jl:97 [inlined]
  [2] rm(path::String; force::Bool, recursive::Bool)
    @ Base.Filesystem .\file.jl:299
  [3] checkfor_mv_cp_cptree(src::String, dst::String, txt::String; force::Bool)
    @ Base.Filesystem .\file.jl:323
  [4] #mv#17
    @ .\file.jl:411 [inlined]
  [5] (::FastAI.Datasets.var"#8#9")(f::String)
    @ FastAI.Datasets C:\Users\OKON SAMUEL\.julia\packages\FastAI\sjHxr\src\datasets\fastaidatasets.jl:135
...

I think it's due to the use of mv(src, dst; force) with force=true here, which removes dst if it already exists. This leads to an error on Windows because the dst folder is held open by another process here. A similar scenario occurs here. Is there any reason for using temp folders above?
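To make the behaviour of force=true concrete, here is a minimal sketch using throwaway temp directories. The file names are made up for illustration, and the snippet only shows that an existing dst is deleted before the move; it does not reproduce the EBUSY error, which needs a process holding dst open on Windows.

```julia
# Illustration only: the directories and file names here are hypothetical.
src = mktempdir()
dst = mktempdir()
write(joinpath(src, "new.txt"), "new")
write(joinpath(dst, "old.txt"), "old")

# With force=true, mv first removes the existing dst, then moves src there.
# That removal is the rm(...) call that shows up in the stack trace above.
mv(src, dst; force=true)

# dst now holds only what used to be in src; old.txt is gone.
@show readdir(dst)
```

If dst cannot be removed (for example because it is in use), the rm step throws before anything is moved.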

lorenzoh commented 2 years ago

Thanks for the detailed report of the issue! Using a temporary folder is not inherently necessary, but I didn't figure out a less clunky solution. The archive unpacks into a subfolder of the datadep directory, and the code copies those files up into the datadep directory itself. For example, instead of having .../datadeps/imagenette/imagenette/train/cls/*.jpg you have .../datadeps/imagenette/train/cls/*.jpg.

If you happen to find a better solution that doesn't error on Windows, I'd be happy to review a PR!
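One possible way to flatten the layout without ever deleting the datadep directory itself is to move each entry of the subfolder up one level and then remove only the emptied subfolder. The sketch below assumes a hypothetical helper, flatten_subfolder, which is not part of FastAI.jl or DataDeps.jl, and it has not been tested on Windows.

```julia
# Hypothetical helper (not from FastAI.jl or DataDeps.jl): move the contents
# of `dir/sub` up into `dir`, then delete the emptied subfolder.
function flatten_subfolder(dir::AbstractString, sub::AbstractString)
    subdir = joinpath(dir, sub)
    for name in readdir(subdir)
        # No force=true: an existing entry in `dir` is never deleted.
        mv(joinpath(subdir, name), joinpath(dir, name))
    end
    rm(subdir)  # removes only the now-empty subfolder, never `dir` itself
    return dir
end

# Usage on a throwaway tree shaped like the example above
root = mktempdir()
mkpath(joinpath(root, "imagenette", "train", "cls"))
write(joinpath(root, "imagenette", "train", "cls", "img.jpg"), "")
flatten_subfolder(root, "imagenette")
```

Because mv is called without force, the helper throws if an entry with the same name already exists in dir (for instance a child named like the subfolder itself), so that case would need separate handling.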

OkonSamuel commented 2 years ago

I'm currently using the following code on my Windows PC and it all works.

function DataDeps.DataDep(d::FastAIDataset)
    return DataDep(
        "fastai-$(d.datadepname)",
        """
        "$(d.name)" from the fastai dataset repository (https://course.fast.ai/datasets)

        $(d.description)

        Download size: $(d.size)
        """,
        "$(ROOT_URL)$(d.subfolder)/$(d.name).$(d.extension)",
        d.checksum,
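        post_fetch_method=function (f)
            # Unpack the downloaded archive into the current (datadep) directory
            DataDeps.unpack(f)
            # First entry of the directory after unpacking; not used further
            extracted = readdir(pwd())[1]
            pwd()
        end,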
    )
end

It creates .../datadeps/imagenette/train/cls/*.jpg as required (see here). I'll open a PR as soon as I'm free.

lorenzoh commented 2 years ago

That looks promising, thanks for taking the time!