Open rofinn opened 2 years ago
Yes, I have some thoughts on this:
Some options:
Do you mean Artifacts.jl in Pkg.jl? FWIW, I do think the flexibility (e.g., fetch_method
, post_fetch_method
) and macro are probably worth keeping in our use case.
Yeah, is Artfiacts part of Pkg right now? I thought it was a seperate stdlib.
I do think the flexibility (e.g., fetch_method, post_fetch_method) and macro are probably worth keeping in our use case.
Agreed. But option 2 or 3 might let us still do that.
I think maybe the dynamic artifact creation in OhMyArtifacts.jl might work to achieve option 3. but I haven't looked closely. (cc @chengchingwen)
I think it's doable but would take some effort to integrate the two. There are some problems and behavior differences need to be resolved. Things like:
OTOH using Scratch.jl might probably be the most effortless and behavior consistent approach (though 1. 2. remains), we can just store each DataDep in a scratchspace folder and bind the lifetime to the caller project with get_scratch!(DataDeps, datadep_name, THAT_PACKAGE)
and somehow allow DataDeps.register
or DataDep constructor to pass in the caller package. Then when that package get GCed, the scratchspace will also be removed.
@oxinabox I just tag a new release for OhMyArtifacts.jl. It should be able to handle directory now. I can try to make a PR to switch the backend to it, do you have a pointer on what files I would need to modified?
We can work this out by using the __location__
from where the datadep macro is called I think then searching for a Project.toml. Though really i guess we want the location where register
was called. For hacks we could use the stacktrace()
and go find it?
No leave them alone. But we could put something in the release notes
It is a flaw in DataDeps that they need to be globally unique, since it means clashes are possible. Unlike with the smarter content hashing that Artifacts use. One thing we do need to preserve is the ability to access datadeps via a global name (though that is allowed to error when there is a clash I guess), since i am sure some people install a package and then access the datadeps it registers by name.
Maybe scratch spaces, as you say are the better way to do this.
It'd be really cool if DataDeps had a
gc
function which could identify old datadeps that aren't referenced/needed anymore and suggest deleting them. I'm not quite sure how it work, but it wouldn't need to be perfect. It could just make a best guess, ask the user, and they can always redownload things they actually need.