JuliaPy / pyjuliapkg

Manage your Julia dependencies from Python
MIT License
44 stars 12 forks source link

Lightweight check for julia dependencies #20

Open amilsted opened 5 months ago

amilsted commented 5 months ago

Currently, at least when we are not in offline mode, we Pkg.add all dependencies on every resolve. This is rather slow and can trigger unrequested package updates to indirect dependencies. It's also unnecessary when the required packages are already installed.

Fortunately, once can use Pkg.dependencies() to check versions without doing any add or resolve. For example:

def check_julia_deps():
    import juliapkg
    from juliacall import Main as jl
    from juliacall import Pkg

    avail_jl = Pkg.dependencies()
    for p in juliapkg.deps.required_packages():
        logger.debug(f"Checking pkg '{p.name}'.")
        try:
            dep_info = avail_jl[jl.Base.UUID(p.uuid)]
        except KeyError:
            logger.debug(f"Pkg '{p.name}' not found in Julia Manifest.")
            return False
        if not dep_info.is_direct_dep:
            logger.debug(f"Pkg '{p.name}' not found in Julia Project.")
            return False
        if p.dev:
            if dep_info.is_tracking_path:
                continue
            else:
                logger.debug(f"Pkg '{p.name}' should be dev'd, but isn't.")
                return False
        # NOTE: Julia Pkg.Versions.semver_spec may not be stable API
        if p.version and dep_info.version not in Pkg.Versions.semver_spec(p.version):
            logger.debug(f"Pkg '{p.name}' at {dep_info.version} not compatible with '{p.version}'.")
            return False
    return True

This does not cover all possible cases, I think, but could be extended. Should something like this be added to pyjuliapkg?

amilsted commented 5 months ago

This could be done in online mode to skip unnecessary actions. It can also be done in offline mode so that we can fail properly if required versions aren't present.

amilsted commented 2 weeks ago

@cjdoris Looking at it again, I guess juliacall might be a more sensible place for this.

cjdoris commented 2 weeks ago

Currently, at least when we are not in offline mode, we Pkg.add all dependencies on every resolve. This is rather slow and can trigger unrequested package updates to indirect dependencies. It's also unnecessary when the required packages are already installed.

Could you clarify the issue this solves? It sounds like you are seeing juliapkg install the dependencies every time you start a new Python process that loads juliapkg? This isn't supposed to happen, juliapkg is only supposed to install dependencies if there was actually a change in the required packages. It does this by comparing the mtime on the meta file with a bunch of other important files, and if it is newer, then nothing has changed.

If you are observing frequent reinstalls, this means something is going wrong with the mtime check, which should be investigated separately.

amilsted commented 2 weeks ago

Huh, yeah - I see what you mean. I will test this again and see where in can_skip_resolve I am having problems (if there are still problems).

amilsted commented 3 days ago

So, there are times when the mtime changes but the contents do not. Do you think it's reasonable to move to hashes for the deps files?

cjdoris commented 1 day ago

Yep using hashes instead of mtime has been proposed before but never implemented. Sounds like a good idea provided it's not a big performance regression. I'd be happy to take a PR for it.