iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

Improved package index for IREE pre-releases #9532

Open julianwa opened 2 years ago

julianwa commented 2 years ago

Request from @powderluv. Goal is to simplify pip install chains that want to depend on the nightlies.

[Edit: original issue title was "Create a separate PyPI for IREE nightly"]

allieculp commented 2 years ago

Jacques & Geoffrey, please help to assign a priority to this item.

jpienaar commented 2 years ago

Assigned priority from what I saw.

I was also reading https://github.com/pypa/warehouse/issues/8792. The pre-release "channel" (as contrasted with nightly) is interesting: it would replace the extra -f path with --pre, but with the side effect of enabling pre-releases for all packages.

GMNGeoffrey commented 2 years ago

@powderluv Can we get more detail on what exactly this is trying to solve? What is the current thing you're trying to do and what is painful about how you have to do it?

powderluv commented 2 years ago

Today an end user installing SHARK does

pip install nodai-shark -f https://github.com/nod-ai/SHARK/releases -f https://github.com/llvm/torch-mlir/releases -f https://github.com/nod-ai/shark-runtime/releases --extra-index-url https://download.pytorch.org/whl/nightly/cpu

and the same user using IREE would do

pip install nodai-shark -f https://github.com/nod-ai/SHARK/releases -f https://github.com/llvm/torch-mlir/releases -f https://github.com/google/iree/releases --extra-index-url https://download.pytorch.org/whl/nightly/cpu

I am working with torch-mlir upstream to drop the torch-mlir -f and the pytorch -f. I want to drop the IREE -f too and then eventually drop the shark -f

So getting a batteries-included install of SHARK+IREE+torch-mlir+PyTorch is complicated. We want to get to pip install nodai-shark, pip install iree-torch, or pip install iree-jax.

powderluv commented 2 years ago

Here is an example of how --pre is used for the nightly without requiring a -f URL:

https://github.com/openai/triton

GMNGeoffrey commented 2 years ago

With https://github.com/iree-org/iree/issues/10479, we're now hosting our own page to even make the current --find-links work. torch-mlir is doing the same and now we're going to have divergence and more complication for users. I think we should consider how to best do this, not necessarily anchoring on PyPI pre.

Options:

  1. Current approach

    pip install --find-links https://iree-org.github.io/iree/pip-release-links.html iree

    Advantages:

    • Already done
    • Simple

    Disadvantages:

    • users have to specify a special flag to target the index
    • relies on pip's find-links functionality, which I believe is non-standard
    • the specific flag users have to specify is non-standard and can be different across projects.
    • Current implementation is kind of just wedged into our docs page, though we could host it elsewhere with the same basic approach.
  2. Create our own PEP 503 compliant package index with just our releases. This could use something like https://github.com/astariul/github-hosted-PyPI to host an IREE index as github pages in a dedicated repository.

    pip install --extra-index-url https://iree-org.github.io/iree/package-index iree

    (exact URL TBD)

    Advantages:

    • standard well-lit path
    • We may just be able to use the above tool out of the box.
    • More easily scales to hosting other packages (e.g. iree-jax)

    Disadvantages:

    • users still have to include a special flag to target the index
    • we have to host a thing involving multiple html pages
  3. Host pre-releases on PyPI under the same package name, compatible with the pip --pre flag to access them. See https://peps.python.org/pep-0440/#handling-of-pre-releases for how pre-releases are indicated.

    pip install --pre iree

    Advantages:

    • Single simple flag for users
    • Hosted in the same place as our real releases

    Disadvantages:

    • There are concerns about project size limits on PyPI. We would likely quickly run into these if we also upload pre-releases, forcing us to delete old releases more frequently than we'd like.
    • Maybe having pre-releases right next to real ones could confuse people?
    • Maybe having pre-releases more visibly published would cause people to install them even if they weren't ready to deal with the instability.
  4. Use a separate package on PyPI, similar to how TensorFlow uses tf-nightly. Users need to change the package name, rather than

    pip install iree-nightly

    Advantages:

    • Changes needed by users don't require guessing URLs.
    • Hosted in the same system as our real releases
    • Any size concerns at least don't affect the real releases
    • Users can decide which packages they want to use pre-release versions for

    Disadvantages:

    • We still have issues with size limits.
    • Writing a different package name is arguably more annoying than passing a different flag.
    • Changes have to be made for each package, not just
    • Maybe having pre-releases more visibly published would cause people to install them even if they weren't ready to deal with the instability.
    • Anecdotally, people don't like this setup in TF.
    • People can already specify which versions they want of a package, so at the point they're making distinctions between different packages, maybe the ability to choose pre-release for only some packages isn't very useful.
  5. Host our own entire index using something like devpi.

    pip install --index-url https://iree-org.github.io/iree/package-index iree

    Advantages:

    • Complete control over the index
    • Index also can be used as a caching layer
    • Uses well-known tool

    Disadvantages:

    • We have to host our whole own serving infra
    • Hosting an externally-accessible server probably means we have to get a security review and they may object to devpi (who knows).
    • Users still have to specify an extra index flag
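
For option 2, the "multiple html pages" are small. Here is a minimal sketch of a PEP 503 "simple" index laid out as static files, assuming a hypothetical output directory (package-index), the placeholder project name iree used above, and a made-up wheel URL. It is not what any existing tool generates, just the shape the spec asks for: a root page with one anchor per project, and a page per project with one anchor per file.

```shell
#!/bin/sh
# Sketch only: directory name, project name, and wheel URL are hypothetical.
set -eu

INDEX_ROOT="./package-index"
PROJECT="iree"  # already in PEP 503 normalized form (lowercase, no runs of -_.)
WHEEL_URL="https://example.invalid/releases/iree-20220101.1-py3-none-any.whl"

mkdir -p "$INDEX_ROOT/$PROJECT"

# Root page: one anchor per project; href points at the project page and ends in "/".
cat > "$INDEX_ROOT/index.html" <<EOF
<!DOCTYPE html>
<html>
  <body>
    <a href="$PROJECT/">$PROJECT</a>
  </body>
</html>
EOF

# Project page: one anchor per file; the anchor text is the file name.
cat > "$INDEX_ROOT/$PROJECT/index.html" <<EOF
<!DOCTYPE html>
<html>
  <body>
    <a href="$WHEEL_URL">$(basename "$WHEEL_URL")</a>
  </body>
</html>
EOF
```

Once a directory like this is published, users would pass its URL to --extra-index-url as in the option 2 command (exact URL still TBD).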

A note about package indexes: there is no ordering between indexes. Regardless of how each index is specified, they all get combined and searched together, assuming that there is a unique package per name and version. So we can't use one index to "override" another. I think that's ok because when people point to the custom index they're just adding additional unstable versions that may have higher version numbers, but it came up as an issue several people ran into when trying to use their own indexes.

Pip can be configured to add extra indexes automatically, but it's per-user or per-environment configuration: https://pip.pypa.io/en/stable/topics/configuration/.
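
A rough sketch of what that configuration could look like, assuming the current find-links page from option 1 and a hypothetical file name (iree-pip.conf). It writes the config to a standalone file and points pip at it through the PIP_CONFIG_FILE environment variable, so nothing in an existing user or environment config is touched.

```shell
#!/bin/sh
# Sketch only: the config file name is hypothetical.
set -eu

CONF_FILE="./iree-pip.conf"

# pip config files are INI-style; keys are the long option names without "--".
cat > "$CONF_FILE" <<'EOF'
[global]
find-links = https://iree-org.github.io/iree/pip-release-links.html
EOF

# Tell pip to load this file in addition to its usual config locations.
export PIP_CONFIG_FILE="$CONF_FILE"
```

For a per-environment setup, the same contents can instead go in $VIRTUAL_ENV/pip.conf inside an activated virtual environment, after which a plain pip install iree would pick up the extra links.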

GMNGeoffrey commented 2 years ago

Raising the priority on this accordingly

GMNGeoffrey commented 2 years ago

Looks like we can set the pip config for a given virtual env, so adding these options could be something done once. Even better, it looks like we can configure things from a requirements file: https://pip.pypa.io/en/stable/reference/requirements-file-format/#supported-options. So we could actually just provide a requirements file that abstracts this away from the user
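
As a sketch of that idea, assuming the current find-links page from option 1, the placeholder package name iree, and a hypothetical file name (requirements-nightly.txt):

```shell
#!/bin/sh
# Sketch only: the requirements file name is hypothetical.
set -eu

# Option lines in a requirements file are read as if passed on the command line.
cat > requirements-nightly.txt <<'EOF'
# Where to look for nightly builds, in addition to PyPI.
--find-links https://iree-org.github.io/iree/pip-release-links.html
iree
EOF
```

Users would then run pip install -r requirements-nightly.txt without needing to know the URL. If we moved to another option, the --find-links line could be swapped for an --extra-index-url or --pre line, since the requirements file format supports those too.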

jpienaar commented 2 years ago

Beat me to it, I was going to ask about a requirements file. It seems like that addresses the user needing to specify the flag (and it seems commonly used). I think both combined work well: the config one for devs, the requirements file for more end users. The pre file should also work with our own index file as long as our naming convention is as expected, if I understood correctly.

GMNGeoffrey commented 1 year ago

Unassigning myself from issues that I'm not actively working on

redthing1 commented 1 year ago

Where is iree-torch?

jpienaar commented 1 year ago

It is/was an integration point for IREE & torch-mlir where one could get a more user friendly checkout experience. We paused/deprioritized it a couple of months ago as we switched to ensure some level of coverage for Jax. Community side, https://github.com/nod-ai/SHARK-Turbine may be the more natural successor for it and it would be refocused to some niche cases.