anaconda / anaconda-client

Anaconda Server Client
https://anaconda.org
BSD 3-Clause "New" or "Revised" License

Possible to get upload time metadata from anaconda-client? #682

Open matthewfeickert opened 1 year ago

matthewfeickert commented 1 year ago

Hi. :wave: In https://github.com/scientific-python/upload-nightly-action/issues/31 we're interested in selectively removing packages (on a channel for nightly wheel uploads) once the time since their upload crosses a threshold and none of our other checks have cleaned them up. At the moment I don't think this is possible with anaconda-client, as anaconda show only gives progressively more specific information at the USER[/PACKAGE[/VERSION[/FILE]]] level:

(base) mambauser@39161fd29268:/tmp$ micromamba list anaconda-client
List of packages in environment: "/opt/conda"

  Name             Version  Build         Channel    
───────────────────────────────────────────────────────
  anaconda-client  1.12.0   pyhd8ed1ab_1  conda-forge
(base) mambauser@39161fd29268:/tmp$ anaconda show --help
usage: anaconda show [-h] spec

Show information about an object

positional arguments:
  spec        Package written as USER[/PACKAGE[/VERSION[/FILE]]]

options:
  -h, --help  show this help message and exit

Show information about an object

Examples:

    anaconda show continuumio
    anaconda show continuumio/python
    anaconda show continuumio/python/2.7.5
    anaconda show sean/meta/1.2.0/meta.tar.gz
(base) mambauser@39161fd29268:/tmp$ anaconda show "scientific-python-nightly-wheels/ipython/8.13.2"
Using Anaconda API: https://api.anaconda.org
version 8.13.2
   + ipython-8.13.2-py3-none-any.whl
None
(base) mambauser@39161fd29268:/tmp$ 

Can one easily get upload datetime metadata for packages on Anaconda Cloud with an existing API? If so, would it be possible to make this metadata accessible through anaconda-client's CLI API? Or is this out of scope?

csoja commented 1 year ago

Yes, this is possible. There is an existing API command that gets information about a file - that includes upload time. Are you able to use the anaconda.org api, or does it have to be in anaconda-client? The Anaconda team is heads down on a few high priority initiatives right now - but we can look at adding this to anaconda-client when the team frees up a bit.

matthewfeickert commented 1 year ago

> There is an existing API command that gets information about a file - that includes upload time.

Great!

> Are you able to use the anaconda.org api, or does it have to be in anaconda-client?

We don't care what library we need to use, so we could use the anaconda.org API. Can you tell us which API this is?

yshmatov-anaconda commented 1 year ago

Hello @matthewfeickert !

You can use anaconda-client as a Python package to access the API, or send HTTP requests directly. I'll provide examples for both.

If you want to use anaconda-client as a Python package, start Python in the same environment where anaconda-client is installed and try the following commands:

import binstar_client.utils

# create API instance
api = binstar_client.utils.get_server_api()

# ==============================================================================

# fetch details about a package
package = api.package('scientific-python-nightly-wheels', 'ipython')

# when the package was created
print(package['created_at'])

# when the package was updated
print(package['modified_at'])

# list all releases of the package (including empty ones)
# details beyond name are not available here, but you can use commands below to gather them
print([rel['version'] for rel in package['releases']])

# ==============================================================================

# fetch details about a release
release = api.release('scientific-python-nightly-wheels', 'ipython', '8.13.2')

# when a file was last uploaded to it (None if the release has no files)
print(max((dist['upload_time'] for dist in release['distributions']), default=None))

# ==============================================================================

# fetch details about a distribution (i.e. file)
distribution = api.distribution('scientific-python-nightly-wheels', 'ipython', '8.13.2', 'ipython-8.13.2-py3-none-any.whl')

# when the file was uploaded
print(distribution['upload_time'])

The same queries as direct HTTP requests with curl:

curl -s https://api.anaconda.org/package/scientific-python-nightly-wheels/ipython | jq -r '.created_at'
curl -s https://api.anaconda.org/package/scientific-python-nightly-wheels/ipython | jq -r '.modified_at'
curl -s https://api.anaconda.org/package/scientific-python-nightly-wheels/ipython | jq -r '.releases[].version'

curl -s https://api.anaconda.org/release/scientific-python-nightly-wheels/ipython/8.13.2 | jq -r '.distributions[].upload_time' | sort | tail -n 1

curl -s https://api.anaconda.org/dist/scientific-python-nightly-wheels/ipython/8.13.2/ipython-8.13.2-py3-none-any.whl | jq -r '.upload_time'

Just change organization/package/release/distribution names to the ones you are interested in. Please let me know if you have any questions.
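For anyone who wants the HTTP route without curl and jq, the same endpoints can be queried from Python's standard library. This is only a minimal sketch: the helper names `api_url` and `fetch_json` are hypothetical and not part of anaconda-client, and the endpoint layout is assumed from the curl commands above rather than from any published API reference.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URL as used in the curl examples in this thread.
API_ROOT = "https://api.anaconda.org"


def api_url(kind, *parts):
    """Build an endpoint URL; kind is 'package', 'release' or 'dist'.

    Hypothetical helper: the path layout mirrors the curl examples.
    """
    # Percent-encode each path segment but keep '+' literal, since version
    # strings may contain it and the curl URLs in this thread pass it as-is.
    path = "/".join(quote(part, safe="+") for part in parts)
    return f"{API_ROOT}/{kind}/{path}"


def fetch_json(url, timeout=30):
    """GET a URL and decode its JSON body (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example usage (commented out because it needs network access):
# url = api_url("dist", "scientific-python-nightly-wheels", "ipython",
#               "8.13.2", "ipython-8.13.2-py3-none-any.whl")
# print(fetch_json(url)["upload_time"])
```

Keeping URL construction separate from the request makes the URL logic easy to check offline.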

matthewfeickert commented 1 year ago

Thanks! For now, just understanding the JSON is enough, as

$ curl -s https://api.anaconda.org/package/scientific-python-nightly-wheels/ipython | jq -r '.releases[].version' > package-releases.txt
$ for release in $(cat package-releases.txt); do curl -s https://api.anaconda.org/release/scientific-python-nightly-wheels/ipython/"${release}" | jq -r '.distributions[].upload_time' | sort | tail -n 1 | awk '{print $1}' ; done
2023-05-26
2023-05-28
2023-08-13

gets us basically all we need. The binstar_client example is useful too, in case we later move away from the anaconda CLI to something fully scripted in Python with binstar_client.
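The shell loop above could look roughly like this in Python. It is a sketch under assumptions: `latest_upload_dates` and its `fetch_release` argument are hypothetical names, the JSON keys (`releases`, `version`, `distributions`, `upload_time`) are the ones used elsewhere in this thread, and the date is taken as the first whitespace-separated token of `upload_time`, as the `awk '{print $1}'` step does.

```python
def latest_upload_dates(package, fetch_release):
    """Map each release version to the date of its newest upload.

    package: decoded JSON from the package endpoint.
    fetch_release: hypothetical callable, version -> decoded release JSON.
    Releases without any distributions are skipped.
    """
    latest = {}
    for rel in package["releases"]:
        version = rel["version"]
        release = fetch_release(version)
        # Tolerate a missing or empty distributions list.
        times = [dist["upload_time"] for dist in release.get("distributions") or []]
        if times:
            # Like `sort | tail -n 1 | awk '{print $1}'`: take the
            # lexicographically largest timestamp and keep its date part.
            latest[version] = max(times).split()[0]
    return latest
```

With binstar_client, `fetch_release` could plausibly be something like `lambda v: api.release('scientific-python-nightly-wheels', 'ipython', v)`, using the `api` object from the earlier example.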

matthewfeickert commented 1 year ago

@csoja @yshmatov-anaconda How stable is the anaconda.org API? It seems that .distributions is empty for some of the package versions that have been uploaded in the past.

$ curl -s https://api.anaconda.org/package/scientific-python-nightly-wheels/numpy | jq -r '.releases[].version'
1.25.0rc1+93.g95343a3e6
1.25.0rc1+104.g19f86c318
1.25.0rc1+140.g2a243e698
1.25.0rc1+218.g0e5a362fd
2.0.0.dev0
$ curl -s https://api.anaconda.org/release/scientific-python-nightly-wheels/numpy/1.25.0rc1+93.g95343a3e6 | jq -r '.distributions[].upload_time'
$ curl -s https://api.anaconda.org/release/scientific-python-nightly-wheels/numpy/1.25.0rc1+93.g95343a3e6 | jq -r '.distributions'
[]
$ curl -s https://api.anaconda.org/release/scientific-python-nightly-wheels/numpy/2.0.0.dev0 | jq -r '.distributions[].upload_time' | sort | tail -n 1 | awk '{print $1}'
2023-08-13

Is this API something we can expect to rely on into the future?

csoja commented 1 year ago

afaik we aren't planning big changes to the API in the future - I believe it will be treated as a bug that needs to be fixed if it stops working. In other words, please report any problems you run into with it.

matthewfeickert commented 1 year ago

Thanks @csoja.

curl -s https://api.anaconda.org/release/scientific-python-nightly-wheels/ipython/8.13.2 | jq -r '.distributions[].upload_time' | sort | tail -n 1

seems to be doing well enough for what this Issue was originally opened for, so I think our use case is resolved.
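For the original goal of flagging uploads older than a threshold, the age check itself might be sketched as below. The function name `stale_files` and the `max_age_days` parameter are hypothetical, the comparison is done at day granularity only, and `upload_time` is assumed to start with a `YYYY-MM-DD` date as in the output shown earlier in this thread.

```python
from datetime import date, timedelta


def stale_files(release, max_age_days, today=None):
    """Return the distribution entries uploaded more than max_age_days ago.

    release: decoded JSON from the release endpoint.
    today: reference date; defaults to the local current date.
    """
    today = today or date.today()
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for dist in release.get("distributions") or []:
        # Assumption: the first whitespace-separated token is an ISO date.
        uploaded = date.fromisoformat(dist["upload_time"].split()[0])
        if uploaded < cutoff:
            stale.append(dist)
    return stale
```

Returning the whole distribution entries, rather than a single field, leaves it to the caller to decide which identifier to pass on to a removal step.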

> The Anaconda team is heads down on a few high priority initiatives right now - but we can look at adding this to anaconda-client when the team frees up a bit.

If you think this is something that the Anaconda team will realistically look at adding to anaconda-client's anaconda CLI, then we can leave this Issue open; otherwise feel free to close it.