Closed Carreau closed 1 year ago
I don't this we need to implement that as part of our action – too keep it simple though as we are publishing everything on the same repo we could have a cron that cull all the old wheels from here.
@Carreau What's the scope of the action? Is it taking a page from https://github.com/pypa/gh-action-pypi-publish (in terms of scope) and only doing the upload? This is (old wheel removal) is part of my https://github.com/scientific-python/upload-nightly-action/issues/5#issuecomment-1564723037.
Removal is pretty straightforward though and you can avoid a cron job that might remove more then you care about by having someone do something similar to what I did for matplotlib
in https://github.com/matplotlib/matplotlib/pull/23349 (now expanded to the following by others):
- name: Remove old uploads to save space
run: |
N_LATEST_UPLOADS=5
# Remove all _but_ the last "${N_LATEST_UPLOADS}" package versions
# N.B.: `anaconda show` places the newest packages at the bottom of
# the output of the 'Versions' section and package versions are
# preceded with a ' + '.
anaconda show scipy-wheels-nightly/matplotlib &> >(grep '+') | \
sed 's/.* + //' | \
head --lines "-${N_LATEST_UPLOADS}" > remove-package-versions.txt
while LANG=C IFS= read -r package_version ; do
echo "Removing scipy-wheels-nightly/matplotlib/${package_version}"
anaconda --token ${{ secrets.ANACONDA_ORG_UPLOAD_TOKEN }} remove \
--force \
"scipy-wheels-nightly/matplotlib/${package_version}"
done <remove-package-versions.txt
Thanks for the comment and sorry for the confusion.
Basically I don't want to cull each time a new nightly is published, I wan to keep the action to just publish new nightly and as simple as possible. Then we can centrally decide how we cull older wheels separately.
We had some of those discussion in person – like should the action both built the wheel and upload ? Should it also remove, and should the number of wheel kept is configurable. But the more things you try to do, and the more option you had, the more disagreement there is.
https://github.com/OpenAstronomy/publish-wheels-anaconda is another action that takes care of removal
I think doing the clean up centrally will be easier than asking each project that uploads wheels to also clean up (and pick sensible parameters).
+1 to centralise this. The action could be run from this repo I suppose.
If we do the cleanup ourselves, we have to consider the case were projects don't upload nightlies but weeklies or have another timeline. i.e. maybe keep at the last n versions.
i.e. maybe keep at the last n versions.
we also need a time cutoff, maybe a month or 2 months? (see the case of scikit-image in the old location where it was built once, but never again, and therefore was extremely outdated, tripping the CI of a few projects when they assumed to pick up the dev wheel)
Agreed, I would give 3 to 6 months to be more generous with "slower" projects. But we could start with a lower bar and revisit if the use case ever comes up.
I would argue that those "slower" projects may not be a good fit for these nightlies, and would not provide many benefits for nightly testers?
You could still want to have a way to propose a development version which is easy to pull vs making a release.
e.g. I am working on SALib and we have a very very low volume of commit which does not require frequent updates. Still, some of our users sometimes ask us to test our things before a release. Sure in this case the package is easily built enough, but you might have a more complex project or just add the mechanism as a way to show that your project is serious.
But as I said, we can also wait for folks to ask that and start as you proposed with something a bit more aggressive.
Fair enough, thanks for clarifying with an example.
Of course 😃 But now sleep, I should not play with the jet lag haha
I like the idea of keeping the last N wheels.
Situations like scikit-image sound like something we should notice and somehow address "human to human". For example, do they upload so rarely because the project rarely changes, because something broke, etc.
Given https://github.com/scientific-python/upload-nightly-action/pull/12#issuecomment-1570414245 is there agreement on 1-action-per-repository? If so, then if I've understood @tupui correctly this means we need a new remove-nightly-wheels
repo (or something of that sort).
Why not having a single action which go through all our packages and apply a fixed policy? I think that's what we were saying here no?
The only concern I have is that we need to be careful with this action and its token as it would be an admin token in the wild. So it's going to be a trade-off between security and conveniency for all packages.
Here I think it would be an acceptable risk to use an admin token for that purpose as it's to apply a global policy.
Should we do a small call to discuss these?
Why not having a single action which go through all our packages and apply a fixed policy? I think that's what we were saying here no?
Ah so "centralise" as in a "repo as a nighlty CI cron service for the whole org"? I have a few of those for dashboarding services for orgs.
The only benefit I see to having this be a service, compared to an action that projects run in their workflows, is that project maintainers no longer have to be checking that they are keeping things within their quota. Though perhaps the main advantage is that other projects in the org don't also have to ask for other maintainers to clean up space if things are running tight. I guess this is exactly what @betatim meant in
I think doing the clean up centrally will be easier than asking each project that uploads wheels to also clean up (and pick sensible parameters).
:+1:
Should we do a small call to discuss these?
If it is necessary, sure, though these security details seem reasonable given my understanding now and you can further restrict things to run in a specific GitHub Actions environment.
Yeah the idea is that projects should just care about uploading their package. The rest is admin/maintenance and we can do it for them ideally. So we are all on the same page (and it's fair as in the same rules for all) all the time.
Okay, then if people are in agreement, and someone else who is better at naming things than me chooses a repo name, I can make a PR to that new repo tonight US time that adds this as a GitHub Actions workflow. Should be easy to iterate on the PR to a version people are happy with quickly.
Personally I would be ok if this would run on the current action repo. It's related in scope so I would not find this out of place.
+1 on keeping it here
Personally I would be ok if this would run on the current action repo. It's related in scope so I would not find this out of place.
+1 on keeping it here
Done in a redone PR #12. c.f. https://github.com/scientific-python/upload-nightly-action/pull/12#pullrequestreview-1453816153 for details.
Another action that allows to upload to anaconda a is https://github.com/OpenAstronomy/publish-wheels-anaconda, in particular it allows to remove older wheels.
I don't this we need to implement that as part of our action – too keep it simple though as we are publishing everything on the same repo we could have a cron that cull all the old wheels from here.