scientific-python / specs

Scientific Python Ecosystem Coordination (SPEC) documents
https://scientific-python.org/specs/
BSD 3-Clause "New" or "Revised" License

"SPEC 4: uploading periodic builds" needs some tweaks. #223

Closed by mattip 1 year ago

mattip commented 1 year ago

Over at https://github.com/scientific-python/upload-nightly-action/issues/14 there were some reservations about how the SPEC is being implemented in practice and some additional questions:

tupui commented 1 year ago

Thanks @mattip. This is still in the works; here are some clarifications.

Admins

Admins are people from the community who are ideally not part of the same organizations or projects. This is to prevent malicious activities from a given group of actors and ensure a diverse and healthy community. Adding new people to the list of admins requires at least an issue to be opened.

This part is really about the admins of the Scientific Python org itself. We are working towards cleaning up the list, and once that's done there should be 3-4 people at most. The number is just to account for bus factor and ensure there is always someone who can help out in case something happens. And yes, this list of admins will be diverse.

These admins can create tokens which can do anything on all projects. This is what we had with the previous org. This is not what we want, which is why we are restricting the number of folks in this group. There is going to be a very limited number of tokens with this scope: for now, mainly one that will be used to clean up the wheels (we only keep the last 5 revisions for now). The other use case is adding a new package: we need an admin token to do the initial upload, and this token is revoked immediately after the operation.
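As an illustration of the "keep the last 5 revisions" cleanup mentioned above, here is a minimal sketch of the selection logic only. It is not the actual cleanup job: the `Upload` record, its fields, and `select_stale_uploads` are hypothetical names, and it assumes "revision" means one uploaded file with a sortable upload time. It does not call any Anaconda API.

```python
from collections import defaultdict
from typing import NamedTuple


class Upload(NamedTuple):
    """Hypothetical record describing one uploaded nightly file."""
    package: str      # e.g. "numpy"
    filename: str     # illustrative wheel file name
    uploaded_at: int  # sortable upload time, e.g. a Unix timestamp


def select_stale_uploads(uploads, keep=5):
    """Return the uploads that fall outside the newest `keep` per package."""
    by_package = defaultdict(list)
    for upload in uploads:
        by_package[upload.package].append(upload)

    stale = []
    for items in by_package.values():
        # Sort newest first, then everything past the first `keep` is stale.
        newest_first = sorted(items, key=lambda u: u.uploaded_at, reverse=True)
        stale.extend(newest_first[keep:])
    return stale
```

A cleanup job would then delete whatever this function returns, using the narrowly scoped admin token described above.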

Next, per package you have a group, like NumPy, SciPy, etc. Each group has admins who only have access to this group (and ideally you would also select a diverse set of folks). Anaconda does not allow creating a "group" token, so individuals from a group have to create a personal token. With this token, you can update any package which is linked to the group.

In summary, with the tools we have from Anaconda, we will have a more secure setup than we currently have as tokens will only have access to a single package.

Transparency

The process was not followed because it did not exist when we had already started to migrate projects over to the new org. And the process is still being worked out. There is no malice here 😃 A lot of things happened very quickly during the summit because a lot of maintainers with actual decision power were present. Although in the end, it's not really the responsibility of Scientific Python to know whether the maintainers of a specific project agree with some of their maintainers doing things here.

As for knowing who is an admin, Anaconda does not provide an easy view of who is admin of what. I am not sure it is our responsibility to provide this information either. From a security standpoint, it could be argued that telling who is admin of what is a security concern, as these folks would be targeted. Remember, Anaconda does not have 2FA... I don't really want to be in the news for a supply chain attack 😅

For the admin of the org itself, I would personally advocate for sharing an email address (I think we already have one for that even, @jarrodmillman?) and also ask people to use the issue tracker.

Concerning each project, I know we cannot really "enforce" anything, but I would also advise some discretion.

What needs to happen in public is the discussion about whether to include a project, with the maintainers of this project showing that this was approved on their side. Then the admin team of the org needs to know from the project who should be added to their group on Anaconda. I personally would like this to be communicated privately.

mattip commented 1 year ago

The process was not followed because it did not exist when we had already started to migrate projects over to the new org. And the process is still being worked out. There is no malice here

I do appreciate all that is being done to improve the ecosystem and having the summit was obviously a great way to move things forward. I apologize if my tone is too harsh. It is fine if the team wants to keep the names of the admins private, and the new situation is better than the old in that there is a clear process to add projects to the site.

This part is really about the admins of the Scientific Python org itself.

Ahh, cool. I missed that. I see a very healthy list of community leaders on https://scientific-python.org/about/, perhaps the SPEC should defer to that group to choose the admins.

tupui commented 1 year ago

No worries, I know you mean well 😃

Ahh, cool. I missed that. I see a very healthy list of community leaders on scientific-python.org/about, perhaps the SPEC should defer to that group to choose the admins.

That's a good idea 👍 We wanted to plan a meeting with the SPEC committee; it would be good to put this on the agenda.

@stefanv @jarrodmillman @bsipocz did we plan something in the end? I know we discussed the possibility of a monthly meeting or something.

jarrodmillman commented 1 year ago

We will have one meeting before the SciPy conference. We will send out a whenisgood type poll soon.

jarrodmillman commented 1 year ago

@mattip Thanks for keeping us honest. Just to be clear we don't intend to keep things private. We are just getting things set up. Sorry we haven't done a better job, but we will improve. If you could give us a little patience (after all we are all friends and have hopefully gained a little trust from one another), it would be great. Feel free to suggest language or make PRs. If you want to contribute you are more than welcome to join the SPEC committee.

We could have a top-down approach like you suggest where the community leaders decide who has access to the admin list, but I would prefer just letting people step up and volunteer. We are more than happy to make it public and will. Maybe I am unaware of where the old system was documented, but I think we are already more transparent than what was previously in place.

Our goal is to improve things and to continuously get better and more transparent.

jarrodmillman commented 1 year ago

We also didn't follow the process for adding new projects. We weren't just in the room, we were trying to put a better system in place for the projects using the old system. It seems like a bad idea to not include the existing projects. But we also wanted to have a clear process if new projects wanted to be added.

Do you know what the original process was? If a better system was previously in place, we should continue following it. But my understanding was there was no process. My thought was putting a process in place and trying to improve it was a good start.

stefanv commented 1 year ago

My sense was that Matti was just asking for some clarity on wording, and to see how the process would work in practice. @tupui gave a great response, and we must make sure the gist of that gets captured in the SPEC.

Logistics: @tupui if we need an admin email address, I can create one at scientific-python.org that forwards to any other email addresses.

tupui commented 1 year ago

If there is no rush, I would say we wait for the meeting so we have a chance to not redo things 😃

mattip commented 1 year ago

This is still in the works

Hmm, according to the SPEC process, now that two projects have endorsed the SPEC, modifications are more difficult to make.

From a security standpoint, it could be argued that telling who is admin of what is a security concern as these folks would be targeted.

The current workflow makes it trivial to see who the upload token belongs to: go to the files tab, and you can see which anaconda.org user did the upload. The old site did not expose a user name, nor does conda-forge.

tupui commented 1 year ago

Hmm, according to the SPEC process, now that two projects have endorsed the SPEC, modifications are more difficult to make.

Endorsing a SPEC only means that you agree with the general principle. It does not force a project to do anything. This is also what the meta SPEC is trying to convey.

These documents are guidelines. For example, you could very well say you endorse the SPEC about deprecation although you already have a NEP which has a different timeline. This is perfectly fine.

A real-life parallel is with the EU: directives vs regulations. SPECs are more like directives. It's up to the project to see how they implement an idea.

All that to say that we can still change things without it becoming a huge problem. Also, a short meeting with folks can resolve misunderstandings quite effectively, I think.

The current workflow makes it trivial to see who the upload token belongs to: go to the files tab, and you can see which anaconda.org user did the upload. The old site did not expose a user name, nor does conda-forge.

Yes, but it's one thing to have that there and another to advertise more widely who has all the keys to our digital kingdom. And yes, I totally disagree with Anaconda doing that; it goes against best practices. But they don't even do 2FA so...

I mean, if a bad actor wanted to do something (and there are many good reasons to attack us, since we are deployed on so many production systems), the first thing they would do is look at who is critical and has all the keys. That way they can mount a targeted attack. This is not a crazy scenario; it happens every day to high-profile folks (e.g. look at how insane the attacks on LastPass were). We are really just lucky at the moment, but it won't last. We already saw an attack on PyTorch last year and some name squatting targeting specific orgs, and more and more folks are talking about attacks in the media. This is drawing the attention of bad actors, and they are going to realise how critical we are.

Long story short, we have an edge here and should prepare before it's too late and we make headlines.

mattip commented 1 year ago

All that to say that we can still change things without making a huge problem.

OK. I guess I misunderstood this bit in the SPEC process: "A SPEC is recommended for wide-spread adoption once it is endorsed by two (or more) Core Projects. Once a SPEC is recommended, further changes require the approval of all endorsing Core Projects."

stefanv commented 1 year ago

Once a SPEC is recommended, further changes require the approval of all endorsing Core Projects.

In the strictest sense, but I think we also have to use sound judgement over when a change clarifies intent vs when it is contentious or fundamentally alters the meaning of the SPEC.

mattip commented 1 year ago

Maybe "Once a SPEC is recommended, non-cosmetic changes require ..."

stefanv commented 1 year ago

https://github.com/scientific-python/specs/pull/226

tupui commented 1 year ago

With the latest edits to SPEC 4, does that close this issue, or is there something that we would like to discuss further?

mattip commented 1 year ago

Thanks for the edits.