cf-convention / vocabularies

Issues and source files for CF controlled vocabularies

Automating the synchronisation of "accepted" status between github and the standard name editor #180

Closed by martinjuckes 3 months ago

martinjuckes commented 1 year ago

@japamment : It has been suggested that the workflow of standard name approvals could be improved by automating the switch to accepted status when a github standard name issue is accepted. We think someone at CEDA can do this. I'm creating this issue to track the task.

feggleton commented 1 year ago

That would be brilliant!

japamment commented 1 year ago

Hi @martinjuckes @feggleton

(also tagging @JonathanGregory). Thanks for the suggestion but I'm afraid I disagree with this idea.

For simple issues with one or two names this could work, but the problem comes when we have standard name issues containing requests for many names.

We encourage proposers to group related names into a single issue rather than opening a separate issue for each one (this allows for a reasonably efficient discussion process and avoids the unnecessary proliferation of open issues). However, when there are many names in one issue they will not all necessarily be agreed at the same time.

Our usual workflow is that, as discussion progresses and some of the names are agreed, we go ahead and mark those individual names as 'accepted' in the standard names editor and they are then automatically included in the next update to the standard name table. Discussion of the more complex/contentious names is able to continue in the GitHub issue until consensus is achieved and then they too can be accepted and published. This avoids delaying the publication of all the names in an issue until every single one has been agreed and accepted.

The proposed idea would not really save any time and might potentially cause names to be published erroneously for the following reasons.

  1. If we wait until all names in an issue are agreed before marking it as accepted in github (and then automatically accept them in the editor) this will delay publication of names that are agreed early in the discussion;
  2. If we mark an issue as accepted in github (and then automatically accept all the associated names in the editor) before all names are fully agreed, we may accidentally publish some of the names too soon. This would affect not only the standard name table on the CF website (which would then take extra work to fix) but also the NERC Vocabulary Server from which there is no way to delete incorrect entries (we can deprecate them, but we can't completely hide them).
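The two failure modes listed above can be illustrated with a minimal Python sketch. The `Proposal` class, the function names, and the example standard names are all hypothetical and introduced only for illustration; nothing here reflects how the CEDA editor or the existing GitHub actions are actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    """One GitHub issue grouping several proposed names (hypothetical model)."""
    issue: int
    # Maps each proposed name to whether consensus has been reached on it.
    agreed: dict = field(default_factory=dict)


def publishable_per_name(p):
    """Current workflow: each agreed name is accepted individually."""
    return sorted(name for name, ok in p.agreed.items() if ok)


def publishable_per_issue(p, issue_accepted):
    """Proposed automation: accepting the issue accepts every name in it."""
    return sorted(p.agreed) if issue_accepted else []


def published_too_soon(p, issue_accepted):
    """Names the automation would publish before they were agreed."""
    return sorted(
        name for name in publishable_per_issue(p, issue_accepted)
        if not p.agreed[name]
    )


# Hypothetical issue with three names, one still under discussion.
p = Proposal(issue=0, agreed={"name_a": True, "name_b": True, "name_c": False})

print(publishable_per_name(p))          # names ready under the current workflow
print(publishable_per_issue(p, False))  # case 1: agreed names are delayed
print(published_too_soon(p, True))      # case 2: a name is published early
```

Under these assumptions, holding the issue open delays the two agreed names, while accepting it early publishes the name that is still being discussed.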

In summary, I don't think there is any great advantage to be gained by automating what is effectively one button push in the editor, and unpicking any mistakes resulting from too much automation is likely to cause far more work than it saves.

Apologies if all that sounds very negative, but I really don't think we should do this.

Alison

JonathanGregory commented 1 year ago

Dear Alison @japamment

Thanks for your considered comments. There's a related question, which is part of the motivation for this suggestion (I believe), namely how can we enable more members of the CF community on GitHub to bring standard name proposals to a conclusion. With conventions proposals, many people can do this, right up to the final stage of modifying the standard document after agreement has been reached. However, the involvement of the standard name editor at CEDA in the adoption of standard names means that it depends on CEDA staff (especially you) or others who have been given access (such as Fran), if my understanding is correct. Only those people can push the button which you refer to. I wonder what your thoughts are about this.

Best wishes

Jonathan

japamment commented 1 year ago

Dear Jonathan @jonathangregory (tagging @feggleton)

You are correct that tracking and publishing standard names does require access to the CEDA editor which, as you know, was developed specifically to support the standard names workflow and in particular the dual publication process on the CF website and the NERC Vocabulary Server (NVS2).

Currently, Fran and I are the only two people with admin access to the editor (it is readable by everyone) and initially we were both doing this from within CEDA. It was part of the editor's original design that people working in other organisations could be given admin access, subject to appropriate security credentials being created, and Fran's move to the Met Office has allowed us to test the use of that feature for the first time. It appears to be working well as regards keeping the editor content up to date and the logical next step would be to test whether Fran can now publish a full update to the standard name table (this requires push access to the CF github.io repo, a user account on NVS2 and additional NVS permissions to edit the CF vocabularies on that platform). Assuming that Fran can publish an update, this will immediately double the number of people who can perform that task :-) . In the longer term this opens a route to allowing others, working in organisations besides CEDA and the Met Office, to be given admin access to edit and publish standard name table updates.

The editor removes many technological bottlenecks to processing the standard names, and the use of additional labels and github actions (developed by Fran) is now helping us to keep the two better synchronized. Over time we may further refine these processes, but what we need now is additional human resource to help keep the discussion issues on track. To address this I propose to take the following actions:

  1. Take the existing documentation of the vocab editor (which is currently an internal CEDA document) and turn it into something that can be shared more widely. The document as it stands covers both the process of using the editor and some of the process for moderating standard names proposals. It needs updating and would be better split into two separate documents. They probably shouldn't be turned into pages on the website, but I think it would be okay to store them somewhere in the CF repos so they are accessible by the CF Information Management and Support team.
  2. I will write a "job description" and "candidate profile" for someone to help look after the standard names (alongside Fran and myself). We can then ask among the community for an additional volunteer to help moderate the discussions and keep the editor up to date. Ideally this would be someone who could give approximately one day per week to the task (with the agreement of their institution of course) and might be a good career step for someone who needs/wants to be involved with the CF community.

Essentially the idea is to build a small "CF standard names team" who can work together to keep the processes moving. I'm happy to discuss this further with the conventions/standard names committee and I hope this will offer a way to improve the robustness of the standard names process and put it on a firm footing for the future.

Best wishes

Alison

JonathanGregory commented 1 year ago

Dear Alison @japamment

Thanks for your useful initiatives to make documentation available for the vocab editor, the process of moderating standard names, and the person specification for standard name moderators. I agree with you that we need more people who can participate, and these documents are essential to that end.

While it would be magnificent if someone volunteered to spend a day per week on it, I think that others with less time to commit should also be able to contribute effort in this way, especially members of the CF committee or others who frequently participate in standard name discussions. Would it be OK to allow several people write-access to the vocab editor?

You, Fran @feggleton and I have previously exchanged emails about the rules and timescales for standard name proposals. Fran's automation is a valuable recent development that contributes to the process. Would it also be possible to produce a document for standard names, like the rules for conventions changes, that we could put on the website?

Best wishes and thanks

Jonathan