Open charliepascoe opened 3 years ago
Hi Charlotte, Martin
I am working on an updated CCMI data access policy that users will see in a separate window when they click on "preview conditions" as they proceed to apply for access. This is one piece of the puzzle. I should have that in a day or two. I assume that this should be provided as a PDF?
The text of the data policy has been agreed with the CCMI SSC. There is a copy available at http://blogs.reading.ac.uk/ccmi/files/2021/08/CCMI-2022_data_policy.pdf. Can this PDF be made to appear as the conditions users implicitly agree to when applying for access to the CCMI-2022 archive?
@gap736uk @martinjuckes Hello Graham, Martin. My apologies for pestering, but I wanted to refresh this issue in your inbox. I have just tried to apply for access through the catalogue entry for the NIWA-UKCA data and it is still returning a 500 error. The data policy for access has also been completed and should be presented to people as part of the procedure to apply for access. If either of you have some ideas about how to fix the error and could link the data policy that would be greatly appreciated on my end. Thank you both.
I think this is best picked up when Charlotte returns to work on Monday, as I'm afraid I don't feel able to comment without knowing the full background of the data in the archive. The following comments might be helpful, but if discussions have already covered these points please feel free to ignore them! :)
The issue with regards to the archive is related to the data licence and how that is used within the application system and what access control is in place.
In the CEDA archive there are three levels of access control: 'public', 'registered user' and 'restricted'.
This would be set in line with the data policy and the licence for the data.
Typically data under open licences go out under either the 'public' or 'registered user' access control mechanisms... this is the route used for data under the Creative Commons licences (or other equivalents such as the Open Government Licence for UK funded data)
However, when data are under the 'restricted' access group, we have an access control system in place through which users have to apply for access. This only works with a local PDF of the licence in question, and needs such a licence in order to work. I believe this is the cause of the 500 server error being encountered here.
From what I've read of the Data Policy file, it sounds like the default is for the data to be available under the CC-BY licence eventually, but during Phase 1 there is this extra clause:
The phase 1 policy allows for the free and open use of model output but includes the obligation to offer co-authorship to model PIs on any manuscripts that are submitted for publication during this time. In return, it is expected that model PIs will constructively participate in the analysis and interpretation of the results
Thus, I can see why the data are presently under the access control in the archive, but there's no licence file associated with this in the system at the moment which is causing the issues.
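To make the diagnosis above concrete, here is a minimal sketch of the logic as I understand it. It is not the actual CEDA application code: the class and function names are hypothetical, and the exception simply stands in for whatever produces the 500 response on the real server.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AccessLevel(Enum):
    """The three access levels named in this thread."""
    PUBLIC = "public"
    REGISTERED_USER = "registered user"
    RESTRICTED = "restricted"


class MissingLicenceError(Exception):
    """Hypothetical stand-in for the failure that surfaces as a 500 error."""


@dataclass
class Dataset:
    name: str
    access: AccessLevel
    licence_pdf: Optional[str] = None  # local licence PDF, if one is configured


def start_application(dataset: Dataset) -> Optional[str]:
    """Return the licence PDF a user must agree to, or None if no application is needed."""
    if dataset.access is not AccessLevel.RESTRICTED:
        # Public and registered-user data do not go through an application.
        return None
    if dataset.licence_pdf is None:
        # Restricted data with no licence file: the application cannot proceed.
        raise MissingLicenceError(f"no licence file configured for {dataset.name}")
    return dataset.licence_pdf
```

Under these assumptions, a restricted dataset with no licence file fails on every application attempt, which would match the behaviour reported here until a licence PDF is attached.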
Is there a licence that has already been agreed (these are best as a specific document, not the Data Policy)? Or would something like our generic Restricted Use General Licence (RUGL) work? This was derived from the OGL but removes, in particular, the 'onward sharing' of data that is permitted in open licences such as CC-BY and OGL. This licence was also put together to be used for data that pass through an embargoed period:
https://artefacts.ceda.ac.uk/licences/rugl_versions/rugl_v1-0.pdf
You'll note that this licence specifically has the following clauses:
You must (where you do any of the above):
● SEEK PERMISSION FROM THE INFORMATION PROVIDERS for the information or derived products to be used within any publication, presentation or service prior to the investigators' own publication of that work.
● OFFER JOINT AUTHORSHIP to the information providers for any related publication, presentation or service utilising the information.
● ACKNOWLEDGE THE SOURCE of the Information in your product or application by including or linking to any attribution statement specified by the Information Provider(s) and, where possible, provide a link to this licence.
With Charlotte being back on Monday, though, if we're able to wait until then we can get her input and guidance on this. Whilst I know the mechanisms and workflows to get things working, there's an upstream decision needed with regards to access control, where it applies in the archive and the licensing that needs to be used.
Thank you very much, Graham. From what you say, it sounds like the cause of the problem and what needs to be in place to get things to work are pretty well known. I have some stuff to think about and I am happy to wait until Charlotte gets back next week. Thanks again for your help.
I think that the generic Restricted Use General Licence is a good match to the ccmi data policy. I can add a documentation link to the ccmi data policy in the catalogue records and make an explicit reference to the ccmi-2022 citation statement in the abstracts for each of the dataset records.
The protocol to access ccmi-2022 should all be set up now, but we should probably test that it really is working before sending everyone instructions for gaining access.
To request access people first need to register as users at ceda. To begin the ccmi-2022 application process they can either hit "apply for access" on one of the published ccmi dataset records (e.g. https://catalogue.ceda.ac.uk/uuid/9d93bed3b24648fcade5e427903c7da7) or visit https://services.ceda.ac.uk/cedasite/resreg/application?attributeid=ccmi-2022 where they will be asked to provide details of their intended use of the data. The application form requires people to provide sufficient information for their application to be assessed. Once this is submitted they will see a copy of our general restricted use license. On this second page (the one with the license) they will need to scroll down to the bottom and hit agree to activate the request.
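For anyone testing the procedure before instructions go out, here is a small sketch of how the direct application link can be built and how a response status might be classified. The endpoint and the `attributeid` parameter are taken from the URL above; the function names are hypothetical, and no request is actually sent here.

```python
from urllib.parse import urlencode

# Endpoint as given in the comment above.
APPLICATION_ENDPOINT = "https://services.ceda.ac.uk/cedasite/resreg/application"


def application_url(attribute_id: str) -> str:
    """Build the direct application link for a given access attribute."""
    return f"{APPLICATION_ENDPOINT}?{urlencode({'attributeid': attribute_id})}"


def is_server_error(status_code: int) -> bool:
    """True for HTTP 5xx responses, such as the 500 reported in this thread."""
    return 500 <= status_code <= 599


print(application_url("ccmi-2022"))
```

A tester could open the printed link while logged in and confirm that the form loads and that the licence page appears after submission, rather than a 5xx response.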
When the ccmi-2022 archive moves from phase 1 (embargo) to become open access, CEDA will update the license to cc-by.
I agree the Restricted Use General Licence is a good fit for the 'embargo' period of the CCMI-2022 archive. Is it possible to also direct users to the CCMI-2022 data policy from the Dataset and Services Registration page that presents the RUGL? If it is not a simple thing to do, I will just direct people to the CCMI data policy when I send out an e-mail announcing the opening of the archive. Note that the default licence written into the files is cc by-sa. I am not sure what particular licence this would correspond to once the phase 1 period expires?
Our present dataset application system can be adapted slightly to inject a leading page ahead of the main application workflow to provide a little bit of information. However, it is worth noting that this is done rarely and we're also in the process of preparing a replacement system. At this point I'm not sure if there is scope for such bespoke pages to appear in the new application workflow under preparation, so I can't say how long such a page might be part of the application pipeline (this is something I need to check anyway).
Regarding the cc-by-sa licence that is included in the data files: this is the Creative Commons Attribution-ShareAlike licence. Version 2 of that licence can be seen here: https://creativecommons.org/licenses/by-sa/2.0/
This basically states:
i.e. it's very similar to the CC-By licence but does add that extra requirement on all resulting content/data/products to also be under cc-by-sa, which the CC-by licence doesn't mandate.
We can set things up to roll over to using that licence once the Phase 1 period expires.
I have provisionally published the ref-D1 data provided by the NIWA team under Olaf Morgenstern. https://catalogue.ceda.ac.uk/uuid/9d93bed3b24648fcade5e427903c7da7 I did this to activate the apply for access procedure for ccmi-2022.
If a person applies for access to the NIWA REF-D1 data they should be added to the ccmi-2022 unix group and therefore have access to the complete ccmi-2022 archive.
However, when I tried applying myself I got a 500 error. @gap736uk do you know why this may be? @martinjuckes can you help?
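Once applications are working, one way to confirm the outcome on a Unix system that has the group defined is to check group membership directly. This is only a sketch: the helper name is hypothetical, it uses Python's standard `grp` module (Unix only), and it assumes the group is visible in the local group database.

```python
import grp


def user_in_group(username: str, groupname: str) -> bool:
    """Return True if username is listed as a member of groupname.

    Only explicitly listed members are checked; users who have the group
    as their primary group are not reported by gr_mem.
    """
    try:
        group = grp.getgrnam(groupname)
    except KeyError:
        # The group does not exist on this system.
        return False
    return username in group.gr_mem
```

For example, `user_in_group("some_user", "ccmi-2022")` would be expected to return True after an approved application, where `some_user` is a placeholder account name.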
It would be great if this could be resolved soon, it feels like this is the last piece in the jigsaw to getting ccmi-2022 up and running. Unfortunately I'm going to be away for the next two weeks.