metadatacenter / cedar-submission-server

CEDAR server to handle submissions to metadata repositories

Update release date handling for AIRR template #32

Closed martinjoconnor closed 5 years ago

martinjoconnor commented 5 years ago

Problem described in following email to Yuriy at NCBI (with suggested solution confirmed as appropriate by him):

Our understanding is that the public release of, say, an SRA submission entry will force the release of the BioSample submission entries it references, which will in turn force the public release of the referenced BioProject. So, even if, for example, a BioProject submission has a future release date, the public release of a BioSample that references it will effectively make the BioProject public irrespective of its release date.

Is this understanding correct?

The reason this is an issue for us is that we are generating a BioProject/BioSample/SRA submission from the AIRR specification and it includes only a BioSample release date - and has no overall submission release date or SRA entry release date. Since the SRA parts of the submission have no release date we are assuming that they are released immediately - which forces release of the referenced BioSamples and in turn the BioProject, effectively making the entire submission public immediately.

We are assuming that we should include release dates for each BioProject, BioSample, and SRA entry to fully control the public release dates?
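
A minimal sketch of the cascade described in this email, to make the problem concrete. It assumes two things taken from our understanding above and not from NCBI documentation: an entry with no release date is released immediately, and an entry becomes public as soon as anything that references it does. The function name and the dictionary layout are hypothetical.

```python
from datetime import date
from typing import Dict, Optional

# Assumed reference chain: SRA references BioSample, BioSample references BioProject.
# Each key is forced public by the entry type named in its value.
REFERENCED_BY = {"SRA": None, "BioSample": "SRA", "BioProject": "BioSample"}


def effective_release_dates(
    declared: Dict[str, Optional[date]], today: date
) -> Dict[str, date]:
    """Return the date each entry type actually becomes public."""
    effective: Dict[str, date] = {}
    # Walk from the referencing end of the chain to the referenced end.
    for kind in ("SRA", "BioSample", "BioProject"):
        own = declared.get(kind)
        if own is None:
            own = today  # assumption: no release date means immediate release
        referrer = REFERENCED_BY[kind]
        if referrer is not None:
            # Release of the referencing entry forces release of this one.
            own = min(own, effective[referrer])
        effective[kind] = own
    return effective


# The AIRR case: only the BioSample carries a release date.
airr_case = {"BioProject": None, "BioSample": date(2020, 7, 7), "SRA": None}
result = effective_release_dates(airr_case, today=date(2019, 1, 15))
```

Under these assumptions every entry in `result` resolves to `today`, which is the "entire submission public immediately" outcome described above.
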
martinjoconnor commented 5 years ago

See: https://github.com/airr-community/airr-standards/issues/143

graybeal commented 5 years ago

Based on initial reaction, I suspect that using a single release date for everything makes the most sense.

marcosmro commented 5 years ago

I have updated the submission server to work with the latest MiAIRR template and to use the top-level release date for everything.
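
Roughly, using the top-level release date for everything could look like the sketch below. This is an illustration only, not the submission server's code: the dictionary layout and the key names (`release_date`, `bioproject`, `biosamples`, `sra`) are hypothetical stand-ins for the template instance.

```python
from copy import deepcopy


def apply_top_level_release_date(submission: dict) -> dict:
    """Copy the top-level release date onto every entry in the submission."""
    result = deepcopy(submission)  # leave the caller's instance untouched
    release_date = result["release_date"]
    result["bioproject"]["release_date"] = release_date
    for entry in result["biosamples"] + result["sra"]:
        # Overwrite any per-entry date so nothing is released earlier or later.
        entry["release_date"] = release_date
    return result


instance = {
    "release_date": "2020-07-07",
    "bioproject": {"title": "example project"},
    "biosamples": [{"name": "s1", "release_date": "2019-01-01"}, {"name": "s2"}],
    "sra": [{"name": "run1"}],
}
normalized = apply_top_level_release_date(instance)
```

With a single date on every entry, the forced-release cascade described earlier in the thread can no longer make anything public ahead of schedule.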

I have started a test submission with a sample instance to check that everything works fine. However, I found a minor issue in the template that should be fixed before users start populating it: The top-level release date is a text field. I think that it should be a date field to avoid wrong date formats that will break the submission. If you agree, I will update it.
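
To illustrate why a free-text field is risky, here is a small sketch of the check that would otherwise be needed on the server side. It assumes the release date is expected in `YYYY-MM-DD` form, which is an assumption for this example; the function name is hypothetical.

```python
from datetime import date, datetime


def parse_release_date(text: str) -> date:
    """Parse a release date typed into a text field, rejecting other formats."""
    try:
        return datetime.strptime(text.strip(), "%Y-%m-%d").date()
    except ValueError as exc:
        raise ValueError(
            f"release date must look like YYYY-MM-DD, got {text!r}"
        ) from exc


parsed = parse_release_date("2020-07-07")
```

A value such as `07/07/2020` raises `ValueError` here. A date field in the template would stop such values from being entered in the first place.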

Apart from that, I found some model issues, probably because the elements were created some time ago following an old version of the model:

I can fix these issues too and create a new copy of the MiAIRR template. I will probably fix them first in the elements at the MongoDB level and then I will regenerate the template using the Template Editor. If there are any existing instances for the template, they will have to be updated.

graybeal commented 5 years ago

Summarizing the email I sent you:

I just looked at a few of the linked BioSamples and they are publicly accessible. Is that what we expect? The example submission looks very un-demo-like, actually. The beginning reads:

This is an automatic acknowledgment that your recent submission to the BioSample database has been successfully processed and will be released on the date specified.

BioSample accessions:           SAMN02181721, SAMN02181722, SAMN02181723, SAMN02181724, SAMN02181725, SAMN02181726, SAMN02181727, SAMN02181728, SAMN02181729, SAMN02181730, ... see attached file.
Temporary SubmissionID: SUB424311
Release date:           2020-07-07-07:00, or with the release of linked data, whichever is first

An example link is https://www.ncbi.nlm.nih.gov/biosample/2181732 (and down through ...721). Which points to this project: https://www.ncbi.nlm.nih.gov/bioproject/205706

Is that what you submitted? Should I be able to see it?

marcosmro commented 5 years ago

@graybeal Thanks for your comments. The submission was targeted to NCBI's TEST folder, so it shouldn't be publicly accessible.

Regarding the message that you received, the SubmissionID and the release date correspond to my submission, but the BioSample accessions don't. The information from the links is not familiar to me either. That's not our BioProject. That BioSample was submitted in 2013.