TGAC / brassica

Brassica Information Portal
GNU General Public License v3.0

Set of data to be deposited to Zenodo for a given submission #459

Closed Nuanda closed 8 years ago

Nuanda commented 8 years ago

For PlantPopulation submissions the following records are going to be uploaded to Zenodo:

For PlantTrial submission:

Currently the plan is to store them in separate CSV files (one per DB table) and send them in a single Zenodo deposition.
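A minimal sketch of the "one CSV per DB table" idea, in Python for illustration only (it is not the portal's actual code). The table names, column names, and the `tables_to_csv_files` helper are hypothetical; the real column sets are not listed in this thread.

```python
import csv
import io


def tables_to_csv_files(tables):
    """Render each table as one CSV document, keyed by file name.

    `tables` maps a table name to a list of row dicts (an assumed shape).
    """
    files = {}
    for table_name, rows in tables.items():
        buffer = io.StringIO()
        if rows:
            # Use the keys of the first row as the CSV header.
            writer = csv.DictWriter(
                buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
            )
            writer.writeheader()
            writer.writerows(rows)
        files[f"{table_name}.csv"] = buffer.getvalue()
    return files


# Hypothetical sample data standing in for the records of one submission.
tables = {
    "plant_populations": [{"name": "Example population", "species": "Brassica napus"}],
    "plant_lines": [
        {"plant_line_name": "Line A"},
        {"plant_line_name": "Line B"},
    ],
}
csv_files = tables_to_csv_files(tables)
```

All entries of `csv_files` would then be attached to the same Zenodo deposition, so that one DOI covers the whole set.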

Is this data set correct?

teatree1212 commented 8 years ago

Plant population submission content looks correct. The PlantTrial submission will probably need modification based on the minimum requirements I added in #488.

Nuanda commented 8 years ago

Right.

teatree1212 commented 8 years ago

@Nuanda For plant trial submission to Zenodo, base the .csv file on the .csv in the 2.5 submission step and add PlantTrial table information to it. Hopefully the 2.5 step is quickly resolved after #488.

teatree1212 commented 8 years ago

I have updated the BIP email address for the Zenodo account. It has been changed from tgac.ac.uk to earlham.ac.uk.

teatree1212 commented 8 years ago

@Nuanda

Add "plant varieties" to the PlantTrial list. But basically, the template we let the user design and submit in is a good template for submission to Zenodo.

In addition to that, certain metadata is required: I sent you an email (Monday, 18 July 2016 at 10:22) about a DOI template I made for the upcoming submissions, where I researched the required metadata fields for a DOI submission. As Zenodo and CyVerse both use the DataCite standards, it should be similar. Can you tell me how far you have come with this development and whether you still need anything from me?

Nuanda commented 8 years ago

Plant Varieties (or Plant Lines) will be published in the Plant Accessions CSV.

About the metadata (see https://zenodo.org/dev#restapi-rep for Zenodo Deposition Metadata list):

teatree1212 commented 8 years ago

I'm on it.

teatree1212 commented 8 years ago

This is focusing on measurements, not image data:

Nuanda commented 8 years ago

Re. contributors there will be a text area provided to put names in line-by-line fashion. No more preprocessing though - only names (we'll mark them as Researchers in Zenodo). So no need to forcefully reuse data_provenance column there.

Re. affiliation and description - as you suggest. Also the description could be extended with additional content when depositing in Zenodo.

Re. subjects - yes, Zenodo requires them to be URLs. I will set J7 as "Subjects" and put J6 in "keywords".
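As a rough illustration of the mapping described above, the sketch below (Python, not the portal's own code) turns the line-by-line contributors text into entries typed as `Researcher`, and puts URLs under `subjects` and free text under `keywords`. The field names follow the Zenodo deposition metadata list as I understand it and should be checked against the Zenodo documentation; the helper names and the sample values are hypothetical, and other required fields such as `creators` are omitted.

```python
def parse_contributors(text):
    """Turn a line-by-line text area into contributor entries.

    No preprocessing beyond trimming whitespace and skipping blank lines.
    """
    names = [line.strip() for line in text.splitlines()]
    return [{"name": name, "type": "Researcher"} for name in names if name]


def build_metadata(title, description, contributors_text, keywords, subject_urls):
    """Assemble a partial metadata payload for one deposition (assumed shape)."""
    return {
        "metadata": {
            "upload_type": "dataset",
            "title": title,
            "description": description,
            "contributors": parse_contributors(contributors_text),
            # Free-text terms go to keywords.
            "keywords": list(keywords),
            # Subjects must be identified by URLs.
            "subjects": [
                {"term": term, "identifier": url, "scheme": "url"}
                for term, url in subject_urls
            ],
        }
    }


# Hypothetical values; the subject URL is a placeholder, not a real vocabulary entry.
payload = build_metadata(
    title="Example plant trial",
    description="Example description",
    contributors_text="Doe, Jane\n\n  Smith, John \n",
    keywords=["brassica"],
    subject_urls=[("Brassica", "http://example.org/subjects/brassica")],
)
```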

teatree1212 commented 8 years ago

contributors: okay, great

affiliation and description: okay. Question: with your "also..." sentence, do you mean you will add a feature where the person is asked to fill out additional fields when clicking "request DOI" (to Zenodo)?

subjects: understood. Okay.

Nuanda commented 8 years ago

Yes - the user is able to alter the Name and the Description, and add the list of Contributors.

teatree1212 commented 8 years ago

okay, great solution!

Nuanda commented 8 years ago

The current prototype preview:

[screenshot: mbb_screenshot1]

teatree1212 commented 8 years ago

can I suggest some edits:

the idea is that the user doesn't actually notice that this is all happening via a third party service.

call the header (above the green line): "request DOI for dataset"

in the first blue box:

beneath the title field:
"e.g. a name by which the dataset is known. It can be the plant trial name you submitted to BIP, or it can be the title of your upcoming publication in which you want to cite this dataset."

Description:

Otherwise: I wouldn't bother with the Description part and would recycle the trial_description part.

Contributors: just be aware that Zenodo asks for the names in this pattern: [screenshot 2016-07-29 at 14.41.45]. Therefore maybe write something like: "Please provide a list of the contributing researchers who helped generate this dataset. List each name on a separate line, in this fashion: Family name, Given name" - unless you are going to parse it accordingly.

Nuanda commented 8 years ago

Actually, through all these months I had no idea that you wanted to hide the fact that BIP uses Zenodo for deposition. Anyway, this can't be done, since: (1) in case of errors (for instance, an outage of Zenodo) we need to report that back to the user, and (2) assigned DOIs resolve to Zenodo, so it's immediately obvious to the users that we use Zenodo.

Initial values for Title and Description are copied from PlantTrial columns. The user can leave them as they are or change them.

Zenodo accepts names in another order as well, so no problem there.

Also, some things like contributors or subjects are not visible on the Zenodo deposition page. They are however in the export JSON you can get from Zenodo for each deposition.

teatree1212 commented 8 years ago

I hear your irritation. I think Wiktor's intention is not to confuse people who are nearly at the end of submitting their data to BIP by suddenly telling them that their dataset is being sent somewhere else on top of being submitted to BIP.

first blue box: "This action will create a permanent Digital Object Identifier (DOI) for proper citation of this submission data (and metadata related to it). The DOI is assigned through an external service called Zenodo.org. One of the criteria for a dataset to receive a DOI is that it remains "stable" after submission. Therefore, the data you submit to receive a DOI will be exported to Zenodo to ensure that it remains as it is presented at the time of the DOI request. If you add any content to your dataset in the BIP after requesting a DOI, but want that information to be part of the stable data for your publication, you need to request a new identifier through this action. This means that the altered dataset and metadata will receive a new DOI. Therefore, please prepare this submission carefully. We would like to remind you that you are responsible for the contents and specifications of your dataset."

second blue box: sounds good, maybe alter the last bit: .."which you would like to publish along with this dataset when requesting a permanent DOI from Zenodo."

Zenodo and other cross-linked facilities like NCBI and, in the future, CyVerse for image submission and analysis tools will need to be displayed somehow on the front page (which also looks great, finally BIP is "integrated") - I will think about that.

teatree1212 commented 8 years ago

green button in the bottom right corner: "request a DOI" rather than "create a deposition"

teatree1212 commented 8 years ago

How long does it take until one gets a DOI? This should probably also be mentioned somewhere for the user. You said something about 1 week, but that has something to do with BIP, correct? I know that in CyVerse, data is manually curated before it gets assigned a DOI, and that takes a while.

Nuanda commented 8 years ago

@teatree1212 Let's not confuse submission to BIP with deposition in Zenodo. This is how these two things work/will work:

  1. User creates a new submission in BIP (e.g. a Plant Trial submission) and provides data through the multi-step wizard.
  2. User submits the submission to BIP (thus finishing the wizard) and decides whether the submission goes public immediately (for BIP users) or not.
  3. If not public, the user can publish that submission later.
  4. For 1 week after publication, BIP allows the user to delete the data, or hide it as private again (the revocability period).
  5. Afterwards, the data cannot be deleted, made private or altered. Now, as the data is stable, we allow the user to deposit it in Zenodo.
  6. The deposition in Zenodo is set as public right away and the DOI number is returned immediately (no delay).
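The rule in steps 4 and 5 can be sketched as a simple date check. This is an illustrative Python sketch, not BIP's implementation; the `can_deposit` function and the `published_on` argument are hypothetical names, and the only fact taken from the list above is the 1-week revocability period.

```python
from datetime import datetime, timedelta

# Length of the revocability period described in step 4.
REVOCABILITY_PERIOD = timedelta(weeks=1)


def can_deposit(published_on, now):
    """Return True once a published submission has left the revocability period."""
    if published_on is None:
        # Never published, so the submission is still private and can change.
        return False
    return now >= published_on + REVOCABILITY_PERIOD
```

For example, a submission published on 20 July 2016 would become eligible for deposition on 27 July 2016, while an unpublished one never would.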

I am not sure how the production Zenodo server works, but I think the published deposition is not shown on the Zenodo front page right away. So I think they reserve some time for review/curation, but the DOI is assigned right away.

Nuanda commented 8 years ago

Not so much irritated as surprised. I think services like BIP might actually like to stress that they are integrated with a thing like Zenodo, that the user's data can be published there in one (or two) clicks, that the submission is therefore backed up twice, that people who don't know about BIP might see the author's submission in Zenodo, etc.

Anyway, changing "Deposit in Zenodo" to "Request a DOI" is not what I find difficult. It's the idea that "the user doesn't actually notice that this is all happening via a third party service". This is simply not possible (it's instantly obvious anyway that it happened through Zenodo - it's enough to click on the assigned DOI number) and, perhaps, not even fair to the user, who might want to know that her/his data is going to be sent somewhere else (outside BIP).

Since you (as I understand it) have later proposed to keep the information that we, in fact, use Zenodo, I have no objection to this change.

teatree1212 commented 8 years ago

okay, thanks for this.

teatree1212 commented 8 years ago

About your second comment: I agree. It is better to keep the user informed. But it is also important to stress that this doesn't mean additional work for them; it is an automated submission performed by BIP.