create a recipe detailing how to use Biosample implementation of Bioschema in the context of IMI EBISC project

proccaserra commented 4 years ago

Discussed during the squad-lead call on 2020-07-09: there is an ongoing discussion about biobanks dealing with stem cells and the deposition process to Biosample. Biosample serves Bioschema-based JSON-LD. The purpose of the recipe is to provide an applied example of how bioschema markup is performed in the wild, anchored to an IMI example.

discuss 2 situations:

push data to Biosample repository and benefit from the implementation at EBI repo.
pull practice (i.e. Biosample JSON-LD template or associated code) for local implementation.

ghost commented 4 years ago

Dear Recipe team, please submit an abstract on the scope of this recipe, or decline the proposal to write this recipe. @mcourtot @FuqiX @tburdett

Until then, I will remove your responsibility on this topic.

FuqiX commented 4 years ago

Hi, @proccaserra I am a bit confused about what should be included in the recipe. Could you please provide more details?

I can see two parts:

Examples of downloading bioschemas format from BioSamples.
Wrap up the BioSamples bioschema implementation code as an example of adopting bioschema?

Should the following also be part of the recipe?

what's bioschema
why use bioschema
how to create a bioschema markup (e.g. for EBiSC cell lines)

Thanks for the clarification.

proccaserra commented 4 years ago

@FuqiX I guess the content is dictated by how you (i.e. EBI Biosample) are currently interacting with IMI EBiSC/EBiSC2. From the earlier f2f meeting, I was under the impression that a submission/deposition workflow was already in place between EBiSC and Biosample, with a metadata profile for 'stem cells' already established. Since Biosample has implemented a bioschema-annotated JSON-LD, that would cover your first point, although you may need to indicate what is 'stem cell' specific or, if nothing is, state that the JSON-LD reuses a 'generic' bioschema annotation to allow for search engine indexing.

But I vaguely recall that EBiSC2 representatives indicated that the new project requires stem cells to be further characterized (genotype, cell surface markers, epigenetic marks), vastly broadening the metadata profile associated with the 'sample'.

[x] Is my recollection correct?
[x] if so, then how is this handled by Biosample and Bioschema (assuming a deposition to Biosample)?
[x] if there is no deposition to Biosample but an implementation by EBiSC2 participants in their local database, do they implement a schema.org/bioschema.org JSON-LD component? Is it aligned with that of Biosample?

Let me know if this helps. @tburdett @mcourtot feel free to chime in.

FuqiX commented 4 years ago

Thanks for the clarification.

That's correct. The current "dataflow" is: 1) EBiSC data (JSON) -> BioSD (JSON); 2) BioSD (JSON) -> BioSD (bioschema JSON-LD).

EBiSC can download their bioschema by combining these two steps.
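To make step 2 a bit more concrete, here is a minimal sketch of what mapping a sample record to bioschema-style JSON-LD could look like. It is not the BioSD code: the input shape (`accession`, `name`, `characteristics` with `text` and `ontologyTerms`), the function name `to_bioschemas_jsonld`, the choice of `DataRecord`/`Sample` types and the placeholder accession are all assumptions made for illustration.

```python
import json


def to_bioschemas_jsonld(sample: dict) -> dict:
    """Map a simplified sample record to a schema.org-style JSON-LD dict.

    Hypothetical helper: the input and output shapes are assumptions,
    not the actual BioSD implementation.
    """
    properties = []
    # Sort attribute names so the output order is deterministic.
    for name, values in sorted(sample.get("characteristics", {}).items()):
        for value in values:
            prop = {"@type": "PropertyValue", "name": name, "value": value["text"]}
            terms = value.get("ontologyTerms")
            if terms:
                # Keep ontology links as plain node references.
                prop["valueReference"] = [{"@id": term} for term in terms]
            properties.append(prop)
    return {
        "@context": "http://schema.org",
        "@type": "DataRecord",
        "identifier": f"biosamples:{sample['accession']}",
        "mainEntity": {
            "@type": "Sample",
            "name": sample["name"],
            "additionalProperty": properties,
        },
    }


# Illustrative record only; SAMEA0000000 is a placeholder, not a real sample.
example = {
    "accession": "SAMEA0000000",
    "name": "example iPSC line",
    "characteristics": {
        "organism": [
            {
                "text": "Homo sapiens",
                "ontologyTerms": ["http://purl.obolibrary.org/obo/NCBITaxon_9606"],
            }
        ],
        "cell type": [{"text": "induced pluripotent stem cell"}],
    },
}

if __name__ == "__main__":
    print(json.dumps(to_bioschemas_jsonld(example), indent=2))
```

Every attribute ends up as a generic `PropertyValue`, which is why the resulting markup is generic rather than stem cell specific.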

The challenge is that the BioSD bioschema is very generic. EBiSC might need a more specific profile that captures "stem cell" specific attributes. We had initial discussions with EBiSC.

So the recipe draft will include:

what is the BioSD bioschema profile
how to use the BioSD tools to generate a bioschema JSON-LD
how to extend the BioSD generic profile to more specific ones based on BioSD, e.g. EBiSC-BioSD profile
how to reuse BioSD pipeline to implement bioschema, without interacting with BioSD
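For the "extend the generic profile" item, one possible way to think about it is as a set of extra required properties layered on top of the generic ones. The sketch below checks a JSON-LD record against such a profile. The property names in `GENERIC_REQUIRED` and `STEM_CELL_EXTRA`, the helper `missing_properties` and the record layout are hypothetical; no EBiSC-BioSD profile has been agreed yet.

```python
# Hypothetical profile definitions; the real profiles may differ.
GENERIC_REQUIRED = {"organism"}
STEM_CELL_EXTRA = {"genotype", "cell surface markers", "epigenetic marks"}


def missing_properties(record: dict, required: set) -> list:
    """Return the required property names absent from a JSON-LD record.

    Assumes attributes live under mainEntity.additionalProperty as
    PropertyValue objects with a "name" key (an assumption, see above).
    """
    entity = record.get("mainEntity", {})
    present = {prop["name"] for prop in entity.get("additionalProperty", [])}
    return sorted(set(required) - present)


# Illustrative record carrying only some of the stem cell attributes.
record = {
    "@context": "http://schema.org",
    "@type": "DataRecord",
    "mainEntity": {
        "@type": "Sample",
        "name": "example iPSC line",
        "additionalProperty": [
            {"@type": "PropertyValue", "name": "organism", "value": "Homo sapiens"},
            {"@type": "PropertyValue", "name": "genotype", "value": "wild type"},
        ],
    },
}

if __name__ == "__main__":
    # The generic profile is satisfied; the extended one is not.
    print(missing_properties(record, GENERIC_REQUIRED))
    print(missing_properties(record, GENERIC_REQUIRED | STEM_CELL_EXTRA))
```

A record that passes the generic check can still fail the extended one, which is the gap a more specific profile would need to close.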

Recipe draft here: https://github.com/FAIRplus/the-fair-cookbook/blob/BioSchema_implementation_BioSD/docs/content/recipes/interoperability/bioschema_implementation_in_BioSamples.md

FAIRplus/the-fair-cookbook #112