FAIRplus / the-fair-cookbook

The FAIR cookbook, containing recipes to make your data more FAIR. Find the rendered version on:
https://faircookbook.elixir-europe.org/

Create a recipe detailing how to use the BioSamples implementation of Bioschemas in the context of the IMI EBiSC project #112

Open proccaserra opened 4 years ago

proccaserra commented 4 years ago

Discussed during the squad-lead call on 2020-07-09: there is an ongoing discussion about biobanks dealing with stem cells and the deposition process to BioSamples. BioSamples serves Bioschemas-based JSON-LD. The purpose of the recipe is to provide an applied example of how Bioschemas markup is performed in the wild, anchored to an IMI example.

Discuss 2 situations:

  1. push data to the BioSamples repository and benefit from the implementation at the EBI repo.
  2. pull practice (i.e. the BioSamples JSON-LD template or associated code) for local implementation.

ghost commented 4 years ago

Dear Recipe team, please submit an abstract on the scope of this recipe, or decline the proposal to write this recipe. @mcourtot @FuqiX @tburdett

Until then, I will unassign you from this topic.

FuqiX commented 4 years ago

Hi @proccaserra, I am a bit confused about what should be included in the recipe. Could you please provide more details?

I can see two parts:

  1. Examples of downloading the Bioschemas format from BioSamples.
  2. Wrapping up the BioSamples Bioschemas implementation code, as an example of adopting Bioschemas?

Should

Thanks for the clarification.

proccaserra commented 4 years ago

@FuqiX I guess the content is dictated by how you (i.e. EBI BioSamples) are currently interacting with IMI EBiSC/EBiSC2. From the earlier f2f meeting, I was under the impression that a submission/deposition workflow was already in place between EBiSC and BioSamples, with a metadata profile for 'stem cells' already established. Since BioSamples has implemented a Bioschemas-annotated JSON-LD, that would cover your first point. You may still need to indicate what is 'stem cell' specific or, if there is nothing specific, state that the JSON-LD reuses a 'generic' Bioschemas annotation to allow for search engine indexing.

But I vaguely recall that EBiSC2 representatives indicated that the new project requires stem cells to be further characterized (genotype, cell surface markers, epigenetic marks), vastly broadening the metadata profile associated with the 'sample'.

Let me know if this helps. @tburdett @mcourtot, feel free to chime in.

FuqiX commented 4 years ago

Thanks for the clarification.

That's correct. The current "dataflow" is:

  1. EBiSC data (JSON) -> BioSD (JSON)
  2. BioSD (JSON) -> BioSD (Bioschemas JSON-LD)

EBiSC can download their bioschema by combining these two steps.
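For illustration only, here is a minimal Python sketch of the second step (BioSD JSON to Bioschemas-style JSON-LD). It is not the BioSD code. The input shape (`accession`, `name`, `characteristics` with `text` and `ontologyTerms`), the output layout (`Sample` with `additionalProperty` entries), the `biosample:` identifier prefix, and the function name `to_bioschemas_jsonld` are all assumptions made for the example, and the accession is a placeholder.

```python
import json


def to_bioschemas_jsonld(sample: dict) -> dict:
    """Map a BioSamples-style JSON record to a Bioschemas-style JSON-LD dict.

    Hypothetical helper: field names on both sides are assumptions for
    illustration, not the actual BioSD implementation.
    """
    properties = []
    for name, values in sample.get("characteristics", {}).items():
        for value in values:
            prop = {"@type": "PropertyValue", "name": name, "value": value["text"]}
            terms = value.get("ontologyTerms", [])
            if terms:
                # Keep ontology annotations as references on the property.
                prop["valueReference"] = [
                    {"@type": "DefinedTerm", "@id": term} for term in terms
                ]
            properties.append(prop)
    return {
        "@context": "https://schema.org",  # simplified context (assumption)
        "@type": "Sample",
        "identifier": f"biosample:{sample['accession']}",
        "name": sample["name"],
        "additionalProperty": properties,
    }


# Placeholder record; the accession and values are made up for the example.
example = {
    "accession": "SAMEA0000000",
    "name": "example iPSC line",
    "characteristics": {
        "organism": [
            {
                "text": "Homo sapiens",
                "ontologyTerms": ["http://purl.obolibrary.org/obo/NCBITaxon_9606"],
            }
        ],
        "cell type": [{"text": "induced pluripotent stem cell"}],
    },
}

print(json.dumps(to_bioschemas_jsonld(example), indent=2))
```

Every characteristic becomes a generic `PropertyValue`, which is why a mapping of this kind stays generic: nothing in it is specific to stem cells.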

The challenge is that the BioSD Bioschemas profile is very generic. EBiSC might need a more specific profile, e.g. one that captures "stem cell"-specific attributes. We have had initial discussions with EBiSC.

So the recipe draft will include:

  1. what is the BioSD bioschema profile
  2. how to use the BioSD tools to generate a bioschema JSON-LD
  3. how to extend the BioSD generic profile to more specific ones based on BioSD, e.g. EBiSC-BioSD profile
  4. how to reuse BioSD pipeline to implement bioschema, without interacting with BioSD
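
As a rough sketch of what point 3 could look like, a more specific profile can be expressed as the generic profile plus extra required properties, and a JSON-LD record can be checked against it. The property names below are assumptions taken from the characterization mentioned earlier in this thread (genotype, cell surface markers, epigenetic marks). They are not an agreed EBiSC-BioSD profile, and `missing_properties` is a hypothetical helper.

```python
# Hypothetical profiles: sets of property names a record is expected to carry.
GENERIC_REQUIRED = {"organism"}
STEM_CELL_REQUIRED = GENERIC_REQUIRED | {
    "genotype",
    "cell surface marker",
    "epigenetic mark",
}


def missing_properties(jsonld: dict, required: set) -> set:
    """Return the required property names absent from a JSON-LD sample record."""
    present = {prop.get("name") for prop in jsonld.get("additionalProperty", [])}
    return required - present


# Made-up record for the example; values are placeholders.
record = {
    "@type": "Sample",
    "name": "example iPSC line",
    "additionalProperty": [
        {"@type": "PropertyValue", "name": "organism", "value": "Homo sapiens"},
        {"@type": "PropertyValue", "name": "genotype", "value": "not provided"},
    ],
}

print(sorted(missing_properties(record, GENERIC_REQUIRED)))
print(sorted(missing_properties(record, STEM_CELL_REQUIRED)))
```

A record that satisfies the generic profile can still fail the extended one, which is the gap a stem-cell-specific profile would make visible.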

Recipe draft here: https://github.com/FAIRplus/the-fair-cookbook/blob/BioSchema_implementation_BioSD/docs/content/recipes/interoperability/bioschema_implementation_in_BioSamples.md