earth-metabolome-initiative / emikg

Python package to handle the complete pipeline and APIs relative to EMIKG graph
https://www.earthmetabolome.org/
MIT License

chemical-plant sparql query #2

Open YojanaGadiya opened 7 months ago

YojanaGadiya commented 7 months ago

Dear EMIKG team,

Since I have limited SPARQL query formulation skills, I was wondering if you or the community could assist me in building one to extract chemical-plant relations from the graph.

Additionally, on looking into the underlying schema of the graph, it looked like these edges come from WikiData. Is that correct?

Thank You.

Regards, Yojana

oolonek commented 7 months ago

Hi @YojanaGadiya,

Thanks for getting in touch.

The Examples section of the Wikidata Query Service https://query.wikidata.org/ is an excellent resource to start learning SPARQL.

More specifically, if you want to query the ENPKG, a good starting point would be to explore the queries presented in Table 1 of the preprint (https://doi.org/10.26434/chemrxiv-2023-sljbt-v2). Do let us know if you need help adapting any of them. And yes, some nodes do correspond to Wikidata objects, e.g. for a taxon https://enpkg.commons-lab.org/graphdb/graphs-visualizations?saved=763f2599add54363b4fdcfe4830ac63a or a molecule https://enpkg.commons-lab.org/graphdb/graphs-visualizations?saved=ac772b1f386a42b69b7786d632c12a7a.
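For the Wikidata side of the question, here is a minimal Python sketch of a compound-taxon query. It runs against the public Wikidata endpoint, not against the ENPKG graph, whose own predicates are shown in the Table 1 queries of the preprint. It relies on the Wikidata properties P703 (found in taxon), P235 (InChIKey) and P225 (taxon name). The helper names `build_query`, `parse_bindings` and `run_query`, as well as the User-Agent string, are made up for this example. The query is not restricted to plants, so it would need an extra filter on the taxon lineage for that.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata SPARQL endpoint (not the ENPKG endpoint).
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(limit=10):
    """Return a SPARQL query listing compound-taxon pairs from Wikidata."""
    return f"""
SELECT ?compound ?inchikey ?taxon ?taxonName WHERE {{
  ?compound wdt:P235 ?inchikey ;   # InChIKey
            wdt:P703 ?taxon .      # found in taxon
  ?taxon wdt:P225 ?taxonName .     # taxon name
}}
LIMIT {int(limit)}
"""


def parse_bindings(payload):
    """Flatten a SPARQL JSON result into a list of plain dicts."""
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in payload["results"]["bindings"]
    ]


def run_query(query, endpoint=WIKIDATA_ENDPOINT):
    """Send the query over HTTP GET and return the parsed rows."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/sparql-results+json",
            "User-Agent": "emikg-example/0.1",  # hypothetical identifier
        },
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return parse_bindings(json.load(response))
```

Calling `run_query(build_query(limit=5))` needs network access and returns one dict per compound-taxon pair. The query text itself can also be pasted directly into https://query.wikidata.org/ without any Python.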

YojanaGadiya commented 7 months ago

Hi @oolonek,

These are great pointers! Just one last question: are there any stats on the difference between EMIKG and LOTUS, specifically for the taxon-molecule edges?

Regards, Yojana Gadiya

oolonek commented 7 months ago

Hi @YojanaGadiya,

At the moment, the EMIKG is under construction. Let me define these acronyms here, as I understand they can be confusing.

I hope this helps. Also keep in mind that the objective of ENPKG/EMIKG is to gather and organize a massive number of putative biological source-chemical structure links established through mass spectrometry-based metabolomics experiments. The purpose of LOTUS is to gather biological source-chemical structure links that are supported by a scientific publication and were usually obtained through physical isolation of the molecule from the biological source.

Regarding current stats, here are some figures:

I hope this helps. Let us know if you have any further questions.