Open krassowski opened 4 years ago
Mike: Looks like a great tool and exercise!!! Sure, we are getting into BEL/text mining, I guess?!
Fine as long as we answer the first 2 questions or alike:
[1] Do we have a problem with tools being developed on cancer data only (vs. other diseases, plants, bacteria, or environmental samples)?
[2] Are we developing methods using human data only (vs. mice, nematodes, bacteria, E. coli, rats)? Those look fine too!!!
[3] I am not sure about this 3rd question: "are any genes over represented in multi-omics research (like master regulators microRNAs/TP53)". Given the handful of "multiomics" studies in total, we might accumulate a lot of noisy results, simply because cancer is over-represented in these studies. Another problem: are these genes over-represented because they are just mentioned in the text, because they were actually found as a top hit by FC/P-value, or because they are merely cited as evidence/background knowledge? It's going to be painful to curate that. We (essentially your precious time and skills!) may not want to invest that much time in generating a figure with such challenges.
[4] Can we add a question that needs to be answered (As I could think of now!):
Question: How many papers used the combination of following multiomics ?
-genomics + transcriptomics + proteomics + metabolomics + microbiome + imaging + SNP + epigenetics + variations + etc.
-genomics + transcriptomics + proteomics + metabolomics + microbiome + imaging + SNP + epigenetics
-genomics + transcriptomics + proteomics + metabolomics + microbiome + imaging + SNP
-genomics + transcriptomics + proteomics + metabolomics + microbiome + imaging
-genomics + transcriptomics + proteomics + metabolomics + microbiome
-genomics + transcriptomics + proteomics + metabolomics
-genomics + transcriptomics + proteomics
-genomics + transcriptomics (does not qualify as 'dual omics').
-AND COMBINATIONS of all the above, such as genomics + metabolomics + microbiome, etc. (let me know if it makes sense!)
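A minimal sketch of how the combination counts in [4] could be tallied, assuming each paper has already been mapped to the set of omics types it uses (that detection step is not shown). The `papers` dictionary, its IDs, and both function names are hypothetical, made up for illustration only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical input: paper ID -> set of omics types detected for that paper.
papers = {
    "paper_a": {"genomics", "transcriptomics", "proteomics"},
    "paper_b": {"genomics", "transcriptomics"},
    "paper_c": {"genomics", "metabolomics", "microbiome"},
    "paper_d": {"genomics", "transcriptomics", "proteomics"},
}


def count_exact_combinations(papers):
    """Count papers by the exact set of omics types they use."""
    return Counter(frozenset(omics) for omics in papers.values())


def count_containing_combinations(papers, min_size=2):
    """Count papers that include each combination of at least min_size types.

    A paper using three omics types is also counted for every pair it contains.
    Keys are alphabetically sorted tuples.
    """
    counts = Counter()
    for omics in papers.values():
        ordered = sorted(omics)
        for size in range(min_size, len(ordered) + 1):
            for combo in combinations(ordered, size):
                counts[combo] += 1
    return counts


exact = count_exact_combinations(papers)
containing = count_containing_combinations(papers)

for combo, n_papers in exact.most_common():
    print(" + ".join(sorted(combo)), n_papers)
```

The two counters answer slightly different questions: `exact` gives "how many papers used precisely this combination", while `containing` gives "how many papers used at least this combination", which is probably closer to the nested list above.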
[5] Will think of more questions and come back to you for more "figures"!
Note: Keep generating the figures; the ones that remain unused here we can use for our Paper 2 (the mini review that Vivek got the invitation for!!)
Thanks a lot for this exercise!
Best, Biswa
Great point [4] on looking for combinations of different omics/data types!
Re [3] - yes, I am aware that it will not be trivial to interpret due to the over-representation of microRNA/cancer studies in the first place, but it might be fun either way. There was a study showing that the majority of studies focus on a minority of genes, and that this is a lot like a "fashion/yearly trend". Just curious if we can show something similar in the multi-omics field.
There is an API which makes it easy to retrieve pre-calculated tags for PubMed papers, with extracted bioconcepts including genes, chemicals, diseases, mutations and species: https://www.ncbi.nlm.nih.gov/research/bionlp/APIs/
This can be very useful (along with MeSH headings) to answer questions like those above.
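A rough sketch of how such bioconcept tags could be tallied per paper once retrieved, assuming the annotations come back in a BioC-JSON-like layout (`passages` -> `annotations` -> `infons` with `type` and `identifier`). The `sample_doc` below is hand-written for illustration and is not real API output; the function names are hypothetical, and the actual response format should be checked against the API documentation.

```python
from collections import Counter

# Hand-written sample imitating a BioC-JSON-like layout (illustrative only).
sample_doc = {
    "id": "0000001",  # placeholder, not a real PMID
    "passages": [
        {
            "infons": {"type": "title"},
            "annotations": [
                {"text": "TP53", "infons": {"type": "Gene", "identifier": "7157"}},
            ],
        },
        {
            "infons": {"type": "abstract"},
            "annotations": [
                {"text": "p53", "infons": {"type": "Gene", "identifier": "7157"}},
                {"text": "breast cancer",
                 "infons": {"type": "Disease", "identifier": "MESH:D001943"}},
                {"text": "human", "infons": {"type": "Species", "identifier": "9606"}},
            ],
        },
    ],
}


def concepts_in_document(doc, concept_type):
    """Return the set of identifiers of one concept type found in a document."""
    identifiers = set()
    for passage in doc.get("passages", []):
        for annotation in passage.get("annotations", []):
            infons = annotation.get("infons", {})
            if infons.get("type") == concept_type and infons.get("identifier"):
                identifiers.add(infons["identifier"])
    return identifiers


def papers_per_concept(docs, concept_type):
    """Count how many documents mention each concept at least once."""
    counts = Counter()
    for doc in docs:
        # Using a set per document means repeated mentions count only once.
        counts.update(concepts_in_document(doc, concept_type))
    return counts


print(papers_per_concept([sample_doc], "Gene"))
print(papers_per_concept([sample_doc], "Species"))
```

Counting each concept once per paper avoids inflating genes that are simply repeated many times in one abstract, but it still cannot tell a top hit from a background citation, which is the curation problem raised in [3].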