saezlab / OmnipathR

R client for the OmniPath web service
https://r.omnipathdb.org/

PubMed IDs for edge interactions #15

Closed bio-bean closed 4 years ago

bio-bean commented 4 years ago

Hi, thanks so much for making this package, it's really great and helpful!

I was wondering if there is a way to extract the PubMed IDs for edge interactions, as in the PyPath module for Python? I can't seem to find them in the interaction databases that I can download, so I wasn't sure whether this is a download parameter or whether the option isn't available in the R package.

I've been downloading (all) interaction data like so:

interactions <- import_AllInteractions(from_cache_file = NULL,
                                        filter_databases = get_interaction_databases(),
                                        select_organism = 9606)

Thanks very much!

deeenes commented 4 years ago

Hi @bio-bean,

I am doing some refactoring and development on OmnipathR these days, so I am going to have a look at this as well. Looking at the current source code: https://github.com/saezlab/OmnipathR/blob/master/R/import_Omnipath.R#L376 I see fields=references is there, which means you should have a references column with the PubMed IDs for each interaction, shouldn't you?

Best,

Denes

bio-bean commented 4 years ago

Hi Denes,

When I download the data as above, I get a references column, but its entries refer to specific databases followed by a number (possibly an ID internal to OmniPath?), like so:

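| source | target | source_genesymbol | target_genesymbol | is_directed | is_stimulation | is_inhibition | consensus_direction | consensus_stimulation | consensus_inhibition | dip_url | sources | references | nsources | nrefs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| P0DP23 | P48995 | CALM1 | TRPC1 | 1 | 0 | 1 | 1 | 0 | 1 | | TRIP | TRIP:11290752;TRIP:11983166;TRIP:12601176 | 1 | 3 |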

It's great to know which database they came from, but I haven't been able to trace them back to the PubMed IDs or the literature references themselves, especially for sources like BioGRID or PhosphoSite.

Thanks so much for getting back to me!

deeenes commented 4 years ago

These are PubMed IDs: the number after each resource label is the ID. The first one in your example points to a paper about TRP channels: https://pubmed.ncbi.nlm.nih.gov/11290752/ We added the resource names recently because someone asked for it. I will add an option to OmnipathR to remove the resource labels.
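Until such an option exists, the labels can be stripped by hand. The following is only a minimal sketch in base R, assuming the references column has the `RESOURCE:PMID;RESOURCE:PMID` layout shown in the example row above. The helper name `strip_resource_labels` and the placeholder row are hypothetical and not part of OmnipathR.

```r
# Toy data frame: the first row copies the example from this thread,
# the second row is a made-up placeholder with no references.
interactions <- data.frame(
  source_genesymbol = c("CALM1", "GENE_A"),
  target_genesymbol = c("TRPC1", "GENE_B"),
  references = c("TRIP:11290752;TRIP:11983166;TRIP:12601176", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not an OmnipathR function): split each entry on ";",
# drop everything up to the first ":" and remove duplicate IDs.
strip_resource_labels <- function(refs) {
  lapply(strsplit(refs, ";", fixed = TRUE), function(x) {
    x <- x[!is.na(x) & nzchar(x)]   # ignore missing and empty entries
    unique(sub("^[^:]*:", "", x))   # "TRIP:11290752" becomes "11290752"
  })
}

# One character vector of PubMed IDs per interaction
pmids <- strip_resource_labels(interactions$references)

# Collapse back into a single ";"-separated column without labels
interactions$pubmed_ids <- vapply(pmids, paste, character(1), collapse = ";")

# PubMed links for the first interaction
pubmed_urls <- paste0("https://pubmed.ncbi.nlm.nih.gov/", pmids[[1]], "/")
print(pubmed_urls)
```

Deduplicating after the labels are removed matters when several resources cite the same paper for one interaction, since the same ID would otherwise appear once per resource.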

bio-bean commented 4 years ago

Oh gosh! Thank you so much for pointing that out to me, sorry for missing this!

bio-bean commented 4 years ago

Hi, sorry, I just have a follow-up question related to this. When I look in the interaction databases (as above), there is a source in the sources column called "Wang", but nothing in the references column. I was wondering what source this is; I can't seem to find anything on it.

deeenes commented 4 years ago

Hi, the Wang label refers to this resource: http://omnipathdb.org/info#HSN It doesn't provide PubMed IDs, hence the references column is empty unless there are references from another resource. Wang (HSN) is supposed to be literature curated, although they don't disclose much information about their methods and provide the interactions with very minimal attributes (only direction and sign).
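To see which resources contribute interactions without any references, something along these lines may help. It is a sketch in base R, assuming the `sources` and `references` columns are ";"-separated strings as in the table earlier in this thread. Every row except the first is a made-up placeholder, including the `TRIP:00000000` reference.

```r
# Toy data frame: only the first row comes from this thread; the other
# rows (and the "TRIP:00000000" reference) are hypothetical placeholders.
interactions <- data.frame(
  source_genesymbol = c("CALM1", "GENE_A", "GENE_B"),
  target_genesymbol = c("TRPC1", "GENE_B", "GENE_C"),
  sources = c("TRIP", "Wang", "Wang;TRIP"),
  references = c(
    "TRIP:11290752;TRIP:11983166;TRIP:12601176",
    "",
    "TRIP:00000000"
  ),
  stringsAsFactors = FALSE
)

# An interaction is unreferenced if its references entry is missing or empty
no_refs <- is.na(interactions$references) | !nzchar(interactions$references)
unreferenced <- interactions[no_refs, ]

# Count how often each resource appears among the unreferenced interactions
resource_counts <- table(unlist(strsplit(unreferenced$sources, ";", fixed = TRUE)))
print(resource_counts)
```

In this toy data the third row lists Wang as a source but still has a reference from another resource, so only the second row counts as unreferenced.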