Open dbdimitrov opened 1 year ago
@deeenes also please use the `is_ppi` flag; I found a lot of erroneous interactions between enzymes and receptors (no metabolite).
Maybe how I process it here would help:
Hi Daniel,
As far as I could find, pypath is already using the CellPhoneDB git as a source for the data (see here and then here), so I think it is already using the v5 version of the data.
What I found out now when checking this, is that although the retrieval of interactions works fine:
> from pypath.inputs import cellphonedb
> list(cellphonedb.cellphonedb_interactions())[-1]
CellphonedbInteraction(id_a='P16070', id_b='O43914', sources='CellPhoneDB', references='', interaction_type='unknown-unknown', type_a='unknown', type_b='unknown')
When you try to retrieve the ligand-receptor interactions it returns a tuple of empty sets:
> cellphonedb.cellphonedb_ligands_receptors()
(set(), set())
This seems to be an issue in how the complex annotations were being imported, which caused the ligand/receptor attributes to all be labeled as `False`. I think I fixed it in #279.
Regarding the use of the `is_ppi` flag, it seems a bit more complex to implement (and I wouldn't want to break anything), so maybe we can discuss in person and I could try to take a look into it, or we can wait for @deeenes to come back :sweat_smile:
Since this should resolve your initial question, I'll close the issue and we can discuss the `is_ppi` thing later :)
Best
@Nic-Nic thanks Nico. Though I would say the `is_ppi` flag is crucial, since there are now a lot of enzyme-enzyme interactions imported as ligand-receptors 😅
I renamed the issue and reopened since the two comments are tied. The flag was introduced along with the update of the database. 🙂
PS. Also, there is no need to implement the flag; it's simply about setting it to `False` when the resource is obtained. We don't want to include those, and I can think of only limited use for having them even if we do.
Added the flag to the import method of the interactions database from CellPhoneDB (see #281). The decision on whether or not to filter out the `False` ones is more for @deeenes to take :sweat_smile:
Since the flag is now there (once the PR is merged), you can easily then apply the filter in your code if you deem it necessary :)
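A minimal sketch of what such a client-side filter could look like. It assumes the interaction records expose a boolean `is_ppi` attribute once the PR is merged. The `Interaction` tuple below is a hypothetical stand-in for pypath's `CellphonedbInteraction`, not its real definition, and the second record uses made-up identifiers.

```python
from collections import namedtuple

# Hypothetical record type: a subset of the fields shown earlier in this
# thread, plus an assumed boolean `is_ppi` field.
Interaction = namedtuple(
    'Interaction',
    ['id_a', 'id_b', 'sources', 'is_ppi'],
)


def keep_ppi(interactions):
    """Return a generator over the records flagged as protein-protein."""
    return (i for i in interactions if i.is_ppi)


records = [
    Interaction('P16070', 'O43914', 'CellPhoneDB', True),
    # Made-up identifiers standing in for an enzyme-receptor (non-PPI) pair.
    Interaction('ENZYME_X', 'RECEPTOR_Y', 'CellPhoneDB', False),
]

ppi_only = list(keep_ppi(records))
```

With real data, `records` would presumably be replaced by the output of `cellphonedb.cellphonedb_interactions()`, provided the merged version carries the flag under this name.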
Hey Nico, thanks a lot.
I think it should definitely default to `False`, or at least the clients should have it as false if possible, though that might be more work.
In short, they assume that the last production enzyme of a metabolite in one cell type and a receptor/enzyme of another translate to the metabolite-receptor interaction. I think that's too specific to be pulled by default as ligand-receptor interactions by the clients :)
Hey @deeenes @Nic-Nic,
It seems to me that the solution we discussed yesterday for liana, i.e. access the databases via the client, will not work if we don't filter the non-ppis here.
These non-ppis are either way incorporated into MetalinksDB, so for our usecases we don't need them.
So, I'm re-opening the issue. Let me know if you want me to add the line that filters the dataframe.
Daniel
Hey @dbdimitrov, you're right, having the attribute itself doesn't result in the removal of those interactions. We need two little things:
1) This is one of the few tasks that belongs to the scope of integration (between OmniPath & LIANA), so there should be one line either in LIANA or in omnipath Python that makes sure the interactions with `is_ppi=False` are removed;
2) In the OmniPath network dataset definitions, the non-PPI interactions (`is_ppi=False`) should go into a separate dataset, definitely not into the ligand-receptor one (this makes the previous point redundant, but better to be safe, it doesn't cost anything).
We'll soon take care of these
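The second point could be sketched roughly as below. This is only an illustration under stated assumptions: rows are plain dicts carrying an `is_ppi` key, and the dataset names `ligand_receptor` and `non_ppi` are hypothetical, not the names used in the OmniPath dataset definitions.

```python
def split_by_is_ppi(rows):
    """Partition interaction rows into two datasets by their `is_ppi` flag.

    Rows without the flag are treated as non-PPI here, so that nothing
    unflagged reaches the ligand-receptor dataset (an assumed, deliberately
    conservative choice).
    """
    datasets = {'ligand_receptor': [], 'non_ppi': []}
    for row in rows:
        key = 'ligand_receptor' if row.get('is_ppi', False) else 'non_ppi'
        datasets[key].append(row)
    return datasets


# Toy rows; the second and third use made-up identifiers.
rows = [
    {'id_a': 'P16070', 'id_b': 'O43914', 'is_ppi': True},
    {'id_a': 'ENZYME_X', 'id_b': 'RECEPTOR_Y', 'is_ppi': False},
    {'id_a': 'ENZYME_Z', 'id_b': 'RECEPTOR_Y'},  # flag missing
]

datasets = split_by_is_ppi(rows)
```

Treating a missing flag as non-PPI would drop everything from data that predates the flag, so the opposite default may be preferable there.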
Ping @deeenes, it will become time sensitive very soon :smile:
@deeenes :eyes:
@deeenes I am curious, if one is to use cellphonedb via Liana, what is the effect of this is_ppi not functioning correctly? Will we see erroneous interactions between enzymes and receptors?
It is unclear to me because it seems like an integration between omnipath and liana, and this issue may have been addressed in the liana repo since this discussion.
Hey Denes,
Recently, CellPhoneDB got bumped to v5, and the data is stored here: https://github.com/ventolab/cellphonedb-data/tree/master
Seems to have changed format from: https://github.com/saezlab/pypath/blob/bf81f34120b82157fa3ebc15d39b0489b97fbe5e/pypath/resources/urls.py#L1103
Let me know if I can help with this. Daniel