BackofenLab / Cherri

https://backofenlab.github.io/Cherri/
GNU General Public License v3.0

Find homologs in human and mouse training data #32

Open teresa-m opened 2 years ago

teresa-m commented 2 years ago

Idea: find a few examples of homologous RRIs in the human and mouse training data. A mouse homolog x can then be tested to see whether the 'human model' predicts it correctly, and vice versa.

TODO:

pavanvidem commented 2 years ago

First, it would be better to find some evolutionarily conserved interactions in the literature. For example, the interaction between U1 snRNA and MALAT1 lncRNA is conserved between human and mouse (see the original PARIS paper). A handful of such interactions is enough to show that the models are robust enough to detect interactions conserved across species.

domonik commented 2 years ago

Find homologs via the Ensembl REST API (https://rest.ensembl.org/), using the homology endpoint: https://rest.ensembl.org/documentation/info/homology_ensemblgene
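A minimal sketch of such a lookup, using only the Python standard library. The path layout `/homology/id/{species}/{gene_id}`, the query parameters, and the shape of the JSON response are assumptions based on my reading of the documentation linked above and should be checked against it; `homology_url`, `fetch_json` and `ortholog_ids` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def homology_url(gene_id, species="human", target_species="mouse"):
    """Build a homology lookup URL (path layout is an assumption, check the docs)."""
    path = f"/homology/id/{urllib.parse.quote(species)}/{urllib.parse.quote(gene_id)}"
    query = urllib.parse.urlencode({
        "target_species": target_species,  # restrict results to one species
        "type": "orthologues",             # skip paralogues
        "format": "condensed",             # IDs only, no alignments
    })
    return f"{ENSEMBL_REST}{path}?{query}"


def fetch_json(url):
    """Perform one GET request and decode the JSON body."""
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def ortholog_ids(response):
    """Collect target gene IDs from a homology response.

    Assumes the response looks like {"data": [{"homologies": [...]}]}, where
    each homology carries either a top-level "id" (condensed format) or a
    nested "target" object with an "id" (full format).
    """
    ids = []
    for entry in response.get("data", []):
        for hom in entry.get("homologies", []):
            target_id = hom.get("id") or hom.get("target", {}).get("id")
            if target_id:
                ids.append(target_id)
    return ids
```

Usage would then be `ortholog_ids(fetch_json(homology_url(gene_id)))` for each human gene ID in the training data, and the same with `species` and `target_species` swapped for the mouse genes.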

This is only possible for genes, not for transcripts, so the most likely transcript has to be selected, e.g. via pairwise alignment. All transcripts of a gene can be extracted via the API (https://rest.ensembl.org/documentation/info/overlap_id), as in the example provided there with the flag feature=transcript; afterwards, extract the transcript sequence (https://rest.ensembl.org/documentation/info/sequence_id). Make sure to get the spliced version via: /sequence/id?type=cdna
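The transcript selection step could be sketched as follows. This is not a real pairwise alignment: it uses `difflib.SequenceMatcher` from the standard library as a simple stand-in similarity score, and a proper aligner would be the better choice for long transcripts. The function names and the input format (a dict mapping transcript ID to spliced sequence) are assumptions for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity ratio in [0, 1] between two sequences (case-insensitive)."""
    # autojunk=False: nucleotides repeat constantly, so the "popular element"
    # heuristic would otherwise discard them in sequences of 200+ characters.
    return SequenceMatcher(None, a.upper(), b.upper(), autojunk=False).ratio()


def pick_most_similar_transcript(reference_seq, candidates):
    """Return (transcript_id, score) of the candidate closest to reference_seq.

    candidates: dict mapping transcript ID -> spliced (cDNA) sequence.
    """
    if not candidates:
        raise ValueError("no candidate transcripts given")
    scored = {tid: similarity(reference_seq, seq) for tid, seq in candidates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]
```

Here `reference_seq` would be the sequence of the interacting RNA from the training data of one species, and `candidates` the cDNA sequences of all transcripts of the homologous gene in the other species.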

Disadvantage: this uses a lot of API calls and might take some time.
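One way to soften this would be to cache responses and space out the requests, so that repeated lookups of the same gene or transcript cost nothing and the server is not flooded. The sketch below is an illustration only: the class name is hypothetical, the default interval is an arbitrary assumption rather than a documented limit, and `fetch` stands for any function that takes a URL and returns the decoded response.

```python
import time


class CachedThrottledFetcher:
    """Wrap a fetch function with an in-memory cache and a minimum call spacing."""

    def __init__(self, fetch, min_interval=0.1):
        self._fetch = fetch                # callable: url -> response
        self._min_interval = min_interval  # seconds between real requests
        self._cache = {}
        self._last_call = None

    def get(self, url):
        # Serve repeated lookups from the cache without touching the network.
        if url in self._cache:
            return self._cache[url]
        # Wait if the previous real request was too recent.
        if self._last_call is not None:
            wait = self._min_interval - (time.monotonic() - self._last_call)
            if wait > 0:
                time.sleep(wait)
        result = self._fetch(url)
        self._last_call = time.monotonic()
        self._cache[url] = result
        return result
```

The cache lives only for the duration of one run; writing the collected responses to disk once would avoid repeating the calls in later runs.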