saezlab / CollecTRI

Gene regulatory network containing signed transcription factor-target gene interactions
GNU General Public License v3.0

How to filter non-academic use license resources #15

Closed romunov closed 5 months ago

romunov commented 5 months ago

To avoid using resources that are not licensed for commercial use, I would like to filter out the SIGNOR entries, as noted in the license of the CollecTRI repository:

The CollecTRI-derived regulons are freely available for academic use. For commercial use please remove TF-gene interactions from SIGNOR.

Looking at the resource file, conveniently uploaded to Zenodo (#7, thanks!), it's not entirely clear to me how to filter these entries out. I would really appreciate it if you could confirm my suspicion below.

For example, entry

source  target  weight                                     resources           references        sign_decision
NFKB1     NPPB     1.0   CollecTRI;SIGNOR_CollecTRI;TRRUST_CollecTRI   CollecTRI:15837525   default activation

lists CollecTRI;SIGNOR_CollecTRI;TRRUST_CollecTRI (TRRUST as part of DoRothEA) as resources. Am I correct to assume that the source-target pair NFKB1 -> NPPB appears in both SIGNOR and TRRUST? Going a step further, would this mean that this entry can be retained when SIGNOR resources are omitted? If so, should only the entries that list SIGNOR as their sole resource be omitted?

smuellerd commented 5 months ago

Hi @romunov,

Yes, your understanding is correct: you only have to exclude TF-gene interactions that are derived exclusively from SIGNOR. Thus, as you already described, your example would not need to be removed.

We're currently also working on incorporating this in our functions by providing the option to filter CollecTRI for non-academic use. I will make sure to let you know as soon as it becomes available.

Best regards, Sophia
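
The rule described above (drop only interactions derived exclusively from SIGNOR) could be sketched with pandas roughly as follows. This is a minimal sketch, not the maintainers' implementation. It assumes that the `resources` column is a `;`-separated list, that the bare `CollecTRI` tag marks the aggregate resource rather than an original source, and that SIGNOR-derived tags start with `SIGNOR`. The helper name `drop_signor_only` is hypothetical, and the second and third rows of the toy table are made-up placeholders for illustration; only the NFKB1 -> NPPB row comes from the example above.

```python
import pandas as pd


def drop_signor_only(df: pd.DataFrame, column: str = "resources") -> pd.DataFrame:
    """Remove rows whose original sources are exclusively SIGNOR (hypothetical helper).

    Assumptions: tags are ';'-separated, the bare 'CollecTRI' tag is the
    aggregate resource, and SIGNOR-derived tags start with 'SIGNOR'.
    """

    def is_signor_only(value: str) -> bool:
        tags = [t.strip() for t in str(value).split(";") if t.strip()]
        # Ignore the aggregate tag and look only at the original sources.
        sources = [t for t in tags if t != "CollecTRI"]
        return bool(sources) and all(t.startswith("SIGNOR") for t in sources)

    mask = df[column].map(is_signor_only)
    return df.loc[~mask].reset_index(drop=True)


# Toy table: the first row is the example from this thread; the other two
# rows are invented placeholders, not real CollecTRI entries.
example = pd.DataFrame(
    {
        "source": ["NFKB1", "TF_A", "TF_B"],
        "target": ["NPPB", "GENE_X", "GENE_Y"],
        "resources": [
            "CollecTRI;SIGNOR_CollecTRI;TRRUST_CollecTRI",
            "CollecTRI;SIGNOR_CollecTRI",
            "CollecTRI;ExTRI_CollecTRI",
        ],
    }
)

filtered = drop_signor_only(example)
print(filtered[["source", "target"]])
```

Under these assumptions the NFKB1 -> NPPB row is retained because TRRUST also supports it, while the placeholder row supported only by SIGNOR is dropped. It is worth checking the actual tag names in the downloaded file before relying on the prefix match.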

romunov commented 5 months ago

It would seem this can also be done:

import decoupler as dc
from omnipath.constants import License

ct = dc.get_collectri(organism="human", license=License.COMMERCIAL)

The list of available licenses can be found here: https://github.com/saezlab/omnipath/blob/main/omnipath/constants/_constants.py#L69

smuellerd commented 4 months ago

Thanks @romunov! SIGNOR also just changed its licence to CC BY (https://signor.uniroma2.it/, see news section), so there is no longer a need to filter it out.

Best regards, Sophia

romunov commented 4 months ago

This is so easy to miss, thank you for pointing it out!