Open Lidw2020 opened 1 year ago
Can you please explain how the motif PWM files can be linked to TFs? I had thought that the PWM files were named by the motif id field in the motif2tf database, but only about 40% of the PWM file names match an entry in the mgi database table. Can you please document how the PWM files should be annotated with TFs? Perhaps with an example on the repository pages for the motif2tf databases.
Hi @KyleFerchen
You are in fact linking the motifs correctly to the TF names. It's correct that only 40% of the files match with the motif2tf file.
This number might seem low. However, you have to keep in mind that some of the files contain multiple motifs (the files named metacluster_*.cb).
Best,
Seppe
@KyleFerchen Also, not all PWMs have a known or specific TF attached to them. Associating a PWM with the correct TF is a big challenge.
Thank you for clarifying!
However, I'm wondering why I can see some examples like cisbp__M08984 in the motif2tf file that don't seem to have a PWM file in the "singletons" directory. Yet, this motif is annotated in the cisbp2 database with a PWM matrix here.
Is this entry in the motif2tf file collapsed to another PWM in some way?
Not all PWMs listed in the motif to TF file are in the singletons directory as some motifs from different collections are exactly the same:
swissregulon__mm__Gfi1 = swissregulon__rn__Gfi1 = cisbp__M08905 cisbp__M08984
And swissregulon__mm__Gfi1 itself is in metacluster_48.3.
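To see which motif ids from the motif2tf file fall into this category, a check along the following lines may help. It is only a rough sketch under several assumptions that are not confirmed in this thread: that the motif2tf file is tab-separated with the motif id in the first column, that header lines start with `#`, and that singleton files are named `<motif_id>.cb`. The function names are made up for illustration.

```python
def read_motif_ids(tbl_lines):
    """Collect motif ids from the first column of a tab-separated table.

    Assumption: header/comment lines start with '#'.
    """
    ids = set()
    for line in tbl_lines:
        if not line.strip() or line.startswith("#"):
            continue
        ids.add(line.split("\t", 1)[0].strip())
    return ids


def split_by_singleton_file(motif_ids, file_names, suffix=".cb"):
    """Split motif ids into those with and without a '<motif_id><suffix>' file.

    Assumption: singleton PWM files are named after the motif id.
    """
    stems = {name[: -len(suffix)] for name in file_names if name.endswith(suffix)}
    present = sorted(m for m in motif_ids if m in stems)
    missing = sorted(m for m in motif_ids if m not in stems)
    return present, missing
```

In practice `tbl_lines` would be an open file handle for the motif2tf file and `file_names` the result of `os.listdir()` on the singletons directory. Ids reported as missing are not necessarily unannotated: as explained above, they may be identical to a motif from another collection or sit inside a metacluster file.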
Our SCENIC+ public motif collection is now available: https://resources.aertslab.org/cistarget/motif_collections/
motif2TF snapshots can be found at: https://resources.aertslab.org/cistarget/motif_collections/v10nr_clust_public/snapshots/
To create annotations for other species, try to find the ortholog in your species for the gene name given in the gene_name column (column 6) and replace it with the gene name of your species. If there is no ortholog for that gene, delete the line.
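The replace-or-delete procedure described above could be scripted roughly as follows. This is a minimal sketch, not an official tool: it assumes the motif2TF file is tab-separated, that header lines start with `#`, and that you already have an ortholog mapping from the source gene names to your species (the `orthologs` dictionary, which you have to supply yourself). The function name is hypothetical.

```python
GENE_NAME_COL = 5  # gene_name is column 6, so index 5 when counting from 0


def translate_lines(lines, orthologs, gene_col=GENE_NAME_COL):
    """Yield motif2TF lines with gene_name replaced by its ortholog.

    Lines whose gene has no entry in `orthologs` are dropped.
    Header lines (assumed to start with '#') and blank lines pass through.
    """
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= gene_col:
            raise ValueError(f"expected at least {gene_col + 1} columns: {line!r}")
        target = orthologs.get(fields[gene_col])
        if target is None:
            continue  # no ortholog known: delete the line
        fields[gene_col] = target
        yield "\t".join(fields) + "\n"
```

A typical use would be to open the snapshot file for reading and a new file for writing, then call `dst.writelines(translate_lines(src, orthologs))`. A gene with several orthologs would need extra handling (for example, emitting one line per ortholog), which this sketch does not attempt.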