lukascorey / 5-UTR-Mutation-Analysis

Creative Commons Zero v1.0 Universal

What do the motif files under motif_lists mean? #2

Closed zhangzlab closed 2 years ago

zhangzlab commented 2 years ago

Nice work! What do the motif files under the motif_lists directory represent?

cisbp_motifs_wmax.txt
compendium_motifs_wmax.txt
homer_motifs_wmax.txt
jaspar_motifs_wmax.txt
prte_motif_wmax.txt
rbpdb_motifs_wmax.txt

My guess is that cisbp_motifs_wmax.txt and compendium_motifs_wmax.txt contain RNA-binding motifs and homer_motifs_wmax.txt contains DNA-binding motifs. What about the others?

Thanks!

lukascorey commented 2 years ago

Hi, thanks for asking. Sorry, these names are not great, and I should have included this information.

cisbp_motifs_wmax.txt contains motifs pulled from the CIS-BP RNA database: http://cisbp-rna.ccbr.utoronto.ca/

compendium_motifs_wmax.txt contains RNA-binding motifs from this work: https://www.nature.com/articles/nature12311. Their data is available here: http://hugheslab.ccbr.utoronto.ca/supplementary-data/RNAcompete_eukarya/

homer_motifs_wmax.txt contains motifs from the Homer database of DNA-binding motifs, as you correctly identified: http://homer.ucsd.edu/homer/motif/

jaspar_motifs_wmax.txt contains motifs from the JASPAR database of transcription factor binding motifs: https://jaspar.genereg.net/

prte_motif_wmax.txt refers to the pyrimidine-rich translational element (PRTE), as described here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3663483/

rbpdb_motifs_wmax.txt contains motifs pulled from the RNA-binding protein database (RBPDB): http://rbpdb.ccbr.utoronto.ca/

In each case, the wmax in the file name indicates that the file contains the maximum PWM score for each motif, which is used for comparison against the sequences being evaluated. This part of the name can be disregarded.
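To illustrate what a maximum PWM score is and how it can serve as a reference when scoring sequences, here is a minimal sketch. It is not the repository's code and does not read the file format used in motif_lists: the matrix `PWM`, its values, and the helper names `max_pwm_score`, `score_window`, and `best_relative_score` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical 3-position PWM of log-odds-style scores (made-up values).
# Each entry is one motif position mapping a base to its score.
PWM: List[Dict[str, float]] = [
    {"A": 1.0, "C": -1.0, "G": -0.5, "U": 0.2},
    {"A": -0.8, "C": 1.5, "G": -1.2, "U": 0.0},
    {"A": 0.1, "C": -0.3, "G": 2.0, "U": -1.0},
]


def max_pwm_score(pwm: List[Dict[str, float]]) -> float:
    """Highest score any sequence can reach: the best base at every position."""
    return sum(max(position.values()) for position in pwm)


def score_window(pwm: List[Dict[str, float]], window: str) -> float:
    """Score a window that has the same length as the motif."""
    return sum(position[base] for position, base in zip(pwm, window))


def best_relative_score(pwm: List[Dict[str, float]], seq: str) -> float:
    """Best window score in seq, expressed as a fraction of the maximum PWM score."""
    width = len(pwm)
    if len(seq) < width:
        raise ValueError("sequence is shorter than the motif")
    best = max(
        score_window(pwm, seq[i:i + width])
        for i in range(len(seq) - width + 1)
    )
    return best / max_pwm_score(pwm)


if __name__ == "__main__":
    print(max_pwm_score(PWM))
    print(best_relative_score(PWM, "UUACGUU"))
```

Dividing by the maximum score puts motifs of different lengths and score ranges on a comparable scale, which is one plausible reason to store that value alongside each motif.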

Let me know if I can answer anything else!

zhangzlab commented 2 years ago

Thanks, Lukas, that is helpful.