vierstralab / motif-clustering

Clustering motif models to remove redundancy

Archetype PFM #2

Closed fgualdr closed 2 years ago

fgualdr commented 3 years ago

Hi,

Thanks for adding this great resource to the field. I was looking for a PFM or PWM file containing the archetype motifs from the Nature paper. Is running the whole script the only way to get that information, or are the motifs already included in the available resources?

Thanks a lot

jvierstra commented 3 years ago

The archetype motifs don't exist as a single PFM/PWM. They represent a collection (i.e., a cluster) of motifs, in which each constituent model is referenced to the cluster.

In the clustering file (https://resources.altius.org/~jvierstra/projects/motif-clustering/releases/v1.0/motif_annotations.xlsx), you will find two worksheets ("Archetype clusters" and "Motifs"). The "Archetype clusters" worksheet lists the 286 clusters and, for each, an arbitrarily chosen "seed motif". The "Motifs" worksheet gives the relationship of each motif to its cluster. The "left_offset" and "right_offset" columns are relative to the cluster seed motif.
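To make the motif-to-cluster relationship concrete, here is a minimal sketch that groups rows shaped like the "Motifs" worksheet by cluster. Only "left_offset" and "right_offset" are column names taken from the description above; "motif_id" and "cluster_id" are hypothetical names, and the rows are made-up toy data rather than values from the spreadsheet. In practice the rows could be loaded from the file first, for example with pandas.read_excel(..., sheet_name="Motifs").

```python
from collections import defaultdict

# Toy rows standing in for the "Motifs" worksheet. "motif_id" and
# "cluster_id" are hypothetical column names; all values are invented.
motif_rows = [
    {"motif_id": "motifA", "cluster_id": 1, "left_offset": 0, "right_offset": 0},
    {"motif_id": "motifB", "cluster_id": 1, "left_offset": 2, "right_offset": 1},
    {"motif_id": "motifC", "cluster_id": 2, "left_offset": 0, "right_offset": 3},
]


def members_by_cluster(rows):
    """Map each cluster id to its member motifs and their offsets."""
    clusters = defaultdict(list)
    for row in rows:
        clusters[row["cluster_id"]].append(
            (row["motif_id"], row["left_offset"], row["right_offset"])
        )
    return dict(clusters)


archetypes = members_by_cluster(motif_rows)
for cluster_id, members in sorted(archetypes.items()):
    print(cluster_id, members)
```

Each archetype is then simply the list of member motifs for a cluster, together with the offsets that place each member relative to the seed.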

The genome-wide scans provided are the result of scanning each motif model independently across the genome and then adjusting the strand and coordinates by the values relative to the cluster. Duplicate instances of the adjusted motif matches are then collapsed.
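The adjust-then-collapse idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used to produce the released scans: the sign convention for the offsets, the swap of offsets on the minus strand, and the hypothetical `flip` flag for motifs oriented opposite to the seed are all guesses, and the matches are toy data.

```python
def to_archetype(match, left_offset, right_offset, flip=False):
    """Shift one motif match into archetype (cluster) coordinates.

    ASSUMPTION: offsets extend the match outward to the archetype's
    boundaries, and are swapped for minus-strand matches. `flip` is a
    hypothetical flag for motifs oriented opposite to the seed.
    """
    chrom, start, end, strand = match
    if strand == "+":
        start, end = start - left_offset, end + right_offset
    else:
        start, end = start - right_offset, end + left_offset
    if flip:
        strand = "-" if strand == "+" else "+"
    return (chrom, start, end, strand)


# Toy per-motif offsets (left_offset, right_offset) for a single cluster.
offsets = {"motifA": (0, 0), "motifB": (2, 1)}

# Toy matches from scanning each motif model independently.
matches = [
    ("motifA", ("chr1", 100, 110, "+")),
    ("motifB", ("chr1", 102, 109, "+")),  # same site, seen by a shorter model
    ("motifB", ("chr1", 501, 508, "-")),
]

adjusted = [to_archetype(m, *offsets[name]) for name, m in matches]

# Collapse duplicates: identical adjusted intervals count once.
collapsed = sorted(set(adjusted))
print(collapsed)
```

In this toy case the first two matches land on the same archetype interval after adjustment, so only two distinct instances remain.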

Hope this helps.

Jeff