stjude-biohackathon / CRCminer

MIT License

PWM mapping to common gene IDs. #2

Closed j-andrews7 closed 1 year ago

j-andrews7 commented 1 year ago

PWMs rarely use actual gene IDs as their identifiers. We should pick a set or two (JASPAR HUMAN 2023, HOCOMOCO V11 CORE?) and map them to common gene identifiers (Entrez, Ensembl, symbol) for use with common annotation GTFs. This would allow simple incorporation of expression data, limiting motifs to "expressed" or active genes when expression info is provided.

A few options to do this:

This is actually trickier than it first seems and will probably require some manual annotation/correction/oversight no matter which method is used.
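
To make the idea concrete, here is a rough sketch of a name-based mapping followed by an expression filter. It is not CRCminer code: the function names, the `min_expr` threshold, and the name-parsing heuristic are all hypothetical, and the motif names and expression values are made-up examples of the naming styles involved (a HOCOMOCO-style `_HUMAN` suffix, `::` between heterodimer partners, mixed-case symbols). Anything that cannot be matched is returned separately, since that is the part that would need manual curation.

```python
def motif_name_to_symbols(name):
    """Guess gene symbols from a motif name (hypothetical heuristic)."""
    # HOCOMOCO-style names look like "CTCF_HUMAN.H11MO.0.A":
    # keep only the part before "_HUMAN".
    base = name.split("_HUMAN")[0]
    # Heterodimer motifs are often written as "MAX::MYC": split into partners.
    return [part.strip().upper() for part in base.split("::") if part.strip()]


def filter_expressed(motif_names, expression, min_expr=1.0):
    """Keep motifs whose gene(s) are all expressed at or above min_expr.

    expression: dict of gene symbol -> expression value (assumed units, e.g. TPM).
    Returns (kept, unmapped), where unmapped lists motifs with at least one
    symbol missing from the expression table (candidates for manual review).
    """
    kept = {}
    unmapped = []
    for name in motif_names:
        symbols = motif_name_to_symbols(name)
        if not symbols or any(s not in expression for s in symbols):
            unmapped.append(name)
            continue
        if all(expression[s] >= min_expr for s in symbols):
            kept[name] = symbols
    return kept, unmapped


# Made-up example inputs, for illustration only.
motifs = ["CTCF_HUMAN.H11MO.0.A", "MAX::MYC", "Tcf7", "ZNF999X"]
expression = {"CTCF": 25.0, "MAX": 10.0, "MYC": 0.2, "TCF7": 3.0}

kept, unmapped = filter_expressed(motifs, expression)
print(kept)
print(unmapped)
```

Requiring every heterodimer partner to be expressed is only one possible choice; a real mapping would also need to handle symbol aliases and motifs named after families rather than single genes, which a string heuristic like this cannot resolve.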