t-neumann / slamdunk

Streamlining SLAM-seq analysis with ultra-high sensitivity
GNU Affero General Public License v3.0

UTR regions #130

Closed ChristianRohde closed 1 year ago

ChristianRohde commented 1 year ago

Hi,

do you have any updated recommendations on how to generate a fresh 3' UTR region BED file for slamdunk? I still use the GSE100708_hg38_refseq_062016_ensemblv84_3UTR.bed file, which I downloaded from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE100708 and which refers to your initial Science publication. There you say: "Gene and 3' UTR annotations were obtained from the UCSC table browser (https://genome.ucsc.edu/cgi-bin/hgTables, June 2016). 3’ UTR annotations were assigned to Entrez GeneIDs and collapsed on a per-gene basis using bedtools’ merge command (38). For genes lacking an annotated 3' UTR, Ensembl v84 3' UTRs were added if available, resulting in a total of 58136 annotated 3' UTR intervals for 25420 genes." I wonder whether you have since set up a workflow that starts from a GTF file from either Ensembl or GENCODE. For at least some of the annotations, I have trouble understanding why the regions appear:

[Image: Screenshot 2023-04-20 at 10 23 42]

Best, Christian

t-neumann commented 1 year ago

Hi Christian,

Pooja Bhat actually published some software for this:

https://pubmed.ncbi.nlm.nih.gov/34183122/

https://github.com/AmeresLab/3-GAmES

Hope that helps

Tobi
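
For readers who want to see what the "collapsed on a per-gene basis" step quoted above amounts to, here is a minimal Python sketch. It is not the authors' original workflow and not what 3-GAmES does; it only illustrates merging 3' UTR intervals per gene. It assumes an Ensembl-style GTF in which 3' UTR records carry the feature type `three_prime_utr` and a `gene_id` attribute (other annotation sources may label UTRs differently), and the function names are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: the attribute column contains gene_id "<id>"; as in Ensembl GTFs.
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def parse_utrs(gtf_lines, feature="three_prime_utr"):
    """Collect 0-based, half-open UTR intervals keyed by (chrom, strand, gene_id)."""
    utrs = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != feature:
            continue  # keep only the requested feature type
        match = GENE_ID_RE.search(fields[8])
        if match is None:
            continue  # no gene_id attribute, nothing to group by
        chrom, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        # GTF coordinates are 1-based inclusive; BED is 0-based half-open.
        utrs[(chrom, strand, match.group(1))].append((start - 1, end))
    return utrs


def merge_intervals(intervals):
    """Merge overlapping or directly adjacent (book-ended) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def utr_bed_records(gtf_lines):
    """Return BED6 tuples (chrom, start, end, gene_id, score, strand), one per merged UTR."""
    records = []
    for (chrom, strand, gene_id), intervals in parse_utrs(gtf_lines).items():
        for start, end in merge_intervals(intervals):
            records.append((chrom, start, end, gene_id, 0, strand))
    # Plain tuple sort: chromosome names are ordered lexicographically.
    return sorted(records)
```

To produce a file, one could pass an open GTF file handle to `utr_bed_records` and write each tuple as a tab-separated line. Grouping by gene before merging keeps overlapping UTRs of different genes as separate intervals; whether that matches the original annotation in every case is not something this sketch can confirm.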