3UTR / DaPars2

Dynamics analysis of Alternative PolyAdenylation from RNA-seq
GNU General Public License v2.0

Generating ID_mapping_FILE #19

Open ytakemon opened 1 year ago

ytakemon commented 1 year ago

Hello,

I'm trying to generate the files needed to extract the 3'UTR from the hg19 gene annotation (see below). The instructions above show how the hg38_wholeGene_annotation.bed file was generated from the UCSC Table browser, but how was the hg38_refseq_IDmapping.txt file generated?

The UCSC Table Browser link on your wiki only shows the top half of the figure, and it isn't clear how to reach the "Select Fields from hg38.ncbiRefSeq" section, which I suspect is what generates the hg38_refseq_IDmapping.txt file.

python DaPars_Extract_Anno.py -b hg38_wholeGene_annotation.bed -s hg38_refseq_IDmapping.txt -o hg38_3UTR_annotation.bed

gadiscymraes commented 2 months ago

After playing around, I think I have the answer to this. After you have downloaded the hg38_wholeGene_annotation.bed file from the UCSC Table Browser, go back to the Table Browser, change the "output format" to "selected fields from primary and related tables", and set the "output filename" to "hg38_refseq_IDmapping.txt". Then click "get output" and you will get the field list shown in the documentation. Check the "name" and "name2" boxes and click "get output" again.
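
If it helps, here is a minimal Python sketch for sanity-checking the downloaded mapping file before passing it to DaPars_Extract_Anno.py. It is not part of DaPars2. It assumes the file is tab-separated with "name" in the first column and "name2" in the second, that any header line starts with "#", and that the BED file keeps the transcript name in its fourth column. The function names and the IDs in the demo are made up for illustration.

```python
import io


def read_id_mapping(handle):
    """Build {name: name2} from a two-column, tab-separated stream.

    Assumption: blank lines and lines starting with '#' (a header) are skipped.
    """
    mapping = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"expected two tab-separated columns, got: {line!r}")
        mapping[fields[0]] = fields[1]
    return mapping


def unmapped_bed_names(bed_handle, mapping):
    """Return BED names (4th column) that have no entry in the mapping."""
    missing = []
    for line in bed_handle:
        # Skip blank lines and the optional comment/track/browser lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 4 and fields[3] not in mapping:
            missing.append(fields[3])
    return missing


# Demo on in-memory data with placeholder IDs (hypothetical, not real transcripts).
demo_mapping = read_id_mapping(io.StringIO("#name\tname2\nTX_1\tGENE_A\nTX_2\tGENE_B\n"))
demo_bed = io.StringIO("chr1\t100\t200\tTX_1\t0\t+\nchr1\t300\t400\tTX_9\t0\t-\n")
print(unmapped_bed_names(demo_bed, demo_mapping))
```

For the real files, open hg38_refseq_IDmapping.txt and hg38_wholeGene_annotation.bed with open(...) and pass the handles in place of the io.StringIO objects. An empty result would suggest that every transcript in the BED file has a gene name in the mapping.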