liulab-dfci / CHIPS

A Snakemake pipeline for quality control and reproducible processing of chromatin profiling data
MIT License

non-model species user for this software #8

Closed by zhangzhiyangcs 3 years ago

zhangzhiyangcs commented 3 years ago

Hi, thanks for this convenient software. I ran it on my own species and hit an issue: how do I produce a file equivalent to GDC_hg38.refGene from my genome annotation files? I don't understand the meaning of the #bin column in the content below. Do you have a script that converts a genome annotation file directly into this format? Thanks a lot.

"""
#bin name chrom strand txStart txEnd cdsStart cdsEnd exonCount exonStarts exonEnds score name2 cdsStartStat cdsEndStat exonFrames
0 ENST00000371007.5 chr1 - 67092164 67231852 67093004 67127240 8 67092164,67095234,67096251,67115351,67125751,67127165,67131141,67231845, 67093604,67095421,67096321,67115464,67125909,67127257,67131227,6723185
0 ENST00000371006.4 chr1 - 67092175 67127261 67093004 67127240 6 67092175,67095234,67096251,67115351,67125751,67127165, 67093604,67095421,67096321,67115464,67125909,67127261, 0 C1orf141 cmpl cmpl
"""

zhangzhiyangcs commented 3 years ago

OK, I figured it out: the file is in GenePred table format, which can be generated from a GTF annotation with gtfToGenePred: ""gtfToGenePred -genePredExt -ignoreGroupsWithoutExons -geneNameAsName2 test.gtf test.gpd""
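One point the command above leaves open is the #bin column: gtfToGenePred writes the extended GenePred columns starting at name, without a leading bin field. In UCSC tables, bin is an index computed from txStart and txEnd that speeds up range queries. The following is a minimal sketch, assuming the pipeline expects the bin column and header line to be present (this thread does not confirm that); the function names `bin_from_range` and `add_bin_column` are hypothetical helpers, not part of CHIPS or the UCSC tools. It uses the standard UCSC binning scheme, which covers coordinates up to 512 Mb.

```python
# Offsets of the five bin levels in the standard UCSC binning scheme
# (128 kb, 1 Mb, 8 Mb, 64 Mb and 512 Mb bins).
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)
FIRST_SHIFT = 17  # 2**17 = 128 kb, the size of the smallest bin
NEXT_SHIFT = 3    # each level up is 8 times larger

# Header as it appears in the refGene excerpt quoted in this issue.
HEADER = "\t".join([
    "#bin", "name", "chrom", "strand", "txStart", "txEnd", "cdsStart",
    "cdsEnd", "exonCount", "exonStarts", "exonEnds", "score", "name2",
    "cdsStartStat", "cdsEndStat", "exonFrames",
])


def bin_from_range(start, end):
    """Return the smallest standard UCSC bin containing [start, end)."""
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    raise ValueError(f"range {start}-{end} is outside the standard bin scheme")


def add_bin_column(lines, with_header=True):
    """Yield GenePred lines with a bin column prepended.

    `lines` is any iterable of tab-separated genePredExt records as written
    by gtfToGenePred (name, chrom, strand, txStart, txEnd, ...).
    """
    if with_header:
        yield HEADER
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and any existing header
        fields = line.split("\t")
        tx_start, tx_end = int(fields[3]), int(fields[4])
        yield f"{bin_from_range(tx_start, tx_end)}\t{line}"
```

To use it, open test.gpd, pass the file object to `add_bin_column`, and write each yielded line to the output file. For the first transcript quoted above (txStart 67092164, txEnd 67231852), the scheme gives bin 0, which matches the value shown in the excerpt.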