gogetdata / ggd-recipes

conda recipes for genomic data
MIT License

Danio rerio (zebrafish) gsort genome file #57

Closed: idaL closed this issue 5 years ago

idaL commented 5 years ago

Hi! We are trying to create recipes for ftp://ftp.ensembl.org/pub/release-91/gtf/danio_rerio/Danio_rerio.GRCz10.91.gtf.gz and ftp://hgdownload.soe.ucsc.edu/goldenPath/danRer10/bigZips/danRer10.fa.gz

Can we have the associated genome files used with gsort for them?

arq5x commented 5 years ago

Fantastic! One can get this from UCSC as well. The first 2 columns of the chromInfo.txt.gz file give the chrom and length for use with gsort.

curl -s ftp://hgdownload.soe.ucsc.edu/goldenPath/danRer10/database/chromInfo.txt.gz | gunzip -c | cut -f 1,2 | head
chr4    76625712
chr7    74082188
chr5    71715914
chr3    62385949
chr6    60272633
chr2    59543403
chr1    58871917
chr9    56892771
chr16   55381981
chr20   55370968
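
To keep the whole table rather than only the first ten rows, the same pipeline can be wrapped in a small helper. This is a minimal sketch, not the method ggd uses. It assumes the input is tab-separated with the chromosome name in column 1 and the length in column 2, as described above. The function name `make_genome_file`, the output name `danRer10.genome`, and the third-column value `placeholder` are hypothetical and chosen only for illustration.

```shell
# Sketch: read a chromInfo-style table (chrom, size, ...) on stdin and
# write a two-column, tab-separated genome file to stdout.
# make_genome_file is a hypothetical helper, not part of gsort or ggd.
make_genome_file() {
    # Keep column 1 (chromosome name) and column 2 (length in bases).
    cut -f 1,2
}

# Offline demonstration using the first two rows listed above.
# The third column is a made-up placeholder standing in for any extra fields.
sample=$(printf 'chr4\t76625712\tplaceholder\nchr7\t74082188\tplaceholder\n' | make_genome_file)
printf '%s\n' "$sample"

# Against the real file (requires network access; output name is an assumption):
# curl -s ftp://hgdownload.soe.ucsc.edu/goldenPath/danRer10/database/chromInfo.txt.gz \
#     | gunzip -c | make_genome_file > danRer10.genome
```

Dropping `head` matters here: a genome file that lists only ten chromosomes would leave the remaining sequences of the assembly without an entry.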

idaL commented 5 years ago

Thank you, @arq5x! Do you have one for the Ensembl file as well?

mikecormier commented 5 years ago

See https://github.com/gogetdata/ggd-recipes/issues/58#issuecomment-496594312 for new genome files created for Danio_rerio GRCz10 and danRer10.

mikecormier commented 5 years ago

https://github.com/gogetdata/ggd-recipes/issues/58#issuecomment-497042984