Closed bkille closed 3 years ago
Hey there, @bkille :)
I can't seem to recreate the trouble. I wonder if the downloads are failing to finish properly for some reason. I wrote this a while ago, when I was even worse at coding than I am now, ha, and the program doesn't really check that files downloaded without error. Right now it only checks that there's something in the downloaded file, so as currently written it wouldn't tell us if a download started and then failed partway.
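For illustration, a stricter check than "the file isn't empty" could look something like the sketch below. This is not what the program currently does; `check_gz` is a hypothetical name, and it assumes `gzip` is on the PATH. It relies on `gzip -t`, which tests the integrity of a compressed file and exits non-zero if the data is truncated or corrupt.

```shell
# Hypothetical helper, not part of the tool: succeeds only if the file
# is non-empty AND passes gzip's own integrity test.
check_gz() {
    [ -s "$1" ] || return 1       # missing or empty file
    gzip -t "$1" 2>/dev/null      # non-zero exit status on truncated or corrupt data
}

# Example use after a download (file name taken from this thread):
# check_gz GCA_006538345.1.fa.gz || echo "download looks incomplete or corrupt" >&2
```

A download that started and then died would pass the "non-empty" test but should fail the `gzip -t` step.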
Can you try with just this accession in the target accessions file and let me know some things (it's a smaller one so it'll be quicker for testing):
GCA_006538345.1
file GCA_006538345.1.fa.gz
gunzip GCA_006538345.1.fa.gz
will give you the same error you've seen so far, but works for me and is 4676621 after uncompressing
curl -o GCA_006538345.1.fa.gz ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/006/538/345/GCA_006538345.1_ASM653834v1/GCA_006538345.1_ASM653834v1_genomic.fna.gz
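If it's easier, the size and integrity checks can be bundled into one function so the output can be pasted straight back here. This is only a sketch: `gz_report` is a made-up name, it assumes `gzip`, `wc` and `tr` are available, and it reports byte counts, which should be comparable to the sizes quoted in this thread assuming those are byte counts too. It does not replace running `file` on the download.

```shell
# Hypothetical helper: print compressed and uncompressed sizes for one .gz file.
gz_report() {
    # Compressed size in bytes (tr strips the padding some wc versions add).
    echo "compressed: $(wc -c < "$1" | tr -d ' ')"
    # Only report an uncompressed size if the file passes gzip's integrity test.
    if gzip -t "$1" 2>/dev/null; then
        echo "uncompressed: $(gzip -dc "$1" | wc -c | tr -d ' ')"
    else
        echo "uncompressed: failed integrity test"
    fi
}

# Example use (file name taken from this thread):
# gz_report GCA_006538345.1.fa.gz
```

Because it decompresses to standard output with `gzip -dc`, the downloaded file is left in place, unlike a plain `gunzip`.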
Yup, same error w/ just that file...
1387386 : GCA_006538345.1.fa.gz
GCA_006538345.1.fa.gz: gzip compressed data, from Unix
Seems like something specific to our server... :thinking: I was able to get the commands to work on a separate device. Still not sure what's going on though lol. Sorry for the false alarm :slightly_smiling_face:
I have a file containing accession IDs named mouse_acc.txt. Note that I get this "invalid compressed data--format violated" error for all downloaded .gz files. I've also tried running with the accessions and commands from the docs. Am I doing something wrong, or is this a classic case of NCBI changing things up and breaking people's code? :slightly_smiling_face: