Open smondet opened 8 years ago
Did #118 fix this?
@arahuja no, I found no nice way to tell gunzip
to ignore those errors without also ignoring other potential errors.
The options are:
fasta.gz somewhere else (easy/fast work-around)
What do you think?
Hm, self-hosting seems like a solution that will have its own issues eventually, unless we just put it in this repo?
Ignoring gzip errors and computing sums is nice, but not sure how that is manageable for all downloads.
Looks like 1000genomes acknowledges this issue with the file as well: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/README_human_reference_20110707
This file is compressed by razip from the samtools package for random access. Gzip may complain "decompression OK, trailing garbage ignored", but this does not affect the correctness of the decompressed file.
I think just putting that file here or in Github LFS is the easiest for now.
@smondet what is this blocked on?
@hammer it's blocked on either 1000genomes providing a proper gz
file or us taking a decision on how to bypass the problem :)
(I'd like to implement the MD5 solution one day but self-hosting seems to me like the fastest route)
@smondet sounds like we're not blocked then, we should implement the self-hosting workaround.
Gunzip succeeds but displays `decompression OK, trailing garbage ignored` and returns `2`.
`-q` silences the warning (http://www.gzip.org/#faq8) but does not make it return `0`.
(URL: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz)
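One not-so-nice way to accept only this specific warning is to inspect both the exit status and the message. The sketch below is an illustration, not a recommended fix: `gunzip_outcome` and `gunzip_file` are hypothetical helpers, matching on the message text is fragile, and it assumes `gunzip` is on the `PATH` and is run without `-q` so the warning is still printed.

```python
import subprocess

TRAILING_GARBAGE = "trailing garbage ignored"


def gunzip_outcome(returncode, stderr):
    """Classify a gunzip run as "ok", "ok-with-trailing-garbage", or "error"."""
    if returncode == 0:
        return "ok"
    lines = [line for line in stderr.splitlines() if line.strip()]
    # Exit status 2 means "warning"; accept it only when every warning line
    # is the trailing-garbage one, so other warnings are still reported.
    if returncode == 2 and lines and all(TRAILING_GARBAGE in l for l in lines):
        return "ok-with-trailing-garbage"
    return "error"


def gunzip_file(path):
    """Run `gunzip path` and raise RuntimeError on anything but the two ok cases."""
    proc = subprocess.run(
        ["gunzip", path],
        capture_output=True,
        text=True,
        env={"LC_ALL": "C", "PATH": "/usr/bin:/bin"},  # untranslated messages
    )
    outcome = gunzip_outcome(proc.returncode, proc.stderr)
    if outcome == "error":
        raise RuntimeError(f"gunzip failed ({proc.returncode}): {proc.stderr}")
    return outcome
```

Since any other warning also exits with `2`, this still depends on the wording of gzip's message, which is why a checksum-based check or a self-hosted file may be the more robust route.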