Open lparsons opened 5 years ago
I think we should probably disregard filetypes sent by external parties at this point. Seems we'd be better off relying on our sniffers.
I'd be in favor of an additional flag to force override of sniffers. That way servers that aren't updated (ENA) would get the new behavior, but things that want to be specific still could. The main issue I see is that fastq sniffers generally assign type "fastq" and not "fastqsanger", rendering the files useless without a completely pointless Fastq Groomer run. Unless that behavior has changed?
We are sniffing fastqsanger since https://github.com/galaxyproject/galaxy/pull/4237 (does not cover all cases obviously)
Yes, we sniff fastqsanger if the quality values are compatible with sanger encoding. We will also soon have a colorspace sniffer (but that data isn't much used anymore). Everything else will be flat fastq, as the Illumina and Solexa variants are not easy to discriminate.
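To make the "compatible with sanger encoding" check concrete, here is a rough sketch of the idea. It is not Galaxy's actual sniffer code. `guess_fastq_type` is a hypothetical helper, and the threshold relies on the fact that Solexa (offset 64, minimum score -5) and Illumina 1.3+ (offset 64) quality strings never contain characters below ASCII 59 (`;`), so any lower character points to Phred+33:

```python
def guess_fastq_type(quality_lines):
    """Hypothetical helper: guess a datatype from FASTQ quality strings.

    Returns "fastqsanger" only when the quality characters rule out the
    offset-64 encodings; otherwise falls back to the generic "fastq".
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return "fastq"  # nothing to inspect, stay generic

    lowest, highest = min(codes), max(codes)

    # Outside the printable range: not a valid quality string in any variant.
    if lowest < 33 or highest > 126:
        return "fastq"

    # Characters below ';' (ASCII 59) cannot occur in Solexa or Illumina 1.3+
    # encodings, so the data must be Phred+33 (Sanger).
    if lowest < 59:
        return "fastqsanger"

    # Ambiguous: could be Sanger with high scores, Solexa, or Illumina.
    return "fastq"
```

With this sketch, a quality line such as `!''*((((***+))%%%++)` yields `fastqsanger`, while a line consisting only of `h` characters stays `fastq`, which matches the point above that the Illumina and Solexa variants are not easy to tell apart.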
I believe this is due to the addition of the `fastqsanger.gz` filetype. The ENA is "assigning" a filetype of `fastqsanger` (which used to work) and Galaxy is accepting that, and even shows a correct "peek" of the fastq file in the history. However, tools (e.g. FastQC) run against the file will fail, complaining about the format (e.g. line does not start with the `@` character).
This isn't really a Galaxy bug per se, but it is an issue with the Galaxy experience for users.
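One plausible reading of the failure is that the file arrives gzip-compressed while being labelled with the uncompressed `fastqsanger` type. Under that assumption, a minimal sketch of reconciling a declared type with the actual bytes could look like the following. `resolve_declared_type` is a hypothetical function, not part of Galaxy; the only hard fact used is that gzip streams start with the bytes `0x1f 0x8b`:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def resolve_declared_type(declared, head):
    """Hypothetical helper: adjust an externally declared datatype so that
    its ".gz" suffix agrees with the first bytes of the file (`head`)."""
    is_gzipped = head[:2] == GZIP_MAGIC
    if is_gzipped and not declared.endswith(".gz"):
        return declared + ".gz"      # e.g. fastqsanger -> fastqsanger.gz
    if not is_gzipped and declared.endswith(".gz"):
        return declared[: -len(".gz")]  # declared compressed, but is plain text
    return declared


# Example: a compressed payload declared as plain fastqsanger.
payload = gzip.compress(b"@read1\nACGT\n+\nIIII\n")
print(resolve_declared_type("fastqsanger", payload[:16]))
```

This only corrects the compression suffix; whether the content really is Sanger-encoded would still be a question for the sniffers.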