Closed: foellmelanie closed this issue 5 years ago
Thanks for the report @foellmelanie. The datatype for analyze75 files says that the hdr file is not binary:
"""The header file. Provides information about dimensions, identification, and processing history."""
self.add_composite_file(
    'hdr',
    description='The Analyze75 header file.',
    is_binary=False)
However, the test data in the failing tool is binary: https://github.com/galaxyproteomics/tools-galaxyp/blob/f127be2141cf22e269c85282d226eb16fe14a9c1/tools/cardinal/test-data/Analyze75.hdr
I assume the datatype is wrong and this hdr file can be binary? In that case we need to change the datatype. We are now stricter when converting universal newlines and require files to actually be text files when we do this. On top of the datatype fix, we might also want to ignore failed newline conversions.
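To illustrate what "ignoring failed newline conversions" could look like, here is a minimal sketch. It is not Galaxy's actual `sniff.convert_newlines`; the function name `convert_newlines_if_text` and its return convention are hypothetical. It assumes the file is small enough to read into memory and treats any file that does not decode as UTF-8 as binary, leaving it untouched.

```python
import os
import tempfile


def convert_newlines_if_text(path, encoding="utf-8"):
    """Normalize line endings to "\\n" in place.

    Hypothetical helper, not part of Galaxy. Returns True if the file was
    rewritten, or False if it could not be decoded and was left untouched.
    """
    try:
        with open(path, "rb") as handle:
            text = handle.read().decode(encoding)
    except UnicodeDecodeError:
        # Not valid text in the expected encoding: skip instead of failing.
        return False

    normalized = text.replace("\r\n", "\n").replace("\r", "\n")

    # Write to a temporary file in the same directory, then swap it in,
    # so a crash mid-write cannot leave a half-converted file behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as out:
        out.write(normalized.encode(encoding))
    os.replace(tmp_path, path)
    return True
```

With something like this, a binary `hdr` file wrongly declared as text would be staged unchanged rather than aborting the whole upload. The datatype fix would still be needed so that the declaration matches the data.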
@mvdbeek thanks for resolving this so fast!
Unfortunately I have a similar problem with another composite datatype: 'imzml'.
The upload of some files has worked while others gave the following error:
Traceback (most recent call last):
  File "/cvmfs/main.galaxyproject.org/galaxy/tools/data_source/upload.py", line 329, in <module>
    __main__()
  File "/cvmfs/main.galaxyproject.org/galaxy/tools/data_source/upload.py", line 320, in __main__
    metadata.append(add_composite_file(dataset, registry, output_path, files_path))
  File "/cvmfs/main.galaxyproject.org/galaxy/tools/data_source/upload.py", line 243, in add_composite_file
    stage_file(name, composite_file_path, value.is_binary)
  File "/cvmfs/main.galaxyproject.org/galaxy/tools/data_source/upload.py", line 223, in stage_file
    sniff.convert_newlines(dp, tmp_dir=tmpdir, tmp_prefix=tmp_prefix)
  File "/cvmfs/main.galaxyproject.org/galaxy/lib/galaxy/datatypes/sniff.py", line 122, in convert_newlines
    for i, line in enumerate(io.open(fname, mode="U", encoding='utf-8')):
  File "/cvmfs/main.galaxyproject.org/venv/lib/python2.7/codecs.py", line 314, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb5 in position 1358: invalid start byte
Should we create a new issue about this other datatype?
I'd say in general yes, it makes it easier to track which bug was fixed when, and in which commit. I'll have a look.
I'm a little confused by the imzml datatype. If I understand the specs (https://ms-imaging.org/wp/wp-content/uploads/2009/08/specifications_imzML1.1.0_RC1.pdf) correctly, the metadata file should be xml (so not binary, I guess) ... can someone confirm whether this file is supposed to be text or binary?
This is correct, at least as far as I understand. https://github.com/galaxyproteomics/tools-galaxyp/tree/master/tools/cardinal/test-data (imzml=xml + ibd=binary)
It's correct. It's a bit confusing because the composite imzML file consists of an imzML subfile (xml) and an ibd subfile (binary).
So any chance that I could get my hands on a file that fails the upload?
https://github.com/galaxyproteomics/tools-galaxyp/blob/master/tools/cardinal/test-data/Example_Processed.imzML is in "ISO-8859-1" encoding, if you use recode before uploading to Galaxy it should work fine. We probably need some logic to handle non-default encodings, but I don't think it'll happen immediately.
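For anyone without the `recode` tool at hand, the same re-encoding step can be sketched in Python. This is only an illustration of the workaround, not something Galaxy does on upload; the function name `recode_to_utf8` and the file paths in the usage line are hypothetical, and the source encoding is assumed to be known in advance (here ISO-8859-1).

```python
def recode_to_utf8(src_path, dst_path, source_encoding="iso-8859-1"):
    """Rewrite a text file as UTF-8, leaving line endings as they are.

    Hypothetical helper for illustration; the caller has to know the
    source encoding, it is not detected here.
    """
    # newline="" disables newline translation on both read and write,
    # so only the character encoding changes.
    with open(src_path, "r", encoding=source_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)


# Example call with placeholder paths:
# recode_to_utf8("Example_Processed.imzML", "Example_Processed.utf8.imzML")
```

The byte 0xb5 from the traceback is a valid character in ISO-8859-1 (the micro sign) but not a valid UTF-8 start byte, which is why decoding as UTF-8 fails. One caveat: if the XML declaration inside the file names the old encoding, it would need to be updated as well, since this sketch only changes the bytes.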
Thank you @mvdbeek for pointing this out. The weird thing is that this file has worked before in previous Galaxy versions.
The imzml example should work on 19.05, which should be released pretty soon. Since that was a larger change we'll not backport this to 19.01. Many thanks for the report @foellmelanie !
Hi,
I tried to upload Analyze 75 files to usegalaxy.org and I got the following error message:
Planemo with Galaxy 18.09 works with Analyze 75 files, but Planemo with 19.01 does not: https://github.com/galaxyproteomics/tools-galaxyp/pull/350