galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

Problem building metadata with readMSData tool in a workflow on a collection #16261

Closed abretaud closed 3 weeks ago

abretaud commented 1 year ago

Describe the bug

I'm getting this error while running the MSnbase readMSData tool, when run from a workflow on a list collection:

galaxy.jobs.runners ERROR 2023-06-16 10:15:36,794 [pN:handler_1,p:187328,tN:SlurmRunner.work_thread-2] (3582) Failure preparing job
Traceback (most recent call last):
  File "/galaxy/server/lib/galaxy/jobs/runners/__init__.py", line 268, in prepare_job
    stream_stdout_stderr=stream_stdout_stderr,
  File "/galaxy/server/lib/galaxy/jobs/runners/__init__.py", line 309, in build_command_line
    stream_stdout_stderr=stream_stdout_stderr,
  File "/galaxy/server/lib/galaxy/jobs/command_factory.py", line 156, in build_command
    __handle_metadata(commands_builder, job_wrapper, runner, remote_command_params)
  File "/galaxy/server/lib/galaxy/jobs/command_factory.py", line 281, in __handle_metadata
    kwds={"overwrite": False},
  File "/galaxy/server/lib/galaxy/jobs/__init__.py", line 2214, in setup_external_metadata
    **kwds,
  File "/galaxy/server/lib/galaxy/metadata/__init__.py", line 226, in setup_external_metadata
    "model_class": dataset_collection.__class__.__name__,
  File "/galaxy/server/lib/galaxy/model/store/__init__.py", line 2378, in __exit__
    self._finalize()
  File "/galaxy/server/lib/galaxy/model/store/__init__.py", line 2263, in _finalize
    collections_attrs_out.write(to_json(self.collections_attrs))
  File "/galaxy/server/lib/galaxy/model/store/__init__.py", line 2244, in to_json
    return json_encoder.encode([a.serialize(self.security, self.serialization_options) for a in attributes])
  File "/galaxy/server/lib/galaxy/model/store/__init__.py", line 2244, in <listcomp>
    return json_encoder.encode([a.serialize(self.security, self.serialization_options) for a in attributes])
  File "/galaxy/server/lib/galaxy/model/__init__.py", line 362, in serialize
    return self._serialize(id_encoder, serialization_options)
  File "/galaxy/server/lib/galaxy/model/__init__.py", line 6239, in _serialize
    collection=self.collection.serialize(id_encoder, serialization_options),
  File "/galaxy/server/lib/galaxy/model/__init__.py", line 362, in serialize
    return self._serialize(id_encoder, serialization_options)
  File "/galaxy/server/lib/galaxy/model/__init__.py", line 6006, in _serialize
    elements=list(map(lambda e: e.serialize(id_encoder, serialization_options), self.elements)),
  File "/galaxy/server/lib/galaxy/model/__init__.py", line 6006, in <lambda>
    elements=list(map(lambda e: e.serialize(id_encoder, serialization_options), self.elements)),
  File "/galaxy/server/lib/galaxy/model/__init__.py", line 362, in serialize
    return self._serialize(id_encoder, serialization_options)
  File "/galaxy/server/lib/galaxy/model/__init__.py", line 6624, in _serialize
    rval["child_collection"] = element_obj.serialize(id_encoder, serialization_options)
AttributeError: 'NoneType' object has no attribute 'serialize'

No error occurs when running the tool outside of a workflow, or without a list collection.

Galaxy Version and/or server at which you observed the bug

Galaxy Version: 23.0
Commit: 6cde0781

To Reproduce

Steps to reproduce the behavior:

  1. Go to usegalaxy.fr
  2. Use this dead simple workflow: https://usegalaxy.fr/u/abretaud/w/msnmachin
  3. Invoke it on a list collection with a single mzxml element (this file for example)
  4. See error

Additional context

It looks similar to https://github.com/galaxyproject/galaxy/issues/11510 but is probably slightly different, as that one has been fixed.
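
The last frames of the traceback suggest that a collection element was serialized while the object it points to was still `None`. A minimal sketch of that failure mode follows. It uses hypothetical stand-in classes (`Collection`, `Element`) and a hypothetical helper (`try_serialize`), not Galaxy's real model classes, and only illustrates where the `AttributeError` comes from:

```python
class Collection:
    """Hypothetical stand-in for a nested collection (not Galaxy's model class)."""

    def __init__(self, name):
        self.name = name

    def serialize(self):
        return {"name": self.name}


class Element:
    """Hypothetical stand-in for a collection element pointing at a child object."""

    def __init__(self, identifier, child=None):
        self.identifier = identifier
        self.child = child  # may still be None if the element is not populated yet

    def serialize(self):
        # Mirrors the failing pattern: calls .serialize() without checking for None.
        return {
            "identifier": self.identifier,
            "child_collection": self.child.serialize(),
        }


def try_serialize(element):
    """Return the serialized element, or the error message if serialization fails."""
    try:
        return element.serialize()
    except AttributeError as exc:
        return str(exc)


populated = Element("sample1", Collection("inner"))
unpopulated = Element("sample2")  # child was never set

print(try_serialize(populated))
print(try_serialize(unpopulated))
```

The second call reports the same message as the last line of the traceback, which is consistent with an element whose target object had not been set when serialization ran.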

abretaud commented 1 year ago

Any clue on that? The tool seems to be quite widely used, we have several users complaining about it, and it doesn't look easy to debug for us :/

By the way, the problem did not happen in 22.05.

abretaud commented 1 year ago

Ok, it comes from this line in the tool: https://github.com/workflow4metabolomics/tools-metabolomics/blob/master/tools/msnbase_readmsdata/msnbase_readmsdata.xml#L38

which triggers this error:

Tool toolshed.g2.bx.psu.edu/repos/lecorguille/msnbase_readmsdata/msnbase_readmsdata/2.16.1+galaxy0 output sampleMetadata: dataset output filter (input.extension not in ["mzxml","mzml","mzdata","netcdf"]) failed: 'DatasetCollectionElement' object has no attribute 'extension'

Tried replacing with .is_of_type or .ext, but they're not available on a DatasetCollectionElement

abretaud commented 1 year ago

Found a workaround that seems to work, changing the tool filter to this:

<filter>(not input.hasattr(input, "dataset") and input.is_of_type("mzxml","mzml","mzdata","netcdf")) or (input.hasattr(input, "dataset") and input.dataset.is_of_type("mzxml","mzml","mzdata","netcdf"))</filter>

However I wonder if it's intended that we receive a DatasetCollectionElement object?
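
The filter error above comes down to the filter expression receiving an object that wraps a dataset instead of the dataset itself. The sketch below uses hypothetical stand-in classes (`FakeDataset`, `FakeElement`) and a hypothetical helper (`resolve_extension`). It does not use Galaxy's API and is not a proposed fix for the tool; it only illustrates why `.extension` fails on a wrapper and how Python's builtin `hasattr` can tell the two cases apart:

```python
ACCEPTED = ("mzxml", "mzml", "mzdata", "netcdf")


class FakeDataset:
    """Hypothetical dataset-like object exposing an extension."""

    def __init__(self, extension):
        self.extension = extension


class FakeElement:
    """Hypothetical collection-element-like object wrapping a dataset."""

    def __init__(self, dataset):
        self.dataset = dataset


def resolve_extension(obj):
    """Return the extension of a dataset, unwrapping an element if needed."""
    if hasattr(obj, "extension"):
        return obj.extension
    if hasattr(obj, "dataset"):
        return obj.dataset.extension
    raise TypeError(f"cannot determine extension of {type(obj).__name__}")


direct = FakeDataset("mzxml")
wrapped = FakeElement(FakeDataset("mzml"))

# Accessing .extension directly on the wrapper fails, as in the logged error.
try:
    wrapped.extension
except AttributeError as exc:
    error = str(exc)

print(error)
print(resolve_extension(direct) in ACCEPTED)
print(resolve_extension(wrapped) in ACCEPTED)
```

Note that `hasattr` here is the Python builtin, called as `hasattr(obj, name)`. Written as a method call on the object, as in `input.hasattr(...)`, it only works if the object itself defines an attribute of that name.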

yguitton commented 1 year ago

Hi @abretaud

Thanks for the investigation. readMSData now works in workflows, but producing the sampleMetadata output as a data collection with one entry per input file makes no sense. A single, full sampleMetadata.tsv file would be a better output. We have to change the wrapper. @lecorguille @Lain-inrae what do you think?

Yann

abretaud commented 1 year ago

I think the original problem is fixed

mvdbeek commented 12 months ago

> Found a workaround that seems to work, changing the tool filter to this:

Sorry, just noticed this in https://github.com/workflow4metabolomics/tools-metabolomics/pull/251. That sure doesn't look right.

mvdbeek commented 12 months ago

https://github.com/workflow4metabolomics/tools-metabolomics/blob/master/tools/msnbase_readmsdata/msnbase_readmsdata.xml#L38 doesn't work anymore. Could you let me know where I can find this file?

mvdbeek commented 12 months ago

(ideally the version that is broken, not the one with the workaround)

abretaud commented 12 months ago

I think this link is the last version before the filter was removed: https://github.com/workflow4metabolomics/tools-metabolomics/blob/0caf5dc749efded521253db59a5446d939b8331a/tools/xcms/msnbase_readmsdata.xml#L38

hechth commented 3 weeks ago

@mvdbeek @abretaud @yguitton can this be closed?

mvdbeek commented 3 weeks ago

Probably the collection wasn't finalized yet? I think we did some work on that in the meantime. I'm not sure this is really resolved, but I guess we'll get a new issue if it reappears.