bigbio / quantms

Quantitative mass spectrometry workflow. Currently supports proteomics experiments with complex experimental designs for DDA-LFQ, DDA-Isobaric and DIA-LFQ quantification.
https://quantms.org
MIT License

diann_convert fails to produce out_msstats when there is no difference in the samples #384

Closed: jpfeuffer closed this issue 1 month ago

jpfeuffer commented 2 months ago

Description of the bug

This happens, for example, when no factor column is specified. The script should either write the file with the same condition for every sample, or fail with a clearer error message.

Command used and terminal output

Traceback (most recent call last):
    File "/$user/.nextflow/pipelines/8a22de65/bigbio/quantms/bin/diann_convert.py", line 1377, in <module>
      cli()
    File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
      return self.main(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1078, in main
      rv = self.invoke(ctx)
           ^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
      return _process_result(sub_ctx.command.invoke(sub_ctx))
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
      return ctx.invoke(self.callback, **ctx.params)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/click/core.py", line 783, in invoke
      return __callback(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
      return f(get_current_context(), *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/$user/.nextflow/pipelines/8a22de65/bigbio/quantms/bin/diann_convert.py", line 114, in convert
      out_msstats = out_msstats.merge(
                    ^^^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/pandas/core/frame.py", line 10490, in merge
      return merge(
             ^^^^^^
    File "/usr/local/lib/python3.11/site-packages/pandas/core/reshape/merge.py", line 169, in merge
      op = _MergeOperation(
           ^^^^^^^^^^^^^^^^
    File "/usr/local/lib/python3.11/site-packages/pandas/core/reshape/merge.py", line 810, in __init__
      self._validate_validate_kwd(validate)
    File "/usr/local/lib/python3.11/site-packages/pandas/core/reshape/merge.py", line 1635, in _validate_validate_kwd
      raise MergeError(
  pandas.errors.MergeError: Merge keys are not unique in right dataset; not a many-to-one merge
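
For context, the final frame is pandas rejecting a merge that was called with `validate="many_to_one"`, because the right-hand table contains repeated merge keys. Below is a minimal sketch of that class of failure, plus one possible guard that fails with a more readable message. The frames, the column names (`Run`, `Condition`, `BioReplicate`) and the helper `merge_design` are hypothetical stand-ins, not taken from `diann_convert.py`.

```python
import pandas as pd
from pandas.errors import MergeError

# Hypothetical stand-ins; the real tables in diann_convert.py may look different.
report = pd.DataFrame(
    {"Run": ["run1", "run1", "run2"], "Intensity": [1.0, 2.0, 3.0]}
)
design = pd.DataFrame(
    {
        "Run": ["run1", "run1", "run2"],
        "Condition": ["A", "A", "A"],
        "BioReplicate": [1, 2, 1],
    }
)


def merge_design(report: pd.DataFrame, design: pd.DataFrame, key: str = "Run") -> pd.DataFrame:
    """Attach design columns to the report, failing early with a readable message."""
    # Exact duplicate rows are harmless, so drop them before checking the keys.
    design = design.drop_duplicates()
    repeated = design.loc[design[key].duplicated(keep=False), key].unique().tolist()
    if repeated:
        raise ValueError(
            f"Design table has more than one distinct row for {key} value(s) {repeated}; "
            "each run must map to exactly one design row."
        )
    return report.merge(design, on=key, validate="many_to_one")


# The unguarded merge raises the MergeError seen in the traceback.
try:
    report.merge(design, on="Run", validate="many_to_one")
except MergeError as err:
    print(f"pandas: {err}")

# The guarded version names the offending runs instead.
try:
    merge_design(report, design)
except ValueError as err:
    print(f"guard: {err}")
```

Whether the right behaviour is to raise like this or to fall back to a single shared condition is a design decision for the pipeline; the sketch only shows the "fail with a nicer error" option.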

Relevant files

No response

System information

No response

ypriverol commented 1 month ago

@jpfeuffer can you share the SDRF? I want to improve how the sdrf-pipelines tool validates the SDRF.

jpfeuffer commented 1 month ago

No, unfortunately I cannot. Maybe just remove the factor value column from our test_dia dataset and try to run the pipeline with msstats.

https://raw.githubusercontent.com/nf-core/test-datasets/quantms/testdata/dia_ci/PXD026600.sdrf.tsv
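
A rough sketch of how such a test input could be prepared: drop every column whose header starts with `factor value` from the SDRF table. The helper name `drop_factor_values` is hypothetical, and the in-memory table is a made-up two-row stand-in rather than the real PXD026600 SDRF.

```python
import io

import pandas as pd


def drop_factor_values(sdrf: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of the SDRF table without any 'factor value[...]' columns."""
    keep = [
        col
        for col in sdrf.columns
        if not col.strip().lower().startswith("factor value")
    ]
    return sdrf[keep].copy()


# Made-up miniature SDRF, tab-separated like the real files.
example = io.StringIO(
    "source name\tcomment[data file]\tfactor value[organism part]\n"
    "sample 1\trun1.raw\tliver\n"
    "sample 2\trun2.raw\tkidney\n"
)
sdrf = pd.read_csv(example, sep="\t")
stripped = drop_factor_values(sdrf)
print(list(stripped.columns))

# To write the result back out (hypothetical file name):
# stripped.to_csv("no_factor_values.sdrf.tsv", sep="\t", index=False)
```

Note that `pd.read_csv` renames repeated headers by default, so a real SDRF with duplicate column names may need extra care before it is written back.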

ypriverol commented 1 month ago

This should be fixed in the latest PR with sdrf-pipelines 0.0.28.

jpfeuffer commented 1 month ago

What is the solution now? Always fail if the factor values are not there? Or only fail if factor values are missing and msstats is activated?