statisticalbiotechnology / triqler

Source and example code for triqler (TRansparent Identification-Quantification-linked Error Rates)
Apache License 2.0

Expected Run name in DIANN converter #20

Closed by buijt 1 year ago

buijt commented 1 year ago

Hello. I hope this finds you well.

I am trying to import the search result file "DIANN_report.tsv" into triqler, but I am encountering this error:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\buij4\AppData\Local\miniconda3\envs\proteinquant\Lib\site-packages\triqler\convert\diann.py", line 69, in <module>
    main()
  File "C:\Users\buij4\AppData\Local\miniconda3\envs\proteinquant\Lib\site-packages\triqler\convert\diann.py", line 26, in main
    diann_to_triqler(args.in_file, args.out_file, params)
  File "C:\Users\buij4\AppData\Local\miniconda3\envs\proteinquant\Lib\site-packages\triqler\convert\diann.py", line 57, in diann_to_triqler
    df["condition"] = df["Run"].map(condition_mapper)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\buij4\AppData\Local\miniconda3\envs\proteinquant\Lib\site-packages\pandas\core\series.py", line 4398, in map
    new_values = self._map_values(arg, na_action=na_action)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\buij4\AppData\Local\miniconda3\envs\proteinquant\Lib\site-packages\pandas\core\base.py", line 924, in _map_values
    new_values = map_f(values, mapper)
                 ^^^^^^^^^^^^^^^^^^^^^
  File "pandas\_libs\lib.pyx", line 2834, in pandas._libs.lib.map_infer
  File "C:\Users\buij4\AppData\Local\miniconda3\envs\proteinquant\Lib\site-packages\triqler\convert\diann.py", line 54, in <lambda>
    condition_mapper = lambda x : x.split("_")[8]
                                  ~~~~~~~~~~~~^^^

Looking at the source code, I see that condition_mapper splits the Run name on underscores and takes the token at index 8, so it assumes the Run name contains at least 9 underscore-separated tokens. Currently, the Run name in my DIANN_report.tsv file is simply the raw data file name, which does not contain that many tokens.
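
To illustrate why the lookup fails, here is a minimal sketch that reuses the lambda from the traceback. The two run names are hypothetical and chosen only for illustration. In Python, indexing a list past its end raises an IndexError, which is what happens when the split yields fewer than 9 tokens.

```python
# Hypothetical run names, for illustration only.
short_run = "sample01"                    # 1 token after splitting on "_"
long_run = "a_b_c_d_e_f_g_h_condA_rep1"   # 10 tokens after splitting on "_"

# Same expression as in the traceback: take the token at index 8.
condition_mapper = lambda x: x.split("_")[8]

# Works only when the run name has at least 9 underscore-separated tokens.
print(condition_mapper(long_run))

# A plain raw-file name has too few tokens, so the index lookup fails.
try:
    condition_mapper(short_run)
except IndexError as exc:
    print(f"IndexError: {exc}")
```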

Could you advise me on what the converter expects the Run name to be in order to use triqler's converter for DIANN output files? I looked over the user manual and in the examples, the Run name can be set arbitrarily, so I believe I must be overlooking something very simple.

Thank you.

MatthewThe commented 1 year ago

Sorry, this should indeed not have been hard-coded.

I will change it to the functionality we use in the other converters where users have to provide a mapping file from run names to conditions and runs.
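
A minimal sketch of the mapping-file idea, not the converter's actual implementation. The two-column tab-separated layout (run name, condition), the example file contents, and the helper names `read_run_to_condition` and `assign_conditions` are all assumptions made for illustration; triqler's real file format may differ.

```python
import csv
import io

# Hypothetical mapping file contents: run name <TAB> condition.
# This column layout is an assumption, not triqler's documented format.
MAPPING_TSV = "run01.raw\tcontrol\nrun02.raw\ttreated\n"


def read_run_to_condition(handle):
    """Return a dict mapping each run name to its condition."""
    mapping = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        run, condition = row[0], row[1]
        mapping[run] = condition
    return mapping


def assign_conditions(runs, mapping):
    """Look up the condition for each run, failing loudly on unknown runs."""
    missing = sorted(set(runs) - set(mapping))
    if missing:
        raise KeyError(f"Runs missing from mapping file: {missing}")
    return [mapping[run] for run in runs]


run_to_condition = read_run_to_condition(io.StringIO(MAPPING_TSV))
print(assign_conditions(["run01.raw", "run02.raw", "run01.raw"], run_to_condition))
```

With an explicit lookup table, the run names can be arbitrary, and a run that is absent from the mapping produces a clear error instead of a failed index lookup. The resulting dict could also be passed to pandas' `Series.map`, which accepts a dict and yields NaN for keys it does not contain.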