qbic-pipelines / querynator

MIT License

create-report: wildtype in cgi results triggers index error #20

Closed HomoPolyethylen closed 3 months ago

HomoPolyethylen commented 4 months ago

Description of the bug

The input was the result of a variantMTB call on the variantmtb test dataset.

querynator create-report now fails with an IndexError. The same data did not cause a problem before.

The error is raised in combine_cgi::get_all_alterations:

alteration_links = [j.split(")")[0] for j in [i.split("(")[1] for i in row["Alterations"].split(", ")]]

Expected row["Alterations"]: EGFR (P546S), EGFR (G598V), EGFR (E690K), EGFR (S768I)

Unexpected: PDGFRA wildtype. This raises an IndexError because i.split("(") returns a one-element list when the entry contains no parenthesis, so index 1 does not exist.
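To make the failure mode concrete, here is a minimal sketch. `original_parse` wraps the expression quoted above unchanged; `tolerant_parse` is a hypothetical alternative (not taken from the querynator code base) that skips entries without a parenthesised alteration instead of raising. Whether skipping is the right behaviour for the report is an open question; a wildtype entry may deserve its own handling.

```python
import re

# Captures the text inside the first pair of parentheses,
# e.g. "P546S" in "EGFR (P546S)".
_PAREN = re.compile(r"\(([^)]*)\)")


def original_parse(alterations: str) -> list[str]:
    """The list comprehension from the issue, wrapped in a function."""
    return [j.split(")")[0] for j in [i.split("(")[1] for i in alterations.split(", ")]]


def tolerant_parse(alterations: str) -> list[str]:
    """Hypothetical variant: entries without "(...)" are skipped, not fatal."""
    found = []
    for entry in alterations.split(", "):
        match = _PAREN.search(entry)
        if match:  # "PDGFRA wildtype" has no parentheses, so no match
            found.append(match.group(1))
    return found


if __name__ == "__main__":
    ok = "EGFR (P546S), EGFR (G598V), EGFR (E690K), EGFR (S768I)"
    print(original_parse(ok))
    print(tolerant_parse("PDGFRA wildtype"))
    try:
        original_parse("PDGFRA wildtype")
    except IndexError as err:
        print("IndexError:", err)
```

With the well-formed EGFR string both functions return the same four alterations; only the wildtype entry makes them diverge.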

cgi_results/biomarkers.tsv: lines 476-479

```
input01 EGFR (P546S), EGFR (G598V), EGFR (E690K), EGFR (S768I) EGFR (T790M) Afatinib (ERBB2 inhibitor&EGFR inhibitor 2nd gen) Lung Resistant A YES NCCN only alteration type L
input01 PTEN (R233Q) PTEN biallelic inactivation Panitumumab (EGFR mAb inhibitor) Colorectal adenocarcinoma Resistant C YES Caris molecular intelligence only alteration type COREAD
input01 FLT3 (D835E) FLT3 (D835,Y842) Ponatinib (BCR-ABL inhibitor 3rd gen&Pan-TK inhibitor) Acute myeloid leukemia Resistant D YES PMID:23430109 complete AML
input01 PDGFRA wildtype PDGFRA wildtype Dasatinib (BCR-ABL inhibitor 2nd gen) Gastrointestinal stromal Responsive D YES PMID:16397263 complete GIST
```
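To check whether other rows in a biomarkers file have the same shape, something like the following could be used. It is only a sketch: it assumes a tab-separated file with a header row that contains an `Alterations` column (the column name comes from the code above; the `Sample` header and the in-memory sample data are made up for illustration), and it uses only the standard library.

```python
import csv
import io


def rows_without_parentheses(tsv_text: str, column: str = "Alterations"):
    """Return (line_number, entry) for every entry in `column` lacking "(".

    Line numbers are 1-based and count the header row as line 1.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    hits = []
    for row in reader:
        for entry in row[column].split(", "):
            if "(" not in entry:
                hits.append((reader.line_num, entry))
    return hits


if __name__ == "__main__":
    # Hypothetical two-column excerpt; a real biomarkers.tsv has more columns.
    sample = (
        "Sample\tAlterations\n"
        "input01\tFLT3 (D835E)\n"
        "input01\tPDGFRA wildtype\n"
    )
    for line_no, entry in rows_without_parentheses(sample):
        print(f"line {line_no}: {entry!r}")
```

For a file on disk, the text can be read with `pathlib.Path(...).read_text()` and passed to the same function.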

Why did this bug occur now?

Command used and terminal output

(querynator) [zxmgc83@thanos 31cbe1cd66c3929ced7fadb4e571e7]$ querynator create-report \
    --cgi_path test_sample_cgi \
    --civic_path test_sample_civic \
    --outdir test_sample_report

                                           __ 
  ____ ___  _____  _______  ______  ____ _/ /_____  _____
 / __ `/ / / / _ \/ ___/ / / / __ \/ __ `/ __/ __ \/ ___/
/ /_/ / /_/ /  __/ /  / /_/ / / / / /_/ / /_/ /_/ / /
\__, /\__,_/\___/_/   \__, /_/ /_/\__,_/\__/\____/_/
  /_/                /____/

2024-02-29 14:13:33,434 - Querynator - INFO - Start
2024-02-29 14:13:33,435 - Querynator - INFO - Combining CIViC & VEP
2024-02-29 14:13:33,453 - Querynator - INFO - Combining CGI & VEP
Traceback (most recent call last):
  File "/home-link/zxmgc83/miniconda3/envs/querynator/bin/querynator", line 10, in <module>
    sys.exit(run_querynator())
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/querynator/__main__.py", line 277, in run_querynator
    querynator_cli()
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/querynator/__main__.py", line 485, in create_report
    combine_cgi(cgi_path, report_dir, logger)
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/querynator/report_scripts/combine_cgi.py", line 338, in combine_cgi
    biomarkers_df = link_biomarkers(biomarkers_df)
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/querynator/report_scripts/combine_cgi.py", line 276, in link_biomarkers
    biomarkers_df["alterations_link"] = biomarkers_df.apply(lambda x: get_all_alterations(x), axis=1)
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/pandas/core/frame.py", line 9555, in apply
    return op.apply().__finalize__(self, method="apply")
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/pandas/core/apply.py", line 746, in apply
    return self.apply_standard()
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/pandas/core/apply.py", line 873, in apply_standard
    results, res_index = self.apply_series_generator()
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/pandas/core/apply.py", line 889, in apply_series_generator
    results[i] = self.f(v)
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/querynator/report_scripts/combine_cgi.py", line 276, in <lambda>
    biomarkers_df["alterations_link"] = biomarkers_df.apply(lambda x: get_all_alterations(x), axis=1)
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/querynator/report_scripts/combine_cgi.py", line 261, in get_all_alterations
    alteration_links = [j.split(")")[0] for j in [i.split("(")[1] for i in row["Alterations"].split(", ")]]
  File "/home-link/zxmgc83/miniconda3/envs/querynator/lib/python3.10/site-packages/querynator/report_scripts/combine_cgi.py", line 261, in <listcomp>
    alteration_links = [j.split(")")[0] for j in [i.split("(")[1] for i in row["Alterations"].split(", ")]]
IndexError: list index out of range

System information

python=3.10

hardware: replicated on local machine as well as on server

local uname -a

```
Linux rechenmaschine02 5.10.0-1057-oem #61-Ubuntu SMP Thu Jan 13 15:06:11 UTC 2022 x86_64 GNU/Linux
```

server uname -a

```
Linux thanos.am10.uni-tubingen.de 5.14.0-362.18.1.el9_3.0.1.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Feb 11 13:49:23 UTC 2024 x86_64 GNU/Linux
```

HomoPolyethylen commented 3 months ago

open todos: