epigen / LIQUORICE

A tool to detect tissue- and cancer- specific epigenetic signatures in WGS data of liquid biopsies
https://liquorice.readthedocs.io
GNU General Public License v3.0

summary tool error #5

Status: Closed (DucoG closed this issue 1 year ago)

DucoG commented 1 year ago

I get an error when I run the summary tool:

$ python ~/source/LIQUORICE/liquorice/LIQUORICE_summary.py --dirname data/ --control_name_list PGDX25573P_WGS_hg19_mrk PGDX25574P_WGS_hg19_mrk PGDX25575P_WGS_hg19_mrk

Traceback (most recent call last):
  File "/home/d.gaillard/miniconda3/envs/liq/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2895, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Total dip area (AOC combined model)'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/d.gaillard/source/LIQUORICE/liquorice/LIQUORICE_summary.py", line 892, in <module>
    sys.exit(main())
  File "/home/d.gaillard/source/LIQUORICE/liquorice/LIQUORICE_summary.py", line 844, in main
    prediction_interval_alpha=args.prediction_interval_alpha)
  File "/home/d.gaillard/source/LIQUORICE/liquorice/LIQUORICE_summary.py", line 681, in create_summary_table_LIQUORICE
    lambda x: zscore_to_control(x, control_summary_df, "Total dip area (AOC combined model)"), axis=1),
  File "/home/d.gaillard/miniconda3/envs/liq/lib/python3.7/site-packages/pandas/core/frame.py", line 7552, in apply
    return op.get_result()
  File "/home/d.gaillard/miniconda3/envs/liq/lib/python3.7/site-packages/pandas/core/apply.py", line 180, in get_result
    return self.apply_standard()
  File "/home/d.gaillard/miniconda3/envs/liq/lib/python3.7/site-packages/pandas/core/apply.py", line 271, in apply_standard
    results, res_index = self.apply_series_generator()
  File "/home/d.gaillard/miniconda3/envs/liq/lib/python3.7/site-packages/pandas/core/apply.py", line 300, in apply_series_generator
    results[i] = self.f(v)
  File "/home/d.gaillard/source/LIQUORICE/liquorice/LIQUORICE_summary.py", line 681, in <lambda>
    lambda x: zscore_to_control(x, control_summary_df, "Total dip area (AOC combined model)"), axis=1),
  File "/home/d.gaillard/source/LIQUORICE/liquorice/LIQUORICE_summary.py", line 203, in zscore_to_control
    mean_ctrl = control_df[control_df["region-set"] == row["region-set"]].mean()[col]
  File "/home/d.gaillard/miniconda3/envs/liq/lib/python3.7/site-packages/pandas/core/series.py", line 882, in __getitem__
    return self._get_value(key)
  File "/home/d.gaillard/miniconda3/envs/liq/lib/python3.7/site-packages/pandas/core/series.py", line 989, in _get_value
    loc = self.index.get_loc(label)
  File "/home/d.gaillard/miniconda3/envs/liq/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc
    raise KeyError(key) from err
KeyError: 'Total dip area (AOC combined model)'
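
One plausible reading of this KeyError, sketched under assumptions: if the "Total dip area (AOC combined model)" column holds strings rather than floats, a numeric mean over the control rows leaves that column out, and the later lookup by column name fails. The table contents below are made up for illustration, and `numeric_only=True` is passed explicitly so the sketch behaves the same across pandas versions; the call in the traceback uses a bare `.mean()`.

```python
import pandas as pd

COL = "Total dip area (AOC combined model)"

# Hypothetical stand-in for the control summary table: the column of
# interest holds strings instead of floats.
control_df = pd.DataFrame({
    "region-set": ["universal_DHSs", "universal_DHSs"],
    "Total dip depth": [0.019, 0.021],
    COL: ["[ 19.5  28.6 ]", "[ 18.2  27.9 ]"],
})

subset = control_df[control_df["region-set"] == "universal_DHSs"]

# Object-dtype (string) columns are not part of a numeric mean, so the
# resulting Series has no entry for COL.
means = subset.mean(numeric_only=True)
print(list(means.index))

try:
    means[COL]
except KeyError as err:
    print("KeyError:", err)
```

Checking `control_df.dtypes` before computing the z-score shows directly whether the column was read as `object` instead of `float64`.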

Running it without --control_name_list:

$ python ~/source/LIQUORICE/liquorice/LIQUORICE_summary.py --dirname data/

results in:

Traceback (most recent call last):
  File "/home/d.gaillard/source/LIQUORICE/liquorice/LIQUORICE_summary.py", line 892, in <module>
    sys.exit(main())
  File "/home/d.gaillard/source/LIQUORICE/liquorice/LIQUORICE_summary.py", line 844, in main
    prediction_interval_alpha=args.prediction_interval_alpha)
  File "/home/d.gaillard/source/LIQUORICE/liquorice/LIQUORICE_summary.py", line 681, in create_summary_table_LIQUORICE
    lambda x: zscore_to_control(x, control_summary_df, "Total dip area (AOC combined model)"), axis=1),
  File "/home/d.gaillard/miniconda3/envs/liq/lib/python3.7/site-packages/pandas/core/frame.py", line 7552, in apply
    return op.get_result()
  File "/home/d.gaillard/miniconda3/envs/liq/lib/python3.7/site-packages/pandas/core/apply.py", line 180, in get_result
    return self.apply_standard()
  File "/home/d.gaillard/miniconda3/envs/liq/lib/python3.7/site-packages/pandas/core/apply.py", line 271, in apply_standard
    results, res_index = self.apply_series_generator()
  File "/home/d.gaillard/miniconda3/envs/liq/lib/python3.7/site-packages/pandas/core/apply.py", line 300, in apply_series_generator
    results[i] = self.f(v)
  File "/home/d.gaillard/source/LIQUORICE/liquorice/LIQUORICE_summary.py", line 681, in <lambda>
    lambda x: zscore_to_control(x, control_summary_df, "Total dip area (AOC combined model)"), axis=1),
  File "/home/d.gaillard/source/LIQUORICE/liquorice/LIQUORICE_summary.py", line 205, in zscore_to_control
    return round((row[col] - mean_ctrl) / std_ctrl, 2)
numpy.core._exceptions.UFuncTypeError: ufunc 'subtract' did not contain a loop with signature matching types (dtype('<U496'), dtype('float64')) -> None

It seems that row[col] is interpreted as a string. print(row) gives:

G1_amplitude                                                                    -7.19491
G1_sigma                                                                             149
G2_amplitude                                                                    -23.3076
G2_sigma                                                                             758
G3_amplitude                                                                     180.654
G3_sigma                                                                         6078.99
Bayesian Information Criterion                                                  -296.911
Total dip depth                                                                0.0196754
Intercept                                                                    -0.00644268
Total dip area (AOC combined model)    [  19.51176568   28.55679095   40.67897386   5...
sample                                                           PGDX25842P_WGS_hg19_mrk
region-set                                                                universal_DHSs
is control                                                                            no
Name: 0, dtype: object

print(row[col]) and print(type(row[col])) give:

[  19.51176568   28.55679095   40.67897386   56.39989979   76.10872241
   99.96277123  127.78794887  158.99712356  192.5468384   226.95056729
  260.35978384  290.70761345  314.82301263  292.89430369   75.54517386
 -517.3797976  -543.54440912 -573.5379296  -543.54440912 -517.3797976
   75.54517386  292.89430369  314.82301263  290.70761345  260.35978384
  226.95056729  192.5468384   158.99712356  127.78794887   99.96277123
   76.10872241   56.39989979   40.67897386   28.55679095   19.51176568]
<class 'str'>
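
The observation above (the cell is a `str` that looks like a printed NumPy array) can be reproduced and inspected with a short sketch. It is a diagnostic aid only, not a fix from the LIQUORICE authors: `parse_array_string` is a hypothetical helper, the value is a shortened copy of the one printed above, and the thread does not establish how such an array should be reduced to a single dip-area number.

```python
import numpy as np

# Shortened copy of the printed value; the full string holds 35 numbers.
raw = "[  19.51176568   28.55679095   40.67897386 -517.3797976 ]"


def parse_array_string(value):
    """Turn a printed array such as '[ 1.  2. ]' into a float array.

    Hypothetical helper for inspection; non-string input is converted as-is.
    """
    if not isinstance(value, str):
        return np.asarray(value, dtype=float)
    tokens = value.strip().strip("[]").split()  # split() also handles newlines
    return np.array([float(token) for token in tokens])


# Subtracting a float from the raw string fails, as in the second traceback.
try:
    raw - np.float64(1.0)
except TypeError as err:
    print(type(err).__name__, "raised for str minus float")

values = parse_array_string(raw)
print(values.dtype, values.shape)
```

A cell holding a whole array where a scalar is expected suggests the value was computed or serialized differently than the summary script assumes, which would be consistent with a mismatched package environment.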

DucoG commented 1 year ago

No more issues with the correct environment!
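
Since the resolution was an environment change, a small version report can help compare the active environment against the one the project documents. This is a minimal sketch: the thread does not say which package or version made the difference, so the package list below is an assumption, and `report_versions` is a hypothetical helper name.

```python
import importlib
import sys


def report_versions(packages=("pandas", "numpy")):
    """Return {name: version} for the interpreter and the given packages.

    Packages that cannot be imported are reported as 'not installed'.
    """
    versions = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in report_versions().items():
    print(f"{name}: {version}")
```

Running this inside the conda environment used for LIQUORICE and comparing the output with the installation instructions is a quick way to spot a mismatch.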