epigen / enrichment_analysis

A Snakemake workflow for performing genomic region set and gene set enrichment analyses using LOLA, GREAT, GSEApy, pycisTarget and RcisTarget.
https://epigen.github.io/enrichment_analysis/
MIT License

raise KeyError(key) from err KeyError: 'qValue' Error in rule aggregate: #7

Closed sandragold closed 9 months ago

sandragold commented 1 year ago

Hi!

It's me again :smile: Sorry to bother you, but I'm testing the latest version of your pipeline on many different input files, and for some of them I get a new error message:

python -c "import sys; print('.'.join(map(str, sys.version_info[:2])))"
Activating conda environment: .snakemake/conda/db2069d06c67e34fe0d5f2324aefa1c9_
python /mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/.snakemake/scripts/tmpn224jb2o.aggregate.py
Activating conda environment: .snakemake/conda/db2069d06c67e34fe0d5f2324aefa1c9_
Traceback (most recent call last):
  File "/mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/.snakemake/conda/db2069d06c67e34fe0d5f2324aefa1c9_/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 2895, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'qValue'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/.snakemake/scripts/tmpn224jb2o.aggregate.py", line 55, in <module>
    sig_terms = result_df.loc[result_df[adjp_col]<adjp_th, term_col].unique()
  File "/mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/.snakemake/conda/db2069d06c67e34fe0d5f2324aefa1c9_/lib/python3.9/site-packages/pandas/core/frame.py", line 2906, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/.snakemake/conda/db2069d06c67e34fe0d5f2324aefa1c9_/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc
    raise KeyError(key) from err
KeyError: 'qValue'
[Mon Jun  5 22:36:57 2023]
Error in rule aggregate:
    jobid: 73
    output: /mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/results/enrichment_analysis/mysterySets/LOLA/LOLACore/mysterySets_LOLACore_all.csv, /mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/results/enrichment_analysis/mysterySets/LOLA/LOLACore/mysterySets_LOLACore_sig.csv
    log: logs/rules/aggregate_mysterySets_LOLA_LOLACore.log (check log file(s) for error message)
    conda-env: /mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/.snakemake/conda/db2069d06c67e34fe0d5f2324aefa1c9_

RuleException:
CalledProcessErrorin line 19 of /mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/workflow/rules/aggregate.smk:
Command 'source /mnt/polkanowa2/programs/bin/activate '/mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/.snakemake/conda/db2069d06c67e34fe0d5f2324aefa1c9_'; set -eo pipefail; python /mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/.snakemake/scripts/tmpn224jb2o.aggregate.py' returned non-zero exit status 1.
  File "/mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/workflow/rules/aggregate.smk", line 19, in __rule_aggregate
  File "/mnt/polkanowa2/programs/lib/python3.9/concurrent/futures/thread.py", line 58, in run
Removing output files of failed job aggregate since they might be corrupted:
/mnt/polkanowa2/Cytometh_Bartosz/enrichment_analysis/enrichment_analysis/results/enrichment_analysis/mysterySets/LOLA/LOLACore/mysterySets_LOLACore_all.csv
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2023-06-05T222043.715874.snakemake.log

If you need anything else for your testing purposes, please let me know.

Thanks for your help as always! :) Sandra

sreichl commented 1 year ago

Hi, my best guess is that the individual results are empty, which leaves the aggregation script with an empty dataframe that has no 'qValue' column. Please check whether there are any individual results to aggregate in this "group".

Cheers, Stephan
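
For illustration, here is a minimal sketch of how the failing line from the traceback could be guarded against an empty dataframe. This is not the workflow's actual fix. The names `adjp_col`, `adjp_th`, `term_col` and `result_df` come from the traceback, while the helper name `significant_terms`, the `"description"` column and the 0.05 threshold are assumptions made only for this example.

```python
import pandas as pd


def significant_terms(result_df, adjp_col, adjp_th, term_col):
    """Return the unique terms whose adjusted p-value is below adjp_th.

    Hypothetical helper: returns an empty list instead of raising a
    KeyError when the dataframe is empty or lacks the needed columns.
    """
    if (
        result_df.empty
        or adjp_col not in result_df.columns
        or term_col not in result_df.columns
    ):
        return []
    mask = result_df[adjp_col] < adjp_th
    return result_df.loc[mask, term_col].unique().tolist()


# An empty dataframe has no 'qValue' column, so indexing it directly
# (as in the traceback) raises KeyError; the guard avoids that.
print(significant_terms(pd.DataFrame(), "qValue", 0.05, "description"))

# Toy data (made up for this example) with the expected columns.
toy = pd.DataFrame(
    {"description": ["a", "b", "a"], "qValue": [0.01, 0.20, 0.03]}
)
print(significant_terms(toy, "qValue", 0.05, "description"))
```

Whether returning an empty list is the right behaviour depends on what the rest of the aggregation script expects, so treat this only as a way to reproduce and sidestep the KeyError.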

sandragold commented 1 year ago

Hi,

Do you know why the table below looks like this, i.e., why no descriptions or p-values were written?

[image: results table]

Best regards, Sandra

sreichl commented 1 year ago

Hi Sandra, I need a little more context. Am I looking at results from LOLA?

sandragold commented 1 year ago

Yes, this table is from LOLACore. But LOLAJaspar also didn't work as expected here.

sreichl commented 1 year ago

Can you provide more of the result file? Did GREAT return results?

sreichl commented 1 year ago

One more question: does the test/example data run correctly, i.e., does it return LOLA results? If so, the problem probably has something to do with your input data.
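
To narrow this down, a small script along the following lines could report which individual result files are missing, empty, or lack the adjusted p-value column. It is only a sketch: the function name `check_result_file` is hypothetical, and the assumption that each result is a CSV file with a 'qValue' column is inferred from the traceback above, not from the workflow's documentation.

```python
from pathlib import Path

import pandas as pd


def check_result_file(path, required_col="qValue"):
    """Classify one result CSV (hypothetical diagnostic helper).

    Returns one of: "missing", "empty file", "no rows",
    "no '<required_col>' column", or "ok".
    """
    path = Path(path)
    if not path.exists():
        return "missing"
    if path.stat().st_size == 0:
        return "empty file"
    try:
        df = pd.read_csv(path)
    except pd.errors.EmptyDataError:
        # Raised by pandas when the file has no parsable columns.
        return "empty file"
    if df.empty:
        return "no rows"
    if required_col not in df.columns:
        return f"no '{required_col}' column"
    return "ok"
```

Calling `check_result_file` on each per-query result file of the failing "group" and on the corresponding test/example results would show whether the empty results are specific to your input data.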