KarchinLab / 2020plus

Classifies genes as an oncogene, tumor suppressor gene, or as a non-driver gene by using Random Forests
Apache License 2.0

Error in job simMaf: CalledProcessError #12

Open orochimarupap opened 5 years ago

orochimarupap commented 5 years ago

We've gotten through the quick start and have trained our classifier. Now, when we try to run 2020plus, the following error is thrown:

Command: /anaconda3/envs/2020plus/bin/mut_annotate --log-level=INFO -b data//snvboxGenes.bed -i data//snvboxGenes.fa -c 1.5 -m data/bladder.txt -p 0 -n 1 --maf --seed=378 -r 3 --unique -o output_bladder/simulated_summary/chasm_sim_maf9.txt
There were 832 indels identified.
Kept 33771 mutations after droping mutations with missing information (Droped: 0)
Dropped 832 mutations after only keeping Missense_Mutation, Silent, Nonsense_Mutation, Splice_Site, Nonstop_Mutation, Translation_Start_Site. Indels are processed separately.
Dropped 182 mutations after only keeping valid SNVs
Dropped 0 mutations when removing duplicates
Working on chromosome: chr1 . . .
'N'
Traceback (most recent call last):
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/utils.py", line 131, in wrapper
    result = f(*args, **kwds)
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/annotate.py", line 209, in singleprocess_permutation
    drop_silent=opts['drop_silent'])
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/permutation.py", line 732, in maf_permutation
    num_permutations)
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/sequence_context.py", line 226, in random_pos
    pos_array = self.random_context_pos(n, num_permutations, contxt)
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/sequence_context.py", line 204, in random_context_pos
    random_pos = self.prng_dict[context].choice(available_pos, (num_permutations, num))
KeyError: 'N'
Traceback (most recent call last):
  File "/anaconda3/envs/2020plus/bin/mut_annotate", line 10, in <module>
    sys.exit(cli_main())
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/annotate.py", line 432, in cli_main
    main(opts)
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/annotate.py", line 417, in main
    multiprocess_permutation(bed_dict, mut_df, opts, indel_df)
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/annotate.py", line 136, in multiprocess_permutation
    chrom_results = singleprocess_permutation(info)
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/utils.py", line 131, in wrapper
    result = f(*args, **kwds)
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/annotate.py", line 209, in singleprocess_permutation
    drop_silent=opts['drop_silent'])
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/permutation.py", line 732, in maf_permutation
    num_permutations)
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/sequence_context.py", line 226, in random_pos
    pos_array = self.random_context_pos(n, num_permutations, contxt)
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/sequence_context.py", line 204, in random_context_pos
    random_pos = self.prng_dict[context].choice(available_pos, (num_permutations, num))
KeyError: 'N'
Error in job simMaf while creating output file output_bladder/simulated_summary/chasm_sim_maf9.txt.
RuleException:
CalledProcessError in line 135 of /Users/josephnovak/Desktop/2020plus-master/Snakefile:
Command 'mut_annotate --log-level=INFO -b data//snvboxGenes.bed -i data//snvboxGenes.fa -c 1.5 -m data/bladder.txt -p 0 -n 1 --maf --seed=$((942)) -r 3 --unique -o output_bladder/simulated_summary/chasm_sim_maf9.txt' returned non-zero exit status 1.
  File "/Users/josephnovak/Desktop/2020plus-master/Snakefile", line 135, in __rule_simMaf
  File "/anaconda3/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Removing output files of failed job simMaf since they might be corrupted:
output_bladder/simulated_summary/chasm_sim_maf9.txt
Will exit after finishing currently running jobs.

ctokheim commented 5 years ago

Can you try the latest version of probabilistic2020 (v1.2.3)? I just uploaded it, and you should be able to install it via pip. I think it should fix the issue.

orochimarupap commented 5 years ago

After uninstalling and reinstalling probabilistic2020, I got an error while building the wheel for probabilistic2020.

ctokheim commented 5 years ago

Can you paste the error message? I do not get any installation problems on a clean Python installation.

orochimarupap commented 4 years ago

Okay, after going back and installing a fresh copy of 2020plus with all its requirements, we tried to run the prediction again. An error is still thrown, and it seems similar to the one before:

Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
    1       features
    1       finishSim
    1       og
    1       predict_test
    1       pretrained_predict
    10      simFeatures
    10      simMaf
    10      simOg
    10      simSummary
    10      simTsg
    1       summary
    1       tsg
    57

rule simMaf:
    input: data_/test.hg19_2.txt
    output: output_test/simulated_summary/chasm_sim_maf7.txt
    jobid: 53
    wildcards: iter=7

mut_annotate --log-level=INFO -b data//snvboxGenes.bed -i data//snvboxGenes.fa -c 1.5 -m data/test.hg19_2.txt -p 0 -n 1 --maf --seed=$((742)) -r 3 --unique -o output_test/simulated_summary/chasm_sim_maf7.txt
Traceback (most recent call last):
  File "/anaconda3/envs/2020plus/bin/mut_annotate", line 6, in <module>
    from prob2020.console.annotate import cli_main
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/console/annotate.py", line 12, in <module>
    import prob2020.cython.cutils as cutils
  File "prob2020/cython/cutils.pyx", line 8, in init prob2020.cython.cutils
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/scores.py", line 4, in <module>
    import prob2020.python.mymath as mymath
  File "/anaconda3/envs/2020plus/lib/python3.6/site-packages/prob2020/python/mymath.py", line 2, in <module>
    from scipy.misc import logsumexp
ImportError: cannot import name 'logsumexp'
Error in job simMaf while creating output file output_test/simulated_summary/chasm_sim_maf7.txt.
RuleException:
CalledProcessError in line 135 of /Users/josephnovak/Desktop/2020plus-master/Snakefile:
Command 'mut_annotate --log-level=INFO -b data//snvboxGenes.bed -i data//snvboxGenes.fa -c 1.5 -m data/test.hg19_2.txt -p 0 -n 1 --maf --seed=$((742)) -r 3 --unique -o output_test/simulated_summary/chasm_sim_maf7.txt' returned non-zero exit status 1.
  File "/Users/josephnovak/Desktop/2020plus-master/Snakefile", line 135, in __rule_simMaf
  File "/anaconda3/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Will exit after finishing currently running jobs.
Exiting because a job execution failed. Look above for error message

ctokheim commented 4 years ago

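This is an issue with SciPy, which changed its API: the import `from scipy.misc import logsumexp` in the traceback no longer works with the installed SciPy. Try a SciPy version below 1.0.0.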

orochimarupap commented 4 years ago

Shouldn't the correct modules and versions be installed into the environment by using environment_python.yml? I find that the version numbers of the required modules are very different from those listed in the requirements_dev file. Downgrading to the versions listed in the requirements file is proving difficult, as there are many dependencies.

ctokheim commented 4 years ago

Do not use requirements_dev. That is an old file from the original development.
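
To see which versions actually ended up in the active environment, and compare them by hand against environment_python.yml, a short check along these lines may help. It is a sketch under assumptions: the package list is illustrative rather than taken from the project's files, and `installed_version` is a hypothetical helper. It uses `importlib.metadata` where available and falls back to `pkg_resources` on older interpreters such as Python 3.6.

```python
def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Older interpreters (for example Python 3.6) lack importlib.metadata.
        import pkg_resources

        try:
            return pkg_resources.get_distribution(package_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Illustrative package names; adjust to whatever the environment file lists.
for name in ("scipy", "numpy", "pandas", "scikit-learn", "probabilistic2020"):
    print(name, installed_version(name))
```

Run it with the Python interpreter of the 2020plus environment so that it reports that environment's packages rather than the system ones.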