KarchinLab / 2020plus

Classifies genes as an oncogene, tumor suppressor gene, or as a non-driver gene by using Random Forests
http://2020plus.readthedocs.org
Apache License 2.0

python: can't open file 'features': #31

Open complexgenome opened 3 months ago

complexgenome commented 3 months ago

Hi there,

I am using the example bladder data to test and get started with 20/20+. I have a data folder that contains:

$ ls data/
snvboxGenes.bed  snvboxGenes.fa  snvboxGenes.fa.fai

I run it as:

snakemake -s 2020plus-1.2.3/Snakefile pretrained_predict -p --cores 1 --config mutations="bladder.txt" output_dir="output_bladder" trained_classifier="2020plus_10k.Rdata"

Error I get:

python `which 2020plus.py` features -s output_bladder/summary.txt --tsg-test output_bladder/tsg.txt -og-test output_bladder/oncogene.txt -o output_bladder/features.txt
python: can't open file 'features': [Errno 2] No such file or directory
Error in rule features:
    jobid: 2
    output: output_bladder/features.txt

RuleException:
CalledProcessError in line 282 of /mnt/data1/users/sanjeev/drivers/2020/2020plus-1.2.3/Snakefile:
Command ' set -euo pipefail; python `which 2020plus.py` features -s output_bladder/summary.txt --tsg-test output_bladder/tsg.txt -og-test output_bladder/oncogene.txt -o output_bladder/features.txt ' returned non-zero exit status 2.
  File "/mnt/data1/users/sanjeev/drivers/2020/2020plus-1.2.3/Snakefile", line 282, in __rule_features
  File "/data1/software/miniconda/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /mnt/data1/users/sanjeev/drivers/2020/.snakemake/log/2024-08-06T171147.906096.snakemake.log

Which features file is it looking for, and where do I put it?

These files are generated in the output folder:

$ ls output_bladder/
oncogene.txt  simulated_summary  summary.txt  tsg.txt

Please let me know if any other information is required.

complexgenome commented 3 months ago

@ctokheim Any inputs?

ctokheim commented 3 months ago

It looks like 2020plus.py is not on your PATH. If you type `which 2020plus.py` in the terminal, does it print the full file path to the 2020plus.py script? I'm guessing the error occurs because `which 2020plus.py` returns an empty string, so python treats the word right after it as the script it should run. "features" is actually a subcommand of the command line interface for the 2020plus.py script.
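To illustrate the guess above, here is a minimal Python sketch of what happens to the argument list when the command substitution comes back empty. The helper name `build_argv` and the shortened command are made up for illustration; `shutil.which` stands in for the shell's `which`, and `shlex.split` approximates shell word splitting.

```python
import shlex
import shutil


def build_argv(script_name):
    """Mimic the shell line: python `which SCRIPT` features -o features.txt"""
    # shutil.which returns None when the script is not on PATH;
    # the shell's `which` prints nothing in that case.
    script_path = shutil.which(script_name) or ""
    command = f"python {script_path} features -o features.txt"
    # shlex.split tokenizes like a POSIX shell, so an empty
    # substitution simply vanishes from the argument list.
    return shlex.split(command)


if __name__ == "__main__":
    # With the script missing from PATH, "features" lands right after
    # "python", which is why python tries to open it as a file.
    print(build_argv("hypothetical_missing_script.py"))
```

If the script is found, its full path takes the second position and "features" is passed to it as the subcommand, as intended.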

complexgenome commented 3 months ago

@ctokheim Thanks for your response. I have 2020plus.py in my PATH:


$ which 2020plus.py
/data1/users/sanjeev/drivers/2020/2020plus-1.2.3//2020plus.py

I still run into the error; I re-ran it and the error remains the same. Please see the attached error log from snakemake.

2024-08-07T162610.850260.snakemake.log

It was run at 16:26 (4:26 PM).

ctokheim commented 3 months ago

Could you copy one of the commands that failed regarding the 2020plus.py features command and run that directly in your terminal? It looks like the error message that caused the failure is being suppressed, and running it outside of snakemake should give you a full python traceback of what happened.
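As a rough sketch of the same idea in Python, the snippet below runs a command on its own and returns its exit status together with everything it wrote to stderr, so the traceback is not lost. The helper name `run_and_report` is hypothetical, and the child process that raises a `TypeError` is only a stand-in for the real failing command.

```python
import subprocess
import sys


def run_and_report(argv):
    """Run a command and return (exit_code, stderr_text)."""
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output to str (also works on Python 3.6)
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    # Stand-in for the failing job: a child Python process that raises.
    code, err = run_and_report([sys.executable, "-c", "raise TypeError('demo')"])
    print("exit status:", code)
    print(err)
```

To inspect the real failure, the argument list would be replaced with the command copied from the snakemake log.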

complexgenome commented 3 months ago

@ctokheim I'm not sure if I understood you.

python `which 2020plus.py` features   -s output_bladder/simulated_summary/chasm_sim_summary8.txt --tsg-test output_bladder/simulated_summary/tsg_sim8.txt -og-test output_bladder/simulated_summary/oncogene_sim8.txt   -o output_bladder/simulated_summary/simulated_features8.txt 
Version: 1.2.3
Command: /data1/users/sanjeev/drivers/2020/2020plus-1.2.3//2020plus.py features -s output_bladder/simulated_summary/chasm_sim_summary8.txt --tsg-test output_bladder/simulated_summary/tsg_sim8.txt -og-test output_bladder/simulated_summary/oncogene_sim8.txt -o output_bladder/simulated_summary/simulated_features8.txt
****************************************
AN ERROR HAS OCCURRED: check the log file
****************************************
Type: <class 'TypeError'>
Exception: use() got an unexpected keyword argument 'warn'
Traceback:
   File "/data1/users/sanjeev/drivers/2020/2020plus-1.2.3//2020plus.py", line 263, in <module>
    import src.classify.python.classifier
  File "/mnt/data1/users/sanjeev/drivers/2020/2020plus-1.2.3/src/classify/python/classifier.py", line 12, in <module>
    import src.classify.python.plot_data as plot_data
  File "/mnt/data1/users/sanjeev/drivers/2020/2020plus-1.2.3/src/classify/python/plot_data.py", line 1, in <module>
    import src.utils.python.plot as myplt
  File "/mnt/data1/users/sanjeev/drivers/2020/2020plus-1.2.3/src/utils/python/plot.py", line 8, in <module>
    matplotlib.use('agg', warn=False)

I found that this could be due to a newer matplotlib version: https://stackoverflow.com/a/63065060/2740831
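For context, the `warn` keyword of `matplotlib.use` was deprecated and then removed in matplotlib 3.3, which matches the `TypeError` in the traceback. One possible workaround, sketched here as an assumption rather than the project's official fix, is to pass `warn=False` only when the installed version still accepts it. The helper `select_backend` is hypothetical and takes the `use` function as a parameter so that it can be tried without matplotlib installed.

```python
def select_backend(use, backend="agg"):
    """Call use(backend), passing warn=False only if it is still accepted."""
    try:
        # Older matplotlib releases accept the warn keyword.
        use(backend, warn=False)
        return "with warn"
    except TypeError:
        # Newer releases reject the keyword, so retry without it.
        use(backend)
        return "without warn"
```

With matplotlib installed, the call would be `import matplotlib; select_backend(matplotlib.use)`. The simpler alternative is to edit line 8 of src/utils/python/plot.py to `matplotlib.use('agg')`. Whether the rest of 20/20+ works with a newer matplotlib is not confirmed in this thread.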

I downgraded it and ran into another set of errors:

pip install matplotlib==3.2

 python `which 2020plus.py` features   -s output_bladder/simulated_summary/chasm_sim_summary9.txt --tsg-test output_bladder/simulated_summary/tsg_sim9.txt -og-test output_bladder/simulated_summary/oncogene_sim9.txt   -o output_bladder/simulated_summary/simulated_features9.txt ' died with <Signals.SIGFPE: 8>.
  File "/mnt/data1/users/sanjeev/drivers/2020/2020plus-1.2.3/Snakefile", line 202, in __rule_simFeatures
  File "/data1/software/miniconda/envs/2020plus/lib/python3.6/concurrent/futures/thread.py", line 56, in run

I then moved back to the newer matplotlib version:

pip install matplotlib==3.3.2

ctokheim commented 3 months ago

Yes, I would avoid matplotlib 3.3.2. Can you try creating a new conda environment with the following yaml file? 2020plus_environment_python.yml.zip

complexgenome commented 3 months ago

Hi @ctokheim, I created a new env. When I run 2020plus.py with python, I now run into a different error:

$ python `which 2020plus.py`

Version: 1.2.3
Command: /data1/users/sanjeev/drivers/2020/2020plus-1.2.3/2020plus.py
****************************************
AN ERROR HAS OCCURRED: check the log file
****************************************
Type: <class 'ImportError'>
Exception: libicuuc.so.54: cannot open shared object file: No such file or directory
Traceback:
  File "/data1/users/sanjeev/drivers/2020/2020plus-1.2.3/2020plus.py", line 263, in <module>
    import src.classify.python.classifier
  File "/mnt/data1/users/sanjeev/drivers/2020/2020plus-1.2.3/src/classify/python/classifier.py", line 3, in <module>
    from src.classify.python.r_random_forest_clf import RRandomForest
  File "/mnt/data1/users/sanjeev/drivers/2020/2020plus-1.2.3/src/classify/python/r_random_forest_clf.py", line 4, in <module>
    import rpy2.robjects as ro
  File "/data1/software/miniconda/envs/2020plus/lib/python3.6/site-packages/rpy2/robjects/__init__.py", line 16, in <module>
    import rpy2.rinterface as rinterface
  File "/data1/software/miniconda/envs/2020plus/lib/python3.6/site-packages/rpy2/rinterface/__init__.py", line 92, in <module>
    from rpy2.rinterface._rinterface import (baseenv,

ctokheim commented 3 months ago

This is likely an issue with installing conflicting versions of R and rpy2, please see this previous issue: https://github.com/KarchinLab/2020plus/issues/24
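Before trying the fixes from that issue, it may help to confirm whether the library named in the traceback can be loaded at all from the active environment. The sketch below only asks the dynamic loader; it does not install anything or say which R/rpy2 combination is correct. The helper name `can_load` is hypothetical.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Raised when the file is missing or cannot be loaded.
        return False


if __name__ == "__main__":
    # libicuuc.so.54 is the library named in the ImportError above.
    name = "libicuuc.so.54"
    print(name, "loadable" if can_load(name) else "NOT loadable")
```

If this reports the library as not loadable inside the conda environment, the rpy2 build there expects an ICU version that the environment does not provide, which is consistent with mismatched R and rpy2 packages.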

complexgenome commented 3 months ago

@ctokheim

This is likely an issue with installing conflicting versions of R and rpy2, please see this previous issue: #24

That issue links to another GitHub issue with many suggested solutions, and I don't know which one will work. They worked before you provided the new .yaml config file.