hillerlab / TOGA

TOGA (Tool to infer Orthologs from Genome Alignments): implements a novel paradigm to infer orthologous genes. TOGA integrates gene annotation, inferring orthologs and classifying genes as intact or lost.
MIT License

Error in Final Test #115

Closed simone-says closed 9 months ago

simone-says commented 12 months ago

I've been trying to run the "final test" with the human and mouse assemblies to check my installation, and it seems to fail somewhere in step 3. Everything looks normal until that point:

#### STEP 3: Merge step 2 output

Reading /scratch/smg655/TOGA/test/temp/toga_filt_ref_annot.bed
merge_chains_output: got data for 3674 transcripts
merge_chains_output: Loading the results...
merge_chains_output: There are 0 result files to combine
merge_chains_output: got 0 keys in chain_genes_data
merge_chains_output: got 0 keys in chain_raw_data
merge_chains_output: There were 0 transcript lines and 0 chain lines
merge_chains_output: chain_genes_data dict reverted, there are 0 keys now
merge_chains_output: Combining the data...
merge_chains_output: got combined dict with 0 keys
merge_chains_output: Writing output to /scratch/smg655/TOGA/test/temp/chain_results_df.tsv
merge_chains_output: total runtime: 0:00:53.312700

#### STEP 4: Classify chains using gradient boosting model

Classifying chains
classify_chains: loaded dataframe of size 0
classify_chains: total number of transcripts: 0
classify_chains: 0 rows with spanning chains
classify_chains: filtered dataset contains 0 records
classify_chains: omputing additional features...
classify_chains: WARNING! The final df for classification is empty
classify_chains: df for single-exon model contains 0 records
classify_chains: df for multi-exon model contains 0 records
classify_chains: loading models at /scratch/smg655/TOGA/./models/se_model.dat (SE) and /scratch/smg655/TOGA/./models/me_model.dat (ME)
classify_chains: applying models to SE and ME datasets...
classify_chains: applying -1.0 score to the spanning chains
classify_chains: applying -2.0 score to the processed pseudogene alignments
classify_chains: number of processed pseudogene alignments: 0
classify_chains: arranging the final output
classify_chains: classification result stats:
* orthologs: 0
* paralogs: 0
* spanning chains: 0
* processed pseudogenes: 0
classify_chains: using 0.5 as a threshold to separate orthologs from paralogs
classify_chains: combining results for 0 individual transcripts
classify_chains: saving the classification to /scratch/smg655/TOGA/test/temp/trans_to_chain_classes.tsv
classify_chains: found no classifiable chains for 0 transcripts
classify_chains: saving these transcripts to: /scratch/smg655/TOGA/test/temp/rejected/classify_chains_rejected.txt
Chain results file /scratch/smg655/TOGA/test/temp/chain_results_df.tsv is empty! Abort.
Traceback (most recent call last):
  File "/scratch/smg655/TOGA/./toga.py", line 1683, in <module>
    main()
  File "/scratch/smg655/TOGA/./toga.py", line 1679, in main
    toga_manager.run()
  File "/scratch/smg655/TOGA/./toga.py", line 592, in run
    self.__classify_chains()
  File "/scratch/smg655/TOGA/./toga.py", line 810, in __classify_chains
    check_chains_classified(self.chain_results_df)
  File "/scratch/smg655/TOGA/modules/sanity_check_functions.py", line 169, in check_chains_classified
    raise ValueError(msg)
ValueError: Chain results file /scratch/smg655/TOGA/test/temp/chain_results_df.tsv is empty! Abort.

The "merge chain" step does not output anything. Am I missing something here? I'm running this command on a Slurm HPC:

./toga.py test_input/hg38.mm10.chr11.chain test_input/hg38.genCode27.chr11.bed hg38.2bit mm10.2bit --kt --pn test -i supply/hg38.wgEncodeGencodeCompV34.isoforms.txt --nc nextflow_config_files --cb 3,5 --cjn 500 --u12 supply/hg38.U12sites.tsv --ms

kirilenkobm commented 12 months ago

Hi @simone-says

Thank you for reporting this. Could you please attach the full .log file? It looks like there is an issue with step 2 (or even step 1?).

simone-says commented 12 months ago

#### Initiating TOGA class ####
# python interpreter path: /packages/python/3.10.8-jvohbnx/bin/python3
# python interpreter version: 3.10.8 (main, Apr  5 2023, 11:17:20) [GCC 8.5.0 20210514 (Red Hat 8.5.0-15)]
Version 1.1.7.dev
Commit: 399a36bc408fb36462eabdc5978f70a114370253
Branch: master

# Python package versions
* twobitreader: 3.1.7
* networkx: 3.2.1
* pandas: 2.1.2
* numpy: 1.26.1
* xgboost: 2.0.1
* scikit-learn: 1.3.2
* joblib: 1.3.2
* h5py: 3.10.0
Calling cmd:
/scratch/smg655/TOGA/./modules/chain_score_filter test_input/hg38.mm10.chr11.chain 15000 > /scratch/smg655/TOGA/test/temp/genome_alignment.chain

Command finished with exit code 0.
Writing isoforms data for 3674 transcripts.
Found 455 sequences in /scratch/smg655/TOGA/hg38.2bit
Found 455 sequences in /scratch/smg655/TOGA/hg38.2bit
Found 66 sequences in /scratch/smg655/TOGA/mm10.2bit
Saving output to /scratch/smg655/TOGA/test
Arguments stored in /scratch/smg655/TOGA/test/project_args.json

#### STEP 0: making chain and bed file indexes

Started chain indexing...
chain_bst_index: indexing 79183 chains
chain_bst_index: Saved chain /scratch/smg655/TOGA/test/temp/genome_alignment.chain index to /scratch/smg655/TOGA/test/temp/genome_alignment.bst
Started bed file indexing...
bed_hdf5_index: indexed 3674 transcripts

#### STEP 1: Generate extract chain features jobs

Calling cmd:
/scratch/smg655/TOGA/./split_chain_jobs.py /scratch/smg655/TOGA/test/temp/genome_alignment.chain /scratch/smg655/TOGA/test/temp/toga_filt_ref_annot.bed /scratch/smg655/TOGA/test/temp/toga_filt_ref_annot.hdf5 --log_file /scratch/smg655/TOGA/test/toga_2023_11_08_at_10_39.log --parallel_logs_dir /scratch/smg655/TOGA/test/temp_logs --jobs_num 100 --jobs /scratch/smg655/TOGA/test/temp/chain_classification_jobs --jobs_file /scratch/smg655/TOGA/test/temp/chain_class_jobs_combined --results_dir /scratch/smg655/TOGA/test/temp/chain_classification_results --rejected /scratch/smg655/TOGA/test/temp/rejected/SPLIT_CHAIN_REJ.txt

split_chain_jobs: Use bed file /scratch/smg655/TOGA/test/temp/toga_filt_ref_annot.bed and chain file /scratch/smg655/TOGA/test/temp/genome_alignment.chain
split_chain jobs: the run data overview is:

* vv: False
* jobs: /scratch/smg655/TOGA/test/temp/chain_classification_jobs
* results_dir: /scratch/smg655/TOGA/test/temp/chain_classification_results
* errors_dir: None
* chain_file: /scratch/smg655/TOGA/test/temp/genome_alignment.chain
* bed_file: /scratch/smg655/TOGA/test/temp/toga_filt_ref_annot.bed
* index_file: /scratch/smg655/TOGA/test/temp/genome_alignment.chain_ID_position
* job_size: None
* jobs_num: 100
* bed_index: /scratch/smg655/TOGA/test/temp/toga_filt_ref_annot.hdf5
* jobs_file: /scratch/smg655/TOGA/test/temp/chain_class_jobs_combined
* ref: hg38
* on_cluster: True
split_chain_jobs: searching for intersections between reference transcripts and chains
split_chain_jobs: chains-to-transcripts dict contains 50186 records
split_chain_jobs: skipped 0 transcripts that do not intersect any chain
split_chain_jobs: preparing 50186 commands
split_chain_jobs: command size of 502 for each cluster job
split_chain_jobs: results in 100 cluster jobs
split_chain_jobs: estimated time: 0:00:02.596004
Command finished with exit code 0.

#### STEP 2: Extract chain features: parallel step

Extracting chain features, project name: chain_feats__test_at_1699465214
Project path: /scratch/smg655/TOGA/./nextflow_logs/chain_feats__test_at_1699465214
Selected parallelization strategy: nextflow
Parallel manager: pushing job nextflow /scratch/smg655/TOGA/execute_joblist.nf --joblist /scratch/smg655/TOGA/test/temp/chain_class_jobs_combined -c /scratch/smg655/TOGA/nextflow_config_files/extract_chain_features_config.nf
Logs from individual chain runner jobs are show below

#### STEP 3: Merge step 2 output

Reading /scratch/smg655/TOGA/test/temp/toga_filt_ref_annot.bed
merge_chains_output: got data for 3674 transcripts
merge_chains_output: Loading the results...
merge_chains_output: There are 0 result files to combine
merge_chains_output: got 0 keys in chain_genes_data
merge_chains_output: got 0 keys in chain_raw_data
merge_chains_output: There were 0 transcript lines and 0 chain lines
merge_chains_output: chain_genes_data dict reverted, there are 0 keys now
merge_chains_output: Combining the data...
merge_chains_output: got combined dict with 0 keys
merge_chains_output: Writing output to /scratch/smg655/TOGA/test/temp/chain_results_df.tsv
merge_chains_output: total runtime: 0:00:53.312700

#### STEP 4: Classify chains using gradient boosting model

Classifying chains
classify_chains: loaded dataframe of size 0
classify_chains: total number of transcripts: 0
classify_chains: 0 rows with spanning chains
classify_chains: filtered dataset contains 0 records
classify_chains: omputing additional features...
classify_chains: WARNING! The final df for classification is empty
classify_chains: df for single-exon model contains 0 records
classify_chains: df for multi-exon model contains 0 records
classify_chains: loading models at /scratch/smg655/TOGA/./models/se_model.dat (SE) and /scratch/smg655/TOGA/./models/me_model.dat (ME)
classify_chains: applying models to SE and ME datasets...
classify_chains: applying -1.0 score to the spanning chains
classify_chains: applying -2.0 score to the processed pseudogene alignments
classify_chains: number of processed pseudogene alignments: 0
classify_chains: arranging the final output
classify_chains: classification result stats:
* orthologs: 0
* paralogs: 0
* spanning chains: 0
* processed pseudogenes: 0
classify_chains: using 0.5 as a threshold to separate orthologs from paralogs
classify_chains: combining results for 0 individual transcripts
classify_chains: saving the classification to /scratch/smg655/TOGA/test/temp/trans_to_chain_classes.tsv
classify_chains: found no classifiable chains for 0 transcripts
classify_chains: saving these transcripts to: /scratch/smg655/TOGA/test/temp/rejected/classify_chains_rejected.txt
Chain results file /scratch/smg655/TOGA/test/temp/chain_results_df.tsv is empty! Abort.
Traceback (most recent call last):
  File "/scratch/smg655/TOGA/./toga.py", line 1683, in <module>
    main()
  File "/scratch/smg655/TOGA/./toga.py", line 1679, in main
    toga_manager.run()
  File "/scratch/smg655/TOGA/./toga.py", line 592, in run
    self.__classify_chains()
  File "/scratch/smg655/TOGA/./toga.py", line 810, in __classify_chains
    check_chains_classified(self.chain_results_df)
  File "/scratch/smg655/TOGA/modules/sanity_check_functions.py", line 169, in check_chains_classified
    raise ValueError(msg)
ValueError: Chain results file /scratch/smg655/TOGA/test/temp/chain_results_df.tsv is empty! Abort.

ning-y commented 11 months ago

I encountered a similar error due to my own malformed Nextflow config files. This caused Nextflow to fail in "STEP 2". The failure of Nextflow was not propagated to the TOGA pipeline, so TOGA continued with the failed output, which is indistinguishable from a directory with empty results.

I see that you've used the repository's provided Nextflow config files, but maybe there is some other reason causing Nextflow to fail silently and generate an empty result directory. Could you check the Nextflow logs for "STEP 2"?
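
For anyone hitting the same symptom, here is a minimal sketch of the kind of guard that would surface such a failure at "STEP 2" rather than later. It is not TOGA's code: `run_parallel_step` and `count_result_files` are hypothetical names, and the command and results directory are whatever the caller passes in.

```python
import subprocess
from pathlib import Path


def count_result_files(results_dir):
    """Count non-empty regular files in results_dir (0 if it does not exist)."""
    path = Path(results_dir)
    if not path.is_dir():
        return 0
    return sum(1 for f in path.iterdir() if f.is_file() and f.stat().st_size > 0)


def run_parallel_step(cmd, results_dir):
    """Run cmd (a list of arguments) and fail loudly instead of continuing.

    Hypothetical helper: raises if the command exits with a non-zero status
    or if it leaves no non-empty result files behind.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError(
            f"Parallel step failed with exit code {proc.returncode}:\n{proc.stderr}"
        )
    n_files = count_result_files(results_dir)
    if n_files == 0:
        raise RuntimeError(f"Parallel step produced no result files in {results_dir}")
    return n_files
```

Checking both the exit status and the result directory matters here, because a scheduler-level failure can leave an empty directory that looks the same as a run with no results.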

chenyuming123456 commented 11 months ago

Hello, I've hit the same problem in the final test when running this command on a Slurm HPC:

./toga.py test_input/hg38.mm10.chr11.chain test_input/hg38.genCode27.chr11.bed ~/hg38.2bit ~/mm10.2bit --kt --pn ~/test -i supply/hg38.wgEncodeGencodeCompV34.isoforms.txt --nc ~/nextflow_config_files/ --cb 10,100 --cjn 500 --u12 supply/hg38.U12sites.tsv --ms

Then I got the same error message as simone-says. There is a chain_feats_test1test_at_1700549452.log file at ~/software/TOGA/nextflow_logs/chain_feats_test1test_at_1700549452/; this is the tail of that log:

ERROR ~ Error executing process > 'execute_jobs (12)'

Caused by:
  Failed to submit process to grid scheduler for execution

Command executed:

  sbatch .command.run

Command exit status:
  1

Command output:
  sbatch: error: invalid partition specified: batch
  sbatch: error: Batch job submission failed: Invalid partition name specified

Work dir:
~/software/TOGA/nextflow_logs/work/96/e68fd940b8f7d127448745cb00d0ba

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details

This is the full .log file: chain_feats_test1test_at_1700549452.txt. Could you please give me some advice to solve this problem? Thanks a lot.
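
In a long Nextflow log, the relevant lines can be pulled out with a small filter. The following is a sketch only, assuming the log is plain text and that failures carry the `ERROR` and `sbatch: error` markers seen in the excerpt above; `find_error_lines` is a hypothetical helper name.

```python
def find_error_lines(log_text, markers=("ERROR", "sbatch: error")):
    """Return (line_number, line) pairs for lines containing any marker.

    Line numbers are 1-based; matching is a plain, case-sensitive substring test.
    """
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in markers):
            hits.append((number, line.strip()))
    return hits
```

The log text can be read with `pathlib.Path(log_path).read_text()` and passed to the function; for the log above, the `sbatch: error: invalid partition specified: batch` line would be among the hits.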

fuesseler commented 11 months ago

I ran into the same problem as @chenyuming123456 and was able to resolve the issue. By default, the Nextflow config files submit to a partition named "batch" (process.queue = 'batch').

On my HPC, there is no partition named "batch", so I edited the line above in all three config files in the nextflow_config_files directory to specify a partition that actually exists on my cluster.
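
The same edit can be scripted. This is a rough sketch, assuming each config file contains a line of the form `process.queue = '<name>'` as quoted above and that the config files end in `.nf` (as `extract_chain_features_config.nf` in the log does). `set_queue` is a hypothetical helper, not part of TOGA.

```python
import re
from pathlib import Path

# Matches a `process.queue = '<name>'` assignment (single or double quotes).
QUEUE_LINE = re.compile(
    r"""^([ \t]*process\.queue[ \t]*=[ \t]*)(['"]).*?\2""", re.MULTILINE
)


def set_queue(config_dir, partition):
    """Point every process.queue line in config_dir/*.nf at `partition`.

    Returns the names of the files that were actually modified.
    """
    changed = []
    for config in sorted(Path(config_dir).glob("*.nf")):
        text = config.read_text()
        new_text = QUEUE_LINE.sub(lambda m: f"{m.group(1)}'{partition}'", text)
        if new_text != text:
            config.write_text(new_text)
            changed.append(config.name)
    return changed
```

For example, `set_queue("nextflow_config_files", "core")` would rewrite the queue lines to use a partition called `core`, where `core` stands in for whatever partition exists on your cluster. Back up the directory first, since the files are edited in place.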

simone-says commented 11 months ago

I fixed my issue the same way! You might have to ask your HPC admin for the partition name on your cluster.

MichaelHiller commented 11 months ago

Thx a bunch for reporting this. We assumed a 'batch' queue would be the default on all systems; apparently not.

Is there a generic fix for this? E.g. reading out what the default queue is? @kirilenkobm Otherwise, we may simply ask users to specify the queue as a mandatory parameter.
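
On Slurm specifically, one way to read out the default queue is to parse `sinfo`, which marks the default partition with a trailing asterisk in its partition column. The following is a sketch under that assumption; it requires `sinfo` on the PATH, does not cover other schedulers, and both function names are hypothetical.

```python
import subprocess


def parse_default_partition(sinfo_output):
    """Return the partition name marked with '*' in `sinfo -h -o %P` output.

    Returns None when no partition is flagged as the default.
    """
    for line in sinfo_output.splitlines():
        name = line.strip()
        if name.endswith("*"):
            return name.rstrip("*")
    return None


def default_slurm_partition():
    """Ask Slurm for its partitions and return the default one, if any."""
    result = subprocess.run(
        ["sinfo", "-h", "-o", "%P"],  # -h: no header, %P: partition name
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_default_partition(result.stdout)
```

Keeping the parsing separate from the `sinfo` call makes the logic testable on machines without Slurm. A cluster may have no default partition at all, in which case an explicit queue argument would still be needed.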

kirilenkobm commented 10 months ago

Adding such an argument to control the queue name would indeed be beneficial. I can do it.

kirilenkobm commented 9 months ago

Converted into todo issue https://github.com/hillerlab/TOGA/issues/138