hillerlab / make_lastz_chains

Portable solution to generate genome alignment chains using lastz
MIT License

chain_run error #55

Open LliliansCalvo opened 5 months ago

LliliansCalvo commented 5 months ago

Hi,

I am trying to run it, and it fails at what looks like the last step after running for a long time. I don't understand what this error means or how to fix it.

My command:
./make_chains.py dm6 CGA /Gilly_TOGA/dm6.fa /Gilly_TOGA/mod_GCA_030586385.1_ASM3058638v1_genomic.fna --pd make_chains_Cfel_2_dme -f --cluster_executor slurm --cluster_queue cpu --nextflow_executable ~/.conda/envs/llilians_env/bin/nextflow

# Make Lastz Chains #
Version 2.0.8
Commit: 187e313afc10382fe44c96e47f27c4466d63e114
Branch: main

* found run_lastz.py at /Gilly_TOGA/make_lastz_chains/standalone_scripts/run_lastz.py
* found run_lastz_intermediate_layer.py at /Gilly_TOGA/make_lastz_chains/standalone_scripts/run_lastz_intermediate_layer.py
* found chain_gap_filler.py at /Gilly_TOGA/make_lastz_chains/standalone_scripts/chain_gap_filler.py
* found faToTwoBit at /Gilly_TOGA/make_lastz_chains/HL_kent_binaries/faToTwoBit
* found twoBitToFa at /Gilly_TOGA/make_lastz_chains/HL_kent_binaries/twoBitToFa
* found pslSortAcc at /Gilly_TOGA/make_lastz_chains/HL_kent_binaries/pslSortAcc
* found axtChain at /users/lcalvogo/.conda/envs/llilians_env/bin/axtChain
* found axtToPsl at /Gilly_TOGA/make_lastz_chains/HL_kent_binaries/axtToPsl
* found chainAntiRepeat at /Gilly_TOGA/make_lastz_chains/HL_kent_binaries/chainAntiRepeat
* found chainMergeSort at /Gilly_TOGA/make_lastz_chains/HL_kent_binaries/chainMergeSort
* found chainCleaner at /Gilly_TOGA/make_lastz_chains/HL_kent_binaries/chainCleaner
* found chainSort at /Gilly_TOGA/make_lastz_chains/HL_kent_binaries/chainSort
* found chainScore at /Gilly_TOGA/make_lastz_chains/HL_kent_binaries/chainScore
* found chainNet at /Gilly_TOGA/make_lastz_chains/HL_kent_binaries/chainNet
* found chainFilter at /Gilly_TOGA/make_lastz_chains/HL_kent_binaries/chainFilter
* found lastz at /users/lcalvogo/.conda/envs/llilians_env/bin/lastz
* using nextflow manually located at nextflow
All necessary executables found.
Making chains for /Gilly_TOGA/dm6.fa and /Gilly/genome/mod_GCA_030586385.1_ASM3058638v1_genomic.fna files, saving results to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme
Pipeline started at 2024-03-15 17:31:49.461257
* Setting up genome sequences for target
genomeID: dm6
input sequence file: /Gilly_TOGA/dm6.fa
is 2bit: False
planned genome dir location: /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/target.2bit
Initial fasta file /Gilly_TOGA/dm6.fa saved to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/target.2bit
For dm6 (target) sequence file: /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/target.2bit; chrom sizes saved to: /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/target.chrom.sizes
* Setting up genome sequences for query
genomeID: CGA
input sequence file: /Gilly/genome/mod_GCA_030586385.1_ASM3058638v1_genomic.fna
is 2bit: False
planned genome dir location: /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/query.2bit
Initial fasta file /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/CGA_renamed_chrom.fa saved to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/query.2bit
For CGA (query) sequence file: /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/query.2bit; chrom sizes saved to: /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/query.chrom.sizes
Warning! Genome sequence file /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/CGA_renamed_chrom.fa
1197 chromosome names cannot be processed via pipeline
were renamed in the intermediate files according to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/CGA_chrom_rename_table.tsv

### Partition Step ###

# Partitioning for target
Saving partitions and creating 19 buckets for lastz output
In particular, 0 partitions for bigger chromosomes
And 19 buckets for smaller scaffolds
Saving target partitions to: /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/target_partitions.txt
# Partitioning for query
Saving partitions and creating 13 buckets for lastz output
In particular, 0 partitions for bigger chromosomes
And 13 buckets for smaller scaffolds
Saving query partitions to: /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/query_partitions.txt
Num. target partitions: 0
Num. query partitions: 0
Num. lastz jobs: 0

### Lastz Alignment Step ###

LASTZ: making jobs
LASTZ: saved 247 jobs to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_lastz_run/lastz_joblist.txt
Parallel manager: pushing job /users/lcalvogo/.conda/envs/llilians_env/bin/nextflow /Gilly_TOGA/make_lastz_chains/parallelization/execute_joblist.nf --joblist /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_lastz_run/lastz_joblist.txt -c /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_lastz_run/lastz_config.nf

### Nextflow process lastz finished successfully
Found 19 output files from the LASTZ step
Please note that lastz_step.py does not produce output in case LASTZ could not find any alignment

### Concatenating Lastz Results (Cat) Step ###

Concatenating LASTZ output from 19 buckets
* concatenated bucket bucket_ref_bulk_17 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_0.psl.gz
* skip bucket bucket_ref_bulk_3: nothing to concat
* concatenated bucket bucket_ref_bulk_12 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_2.psl.gz
* concatenated bucket bucket_ref_bulk_6 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_3.psl.gz
* concatenated bucket bucket_ref_bulk_8 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_4.psl.gz
* skip bucket bucket_ref_bulk_16: nothing to concat
* concatenated bucket bucket_ref_bulk_4 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_6.psl.gz
* concatenated bucket bucket_ref_bulk_9 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_7.psl.gz
* concatenated bucket bucket_ref_bulk_19 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_8.psl.gz
* concatenated bucket bucket_ref_bulk_7 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_9.psl.gz
* concatenated bucket bucket_ref_bulk_18 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_10.psl.gz
* concatenated bucket bucket_ref_bulk_5 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_11.psl.gz
* concatenated bucket bucket_ref_bulk_13 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_12.psl.gz
* concatenated bucket bucket_ref_bulk_10 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_13.psl.gz
* concatenated bucket bucket_ref_bulk_15 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_14.psl.gz
* concatenated bucket bucket_ref_bulk_2 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_15.psl.gz
* concatenated bucket bucket_ref_bulk_14 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_16.psl.gz
* concatenated bucket bucket_ref_bulk_11 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_17.psl.gz
* concatenated bucket bucket_ref_bulk_1 to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_18.psl.gz
Concatenated 155 files in total into 17 files

### Build Chains Step ###

Sorting PSL files, saving the results to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/sorted_psl
/Gilly_TOGA/make_lastz_chains/HL_kent_binaries/pslSortAcc nohead /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/sorted_psl /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_kent /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_3.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_6.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_10.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_11.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_17.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_2.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_4.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_13.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_16.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_14.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_12.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_18.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_9.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_0.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_15.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_7.psl.gz /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_concat_lastz_output/concat_8.psl.gz
Bundling psl files with the following arguments:
* input_dir: /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/sorted_psl
* chrom_sizes: /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/target.chrom.sizes
* output_dir: /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/split_psl
* max_bases: 1000000
* warning_only: False
* verbose: False
Saving results to: /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/split_psl
Bundling 85 psl files in total
Written to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/split_psl/bundle.0.psl
Written to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/split_psl/bundle.1.psl
Written to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/split_psl/bundle.2.psl
Written to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/split_psl/bundle.3.psl
Written to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/split_psl/bundle.4.psl
Written to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/split_psl/bundle.5.psl
Written to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/split_psl/bundle.6.psl
    --> file chrUn_CP007074v1.psl does not exist. Next
    --> file chrY_CP007108v1_random.psl does not exist. Next
    --> file chrUn_CP007090v1.psl does not exist. Next
    --> file chrUn_DS483562v1.psl does not exist. Next
    --> file chrUn_CP007105v1.psl does not exist. Next
    --> file chrUn_CP007087v1.psl does not exist. Next
    --> file chrUn_CP007085v1.psl does not exist. Next
    --> file chrUn_CP007072v1.psl does not exist. Next
    --> file chrY_CP007118v1_random.psl does not exist. Next
    --> file chrUn_CP007098v1.psl does not exist. Next
    --> file chrY_CP007112v1_random.psl does not exist. Next
    --> file chrUn_CP007088v1.psl does not exist. Next
    --> file chrUn_CP007077v1.psl does not exist. Next
    --> file chrUn_CP007082v1.psl does not exist. Next
    --> file chrY_CP007111v1_random.psl does not exist. Next
    --> file chrY_CP007113v1_random.psl does not exist. Next
    --> file chrX_CP007103v1_random.psl does not exist. Next
    --> file chrY_CP007110v1_random.psl does not exist. Next
    --> file chrY_CP007114v1_random.psl does not exist. Next
    --> file chrUn_CP007094v1.psl does not exist. Next
    --> file chrUn_CP007092v1.psl does not exist. Next
    --> file chrUn_DS483705v1.psl does not exist. Next
    --> file chrX_CP007104v1_random.psl does not exist. Next
    --> file chrUn_DS483707v1.psl does not exist. Next
    --> file chrY_CP007116v1_random.psl does not exist. Next
    --> file chrUn_CP007093v1.psl does not exist. Next
    --> file chrUn_CP007095v1.psl does not exist. Next
    --> file chrUn_CP007083v1.psl does not exist. Next
    --> file chrUn_CP007101v1.psl does not exist. Next
    --> file chrUn_CP007079v1.psl does not exist. Next
    --> file chrUn_CP007086v1.psl does not exist. Next
    --> file chrUn_CP007078v1.psl does not exist. Next
    --> file chrY_CP007115v1_random.psl does not exist. Next
    --> file chrUn_DS483723v1.psl does not exist. Next
    --> file chrUn_CP007091v1.psl does not exist. Next
    --> file chrUn_CP007071v1.psl does not exist. Next
    --> file chrUn_DS483709v1.psl does not exist. Next
    --> file chrUn_CP007089v1.psl does not exist. Next
    --> file chrUn_DS483734v1.psl does not exist. Next
    --> file chrUn_DS483629v1.psl does not exist. Next
    --> file chrUn_DS483735v1.psl does not exist. Next
    --> file chrUn_DS483641v1.psl does not exist. Next
    --> file chrUn_DS483712v1.psl does not exist. Next
    --> file chrUn_DS483646v1.psl does not exist. Next
    --> file chrUn_DS483647v1.psl does not exist. Next
    --> file chrUn_DS483736v1.psl does not exist. Next
[]
# Many like this but I removed them for space

Written to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/split_psl/bundle.7.psl
DONE. Produced 8 files
PSL bundle sub-step done
Building axtChain joblist for 8 bundled psl files
Saving 8 axtChain jobs to /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/chains_joblist
Parallel manager: pushing job /users/lcalvogo/.conda/envs/llilians_env/bin/nextflow /Gilly_TOGA/make_lastz_chains/parallelization/execute_joblist.nf --joblist /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/chains_joblist -c /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/chain_run_config.nf

### Nextflow process chain_run finished successfully
An error occurred while executing chain_run: Error! No non-empty files found at /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/chain. The failed operation label is: chain_run

Hope someone can help.
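Since Nextflow reports that chain_run finished successfully while the chain directory has no non-empty files, one way to look for the underlying message is to run a single job from the saved joblist by hand and inspect its exit code and stderr. The sketch below assumes that each non-blank line of the chains_joblist file is a standalone shell command; the log does not confirm this. The helper name `run_first_job` is hypothetical and not part of make_lastz_chains.

```python
import subprocess


def run_first_job(joblist_path):
    """Run the first command of a joblist and return what it printed.

    Assumption: every non-blank line of the joblist is a complete
    shell command (not confirmed by the log above).
    """
    with open(joblist_path) as handle:
        commands = [line.strip() for line in handle if line.strip()]
    if not commands:
        return None  # an empty joblist would itself explain missing output
    result = subprocess.run(
        commands[0], shell=True, capture_output=True, text=True
    )
    return commands[0], result.returncode, result.stdout, result.stderr
```

For the run above, the joblist path taken from the log would be /Gilly_TOGA/make_lastz_chains/make_chains_Cfel_2_dme/temp_chain_run/chains_joblist. A non-zero return code or a message on stderr would point at the chaining command itself rather than at Nextflow.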

MichaelHiller commented 5 months ago

I am pretty sure that "chrUn_DS483735v1.psl does not exist" indicates that no alignments were found for this reference scaffold. Given that Drosophila and ants are separated by a huge evolutionary distance, this is not surprising (even Drosophila vs. Anopheles is already more than 1 substitution per neutral site).

However, I don't understand why it crashes with this error message. @kirilenkobm Do you know what that means?
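To see how many reference scaffolds ended up without alignments, the target.chrom.sizes file can be compared against the sorted PSL directory. This is a minimal sketch, assuming the chrom.sizes file has the usual two-column layout (name, size) and that the sorted_psl directory holds one `<scaffold>.psl` file per aligned scaffold, as the "file ... .psl does not exist" messages suggest. The function name `scaffolds_without_psl` is hypothetical.

```python
from pathlib import Path


def scaffolds_without_psl(chrom_sizes_path, sorted_psl_dir):
    """Return (name, size) for each target sequence with no <name>.psl file."""
    psl_dir = Path(sorted_psl_dir)
    missing = []
    with open(chrom_sizes_path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            name, size = fields[0], int(fields[1])
            if not (psl_dir / f"{name}.psl").is_file():
                missing.append((name, size))
    return missing
```

If only small unplaced scaffolds (chrUn_*, *_random) are missing and the main chromosome arms have PSL files, the "does not exist" messages are probably unrelated to the chain_run failure.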

LliliansCalvo commented 4 months ago

Thanks for the answer. I reinstalled and ran it again with just the example data, and I get the same chain_run error:

./make_chains.py target query test_data/test_reference.fa test_data/test_query.fa --pd test_out -f --chaining_memory 16

### Nextflow process chain_run finished successfully
An error occurred while executing chain_run: Error! No non-empty files found at /work/FAC/FBM/make_lastz_chains/test_out/temp_chain_run/chain. The failed operation label is: chain_run
Traceback (most recent call last):
  File "/work/FAC/FBM/make_lastz_chains/modules/step_manager.py", line 70, in execute_steps
    step_result = step_to_function[step](params, project_paths, step_executables)
  File "/work/FAC/FBM/make_lastz_chains/modules/pipeline_steps.py", line 64, in chain_run_step
    do_chain_run(params, project_paths, executables)
  File "/work/FAC/FBM/make_lastz_chains/steps_implementations/chain_run_step.py", line 109, in do_chain_run
    has_non_empty_file(project_paths.chain_output_dir, "chain_run")
  File "/work/FAC/FBM/make_lastz_chains/modules/common.py", line 51, in has_non_empty_file
    raise PipelineFileNotFoundError(err_msg)
modules.error_classes.PipelineFileNotFoundError: Error! No non-empty files found at /work/FAC/FBM/make_lastz_chains/test_out/temp_chain_run/chain. The failed operation label is: chain_run
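The traceback shows the error is raised by a check for non-empty files in the chain output directory. To tell apart "the directory is missing", "no files were written" and "files were written but are empty", the directory can be summarized as below. This is only an illustrative sketch and not the pipeline's own `has_non_empty_file` implementation; the name `summarize_dir` is hypothetical.

```python
from pathlib import Path


def summarize_dir(directory):
    """Report whether a directory exists and which of its files are empty."""
    path = Path(directory)
    if not path.is_dir():
        return {"exists": False, "empty": [], "non_empty": []}
    empty, non_empty = [], []
    for entry in sorted(path.iterdir()):
        if entry.is_file():
            # A zero-byte file counts as empty; anything else as non-empty.
            if entry.stat().st_size > 0:
                non_empty.append(entry.name)
            else:
                empty.append(entry.name)
    return {"exists": True, "empty": empty, "non_empty": non_empty}
```

Running it on test_out/temp_chain_run/chain and on the split_psl directory next to it would show whether the chaining jobs received input bundles but wrote nothing, or wrote only empty files.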