hillerlab / make_lastz_chains

Portable solution to generate genome alignment chains using lastz
MIT License

V2 pipeline errors (cat_step; with my data NOT with test_data); V1 pipeline error:pipeline crashed #51

Open vinitamehlawat opened 7 months ago

vinitamehlawat commented 7 months ago

Hi @kirilenkobm @MichaelHiller,

I downloaded https://github.com/hillerlab/make_lastz_chains/archive/refs/heads/main.zip with wget and installed all dependencies.

  1. I was able to run make_chains.py successfully on test_data and got a chained alignment.
  2. But when I tried with my own data, I got the following error after a run of almost 16 hours:

    Lastz Alignment Step

LASTZ: making jobs
LASTZ: saved 968 jobs to /home/vlamba/make_genome-chaining-Feb3/test/temp_lastz_run/lastz_joblist.txt
Parallel manager: pushing job /share/apps/bioinformatics/nextflow/20.10.0/nextflow /scrfs/storage/vlamba/home/make_lastz_chains-main/parallelization/execute_joblist.nf --joblist /home/vlamba/make_genome-chaining-Feb3/test/temp_lastz_run/lastz_joblist.txt -c /home/vlamba/make_genome-chaining-Feb3/test/temp_lastz_run/lastz_config.nf

Nextflow process lastz finished successfully

Found 8 output files from the LASTZ step
Please note that lastz_step.py does not produce output in case LASTZ could not find any alignment

Concatenating Lastz Results (Cat) Step

Concatenating LASTZ output from 8 buckets

Then I switched to V1 (make_lastz_chains-1.0.0), and with the test data I got the following error:

[d1/4aa5ff] NOTE: Process execute_jobs (334) terminated with an error exit status (127) -- Execution is retried (2)
[b5/4f2f5d] NOTE: Process execute_jobs (332) terminated with an error exit status (127) -- Execution is retried (2)
[9c/4a16b7] NOTE: Process execute_jobs (625) terminated with an error exit status (127) -- Execution is retried (3)
[c7/851fa8] NOTE: Process execute_jobs (678) terminated with an error exit status (127) -- Execution is retried (2)
[ea/b72980] NOTE: Process execute_jobs (719) terminated with an error exit status (127) -- Execution is retried (2)
Error executing process > 'execute_jobs (67)'

Caused by: Process execute_jobs (67) terminated with an error exit status (127)

Command executed:

/scrfs/storage/vlamba/home/make_lastz_chains-1.0.0/test_out2/TEMP_run.fillChain/runRepeatFiller.sh /scrfs/storage/vlamba/home/make_lastz_chains-1.0.0/test_out2/TEMP_run.fillChain/jobs/infillChain_158

Command exit status: 127

Command output: ..calling RepeatFiller:

Command error: /scrfs/storage/vlamba/home/make_lastz_chains-1.0.0/test_out2/TEMP_run.fillChain/runRepeatFiller.sh: line 12: --workdir: command not found

Work dir: /scrfs/storage/vlamba/home/make_lastz_chains-1.0.0/test_out2/TEMP_run.fillChain/fillChain_targetquery/work/21/08e1e7d44488e6d8c672adaff8864f

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

/home/vlamba/python3.14/lib/python3.9/site-packages/py_nf/py_nf.py:404: UserWarning: Nextflow pipeline fillChain_targetquery failed! Execute function returns 1.
warnings.warn(msg)
Uncaught exception from user code:
Command failed: /scrfs/storage/vlamba/home/make_lastz_chains-1.0.0/test_out2/TEMP_run.fillChain/doFillChain.sh
HgAutomate::run('/scrfs/storage/vlamba/home/make_lastz_chains-1.0.0/test_out2/...') called at /scrfs/storage/vlamba/home/make_lastz_chains-1.0.0/doLastzChains/HgRemoteScript.pm line 117
HgRemoteScript::execute('HgRemoteScript=HASH(0xc3cb78)') called at /scrfs/storage/vlamba/home/make_lastz_chains-1.0.0/doLastzChains/doLastzChain.pl line 735
main::doFillChains() called at /scrfs/storage/vlamba/home/make_lastz_chains-1.0.0/doLastzChains/HgStepManager.pm line 169
HgStepManager::execute('HgStepManager=HASH(0xc39a18)') called at /scrfs/storage/vlamba/home/make_lastz_chains-1.0.0/doLastzChains/doLastzChain.pl line 877
Error!!! Output file /scrfs/storage/vlamba/home/make_lastz_chains-1.0.0/test_out2/target.query.allfilled.chain.gz not found!
The pipeline crashed. Please contact developers by creating an issue at: https://github.com/hillerlab/make_lastz_chains

I would really appreciate any help/suggestion.

Thank you so much for your time

kirilenkobm commented 7 months ago

Hi @vinitamehlawat

Thank you for reporting this. Let's have a look.

In v1, the master script creates a temporary script, "$runDir/runRepeatFiller.sh", into which it inserts the following line:

    print $fh "$RepeatFiller --workdir $runDir --chainExtractID $chainExtractID --lastz $lastz --axtChain $axtChain --chainSort $chainSort -c \$chainf -T2 \$rseq -Q2 \$qseq $param --lastzParameters '$lastzParameters ' | $scoreChain -linearGap=$chainLinearGap $scoreChainParameters stdin \$rseq \$qseq stdout | $chainSort stdin $filledDir/\$tnamechainf.chain\n";

where

my $RepeatFiller = `which RepeatFiller.py`; chomp($RepeatFiller);

It looks like the command "which RepeatFiller.py" returned nothing, so $RepeatFiller was empty and the generated line began with "--workdir". Bash then tried to run "--workdir" as a command and failed with exit status 127 ("command not found"). No sanity check was implemented to prevent this. I believe something similar happened in v2.
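The failure mode described here, and the kind of sanity check that was missing, can be sketched in Python (the language of the v2 pipeline). This is a minimal illustration under stated assumptions, not code from make_lastz_chains: `first_token`, `require_executable`, and the `search_path` parameter are hypothetical names, and `/some/run/dir` is a placeholder path. Only `shlex.split` and `shutil.which` are standard-library calls.

```python
import shlex
import shutil
from typing import Optional


def first_token(command_line: str) -> str:
    """Return the word a POSIX shell would treat as the command name."""
    return shlex.split(command_line)[0]


def require_executable(name: str, search_path: Optional[str] = None) -> str:
    """Return the full path of `name`, or fail loudly if it cannot be found.

    search_path=None means "search the PATH environment variable".
    (Hypothetical helper, not part of make_lastz_chains.)
    """
    resolved = shutil.which(name, path=search_path)
    if resolved is None:
        raise FileNotFoundError(
            f"Required executable '{name}' was not found; "
            "install it or add its directory to PATH."
        )
    return resolved


# What effectively ends up in runRepeatFiller.sh when the lookup prints nothing:
repeat_filler = ""  # empty result of the `which` lookup
broken_line = f"{repeat_filler} --workdir /some/run/dir"
print(first_token(broken_line))  # prints: --workdir
```

Calling `require_executable("RepeatFiller.py")` before generating the job script would stop the run immediately with a readable message instead of failing hours later inside Nextflow.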

@MichaelHiller @osipovarev could you please let me and @vinitamehlawat know where we can find the script called RepeatFiller.py? I would then recommend that @vinitamehlawat place it together with the other dependencies.

I would also suggest aligning single chromosomes for tests, not whole genomes.
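For such a test, one chromosome can be copied out of each genome before running the pipeline. The following is a minimal sketch, assuming the genomes are available as plain (uncompressed) FASTA files; `extract_sequence` and the file names in the usage note are hypothetical and not part of make_lastz_chains.

```python
def extract_sequence(fasta_in: str, fasta_out: str, seq_name: str) -> bool:
    """Copy the record named `seq_name` from fasta_in to fasta_out.

    The record name is the first whitespace-delimited word after '>'.
    Returns True if the record was found, False otherwise.
    """
    found = False
    copying = False
    with open(fasta_in) as src, open(fasta_out, "w") as dst:
        for line in src:
            if line.startswith(">"):
                # A new record starts: decide whether to copy it.
                header = line[1:].split()
                copying = bool(header) and header[0] == seq_name
                found = found or copying
            if copying:
                dst.write(line)
    return found
```

For example, `extract_sequence("target.fa", "target.chr1.fa", "chr1")` would write only the `chr1` record, which keeps a trial run short enough to surface configuration problems quickly.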

osipovarev commented 7 months ago

The latest version of the RepeatFiller.py script can be found here: https://github.com/hillerlab/GenomeAlignmentTools/blob/master/src/RepeatFiller.py

Hopefully, it works with this release of make_lastz_chains!

MichaelHiller commented 7 months ago

Vinita, could you please check whether you had an older version of RepeatFiller.py on your system? And whether a clean new installation of make_chains provides the new RepeatFiller version?

It would be good to understand where this comes from. Maybe we need to fix something on our side.

Thx

MichaelHiller commented 7 months ago

@kirilenkobm Sorry, I read the comments in the wrong order. Katya already pointed to the latest version. It would be great if this pipeline could fetch this version automatically and reference it, even if other (older) RepeatFiller versions exist on the system. Thx!
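One way to make a pipeline reference its own copy of a tool even when an older copy is on PATH is to check a bundled directory first and fall back to PATH only if nothing is found there. The sketch below illustrates that lookup order only; it does not download anything, and `resolve_tool`, `bundled_dir`, and `search_path` are hypothetical names rather than part of make_lastz_chains.

```python
import os
import shutil
from typing import Optional


def resolve_tool(
    name: str, bundled_dir: str, search_path: Optional[str] = None
) -> Optional[str]:
    """Prefer the copy shipped in `bundled_dir`; otherwise search PATH.

    Returns the path to use, or None if the tool is found nowhere.
    search_path=None means "search the PATH environment variable".
    """
    bundled = os.path.join(bundled_dir, name)
    if os.path.isfile(bundled) and os.access(bundled, os.X_OK):
        return bundled
    return shutil.which(name, path=search_path)
```

With this order, an outdated RepeatFiller.py elsewhere on the system is ignored whenever the pipeline ships its own executable copy, and a `None` result can be turned into an early, explicit error.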

vinitamehlawat commented 7 months ago

Hello Dr. @MichaelHiller

I checked for RepeatFiller; I don't have any such tool on my system.