hillerlab / make_lastz_chains

Portable solution to generate genome alignment chains using lastz
MIT License

Stalled at doPartition #17

Closed: chenyangkang closed this issue 1 year ago

chenyangkang commented 1 year ago

Hi!

I was running the command: make_chains.py MALLARD QUERY_GENOME GCF_015476345.1_ZJU1.0_genomic.fna GCA_907165065.1_bCapEur3.1_genomic.fna --project_dir /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME --force_def --chaining_memory 200000 --seq1_chunk 50000000 --seq2_chunk 10000000

With error output:

Target path: /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/MALLARD.2bit | chrom sizes: /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/MALLARD.chrom.sizes
Query path: /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/QUERY_GENOME.2bit | query sizes: /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/QUERY_GENOME.chrom.sizes
Calling: /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/master_script.sh...
BIN: /beegfs/store4/chenyangkang/software/make_lastz_chains/doLastzChains
PARAMETERS:
    clusterRunDir /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME
    chainMinScore 1000
    chainLinearGap loose
    maxNumLastzJobs 6000
    numFillJobs 1000
    verbose 1
    debug 0
/beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/DEF looks OK!
    tDb=MALLARD
    qDb=QUERY_GENOME
    seq1dir=/beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/MALLARD.2bit
    seq2dir=/beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/QUERY_GENOME.2bit
Will run RepeatFiller. 
Will run chainCleaner with parameters: -LRfoldThreshold=2.5 -doPairs -LRfoldThresholdPairs=10 -maxPairDistance=10000 -maxSuspectScore=100000 -minBrokenChainScore=75000
max number of lastz cluster jobs: 6000
The /beegfs/store4/chenyangkang/software/make_lastz_chains/doLastzChains/doLastzChain.pl run will be performed in directory /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME. All temp files will be written there.
HgStepManager: executing from step 'partition' through step 'cleanChains'.

************ HgStepManager: executing step 'partition' Mon Jun  5 19:59:39 2023.
doPartition ....
doPartition: seq2MaxLength ....
doPartition: seq2MaxLength = 126318510
doPartition: partitionTargetCmd partitionSequence.pl 50000000 0 /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/MALLARD.2bit /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/MALLARD.chrom.sizes -xdir xdir.sh -rawDir /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/TEMP_psl 4000 -lstDir tParts > MALLARD.lst
doPartition: partitionQueryCmd partitionSequence.pl 10000000 10000 /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/QUERY_GENOME.2bit /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/QUERY_GENOME.chrom.sizes 10000 -lstDir qParts > QUERY_GENOME.lst
content of /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/TEMP_run.lastz/doPartition.bash
# chmod a+x /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/TEMP_run.lastz/doPartition.bash
# /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/TEMP_run.lastz/doPartition.bash
+ cd /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/TEMP_run.lastz
+ partitionSequence.pl 50000000 0 /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/MALLARD.2bit /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/MALLARD.chrom.sizes -xdir xdir.sh -rawDir /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/TEMP_psl 4000 -lstDir tParts
lstDir tParts must be empty, but seems to have files  (part001.lst ...)
Uncaught exception from user code:
    Command failed:
    /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/TEMP_run.lastz/doPartition.bash
    HgAutomate::run('/beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_projec...') called at /beegfs/store4/chenyangkang/software/make_lastz_chains/doLastzChains/HgRemoteScript.pm line 117
    HgRemoteScript::execute('HgRemoteScript=HASH(0x20cb858)') called at /beegfs/store4/chenyangkang/software/make_lastz_chains/doLastzChains/doLastzChain.pl line 370
    main::doPartition() called at /beegfs/store4/chenyangkang/software/make_lastz_chains/doLastzChains/HgStepManager.pm line 169
    HgStepManager::execute('HgStepManager=HASH(0x20c8e20)') called at /beegfs/store4/chenyangkang/software/make_lastz_chains/doLastzChains/doLastzChain.pl line 877
Error!!! Output file /beegfs/store4/chenyangkang/06.ebird_data/43.Phenology_project_files/data/chains/QUERY_GENOME/MALLARD.QUERY_GENOME.allfilled.chain.gz not found!
The pipeline crashed. Please contact developers by creating an issue at:
https://github.com/hillerlab/make_lastz_chains

The sequence headers in both FASTA files are quite simple:

>BCE_CHR_109
>BCE_CHR_110
>BCE_CHR_111

or

>ZJU_CHR_748
>ZJU_CHR_749
>ZJU_CHR_750

Any hint? Thanks!

Yangkang

chenyangkang commented 1 year ago

Oh, I think I found the reason. faToTwoBit needs a library (libcrypto.so.1.0.0) that I didn't have. After installing it and removing everything in my target output directory, the program works well.
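
For anyone hitting the same two symptoms (a binary that cannot load a shared library, and "lstDir tParts must be empty" on a re-run), here is a minimal pre-flight sketch. It is not part of make_lastz_chains. The function names are hypothetical, the `ldd` parsing assumes the usual Linux `name => not found` output format, and the location of `tParts` and `qParts` under `TEMP_run.lastz` is inferred from the log above rather than confirmed against the pipeline source.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def parse_ldd_missing(ldd_output: str) -> list[str]:
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Assumed line format: "\tlibcrypto.so.1.0.0 => not found"
        if "=>" in line and "not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def missing_shared_libs(binary: str) -> list[str]:
    """Run ldd (Linux only) on a binary and list its unresolved libraries."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=False
    )
    return parse_ldd_missing(result.stdout)


def non_empty_partition_dirs(
    project_dir: str, names: tuple[str, ...] = ("tParts", "qParts")
) -> list[Path]:
    """List leftover partition directories that still contain files.

    Assumption: they live under <project_dir>/TEMP_run.lastz, as the
    'cd .../TEMP_run.lastz' and '-lstDir tParts' lines in the log suggest.
    """
    run_dir = Path(project_dir) / "TEMP_run.lastz"
    leftovers = []
    for name in names:
        candidate = run_dir / name
        if candidate.is_dir() and any(candidate.iterdir()):
            leftovers.append(candidate)
    return leftovers
```

Calling `missing_shared_libs` with the path to your faToTwoBit binary should list libcrypto.so.1.0.0 if it cannot be resolved, and `non_empty_partition_dirs` with your `--project_dir` shows whether a previous failed run left files behind that need to be removed before restarting.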