Closed NullModel closed 8 months ago
I have now fixed the script to use the correct lastz_32 version, which supports large genomes of >4 GB. We had earlier tested and verified that large mammalian genomes of >4 GB work; however, the recent script did not use the updated lastz_32 version.
Given three input sequences:

```
-rw-r--r-- 1 2867378923 Oct 3 20:18 GCA_011078405.1.fa
-rw-r--r-- 1 2960276849 Oct 3 20:18 GCA_016695395.2.fa
-rw-r--r-- 1 6152963590 Oct 6 21:20 GCA_020510985.1.fa
```

This large sequence size:

```
faSize GCA_020510985.1.fa
6032307556 bases (52780824 N's 5979526732 real 3026980659 upper 2952546073 lower) in 512 sequences in 1 files
%48.95 masked total, %49.38 masked real
```

Is crashing out in lastz:
```
[Fri Oct 13 15:57:16 2023]
rule lastz:
    input: results/samples/out.fa, /private/groups/gbrowser/VGP/bigOnes/GCA_020510985.1.fa
    output: results/alignments/GCA_020510985.1.maf
    jobid: 7
    benchmark: results/benchmarks/GCA_020510985.1.lastz.txt
    reason: Missing output files: results/alignments/GCA_020510985.1.maf; Input files updated by another job: results/samples/out.fa
    wildcards: sample=GCA_020510985.1
    threads: 2
    resources: tmpdir=/data/tmp

FAILURE: in load_fasta_sequence for /private/groups/gbrowser/VGP/bigOnes/GCA_020510985.1.fa, sequence length 4,151,866,407+358,931,967 exceeds maximum (4,294,967,285)
[Fri Oct 13 15:59:45 2023]
Error in rule lastz:
    jobid: 7
    input: results/samples/out.fa, /private/groups/gbrowser/VGP/bigOnes/GCA_020510985.1.fa
    output: results/alignments/GCA_020510985.1.maf
    shell:
```
Is this a lastz_32 limitation? There is a 64-bit version.
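
For reference, the numbers in the FAILURE message can be checked with a short Python sketch. It assumes the two values in `4,151,866,407+358,931,967` are the bases already loaded plus the next sequence, which is my reading of the message rather than something the log states. The reported maximum is 11 less than 2^32, which would be consistent with a 32-bit position type. The helper names `exceeds_limit` and `fasta_total_bases` are hypothetical and not part of lastz or of the pipeline script.

```python
# Values copied from the lastz FAILURE message and the faSize output above.
LASTZ_REPORTED_MAX = 4_294_967_285   # "exceeds maximum (4,294,967,285)"
LOADED_SO_FAR = 4_151_866_407        # assumed: bases loaded before the failure
NEXT_SEQUENCE = 358_931_967          # assumed: length of the sequence being added
FASIZE_TOTAL_BASES = 6_032_307_556   # faSize total for GCA_020510985.1.fa


def exceeds_limit(total_length: int, limit: int = LASTZ_REPORTED_MAX) -> bool:
    """Return True if a combined sequence length is above the reported maximum."""
    return total_length > limit


def fasta_total_bases(lines) -> int:
    """Count sequence characters in an iterable of FASTA lines, skipping headers."""
    return sum(len(line.strip()) for line in lines if not line.startswith(">"))


attempted = LOADED_SO_FAR + NEXT_SEQUENCE

if __name__ == "__main__":
    print(f"attempted length: {attempted:,}")
    print(f"over the reported maximum by: {attempted - LASTZ_REPORTED_MAX:,}")
    print(f"whole genome over the maximum: {exceeds_limit(FASIZE_TOTAL_BASES)}")
```

A pre-check along these lines (for example, `fasta_total_bases(open(path))` on each target before the lastz rule runs) could flag genomes that need the large-genome build, though whether that belongs in the workflow is a design choice for the maintainers.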