pangenome / pggb

the pangenome graph builder
https://doi.org/10.1101/2023.04.05.535718
MIT License

Command terminated by signal 6 #374

Closed: yeeus closed this issue 4 months ago

yeeus commented 5 months ago

I installed pggb via Singularity and need to construct a pangenome from 21 human haplotypes. Here is the error:

[mashmap] Skipping self mappings for single file all-vs-all mapping.
terminate called after throwing an instance of 'std::invalid_argument'
  what():  stoi
Command terminated by signal 6
wfmash -s 5000 -l 25000 -p 90 -n 1 -k 19 -H 0.001 -Y # -t 50 --tmp-base /path/to/pggb /path/to/pggb/input.fasta --lower-triangular --hg-filter-ani-diff 30 -i /path/to/pggb/input.fasta.bf3285f.mappings.wfmash.paf --invert-filtering
0.24s user 0.05s system 43% cpu 0.66s total 45112Kb max memory

The version of pggb is 6ffe7f9 and wfmash is v0.8.4-5-ge7850f9.

Best wishes!

subwaystation commented 5 months ago

To me it looks like the number of secondary mappings does not fit your input data. Are you sure your sequence names respect the PanSN-spec? In the meantime, please try running pggb with --n-mappings 20.

yeeus commented 5 months ago

I wrote a script to edit the names:

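# seq.file appears to list one sample per line: maternal FASTA path, then paternal FASTA path
while read -r -a line; do
    # Sample name = file name up to the first dot
    a=$(basename "${line[0]}" | cut -d '.' -f 1)
    if [[ $a == "MF2_mat" || $a == "mat" ]]; then
        # Generic file name: take the third '/'-separated path field as the sample name
        b=$(echo "${line[0]}" | cut -d '/' -f 3)
        fastix -p "${b}#1#" "${line[0]}" > "$a.mat.prefix.fa"
        fastix -p "${b}#2#" "${line[1]}" > "$a.pat.prefix.fa"
    else
        fastix -p "${a}#1#" "${line[0]}" > "$a.mat.prefix.fa"
        fastix -p "${a}#2#" "${line[1]}" > "$a.pat.prefix.fa"
    fi
done < seq.file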

cat *.fa > input.fasta

But I found this: [image] Does this cause the error?

subwaystation commented 5 months ago

Your naming does not correspond to the PanSN-spec pattern (https://github.com/pangenome/PanSN-spec?tab=readme-ov-file#the-pattern), which uses # as the delimiter. pggb usually calculates the number of haplotypes and secondary mappings automatically, but if the sequence names do not fit the spec, things will go wrong.
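
For readers who want to check their own input before running pggb, here is a minimal sketch of such a check. The function name check_pansn is hypothetical and not part of pggb or wfmash. It assumes a PanSN name has exactly three non-empty fields separated by # (sample#haplotype#contig) and that the FASTA file is uncompressed. It prints the number of distinct sample#haplotype prefixes and reports every header that does not fit to stderr.

```shell
# Hypothetical helper, not part of pggb: report FASTA headers that do not look
# like PanSN names (sample#haplotype#contig) and count distinct haplotypes.
check_pansn() {
    # $1: path to an uncompressed FASTA file
    awk '
        /^>/ {
            name = substr($1, 2)          # sequence ID without the leading ">"
            n = split(name, f, "#")
            # Assumption: exactly three non-empty "#"-separated fields
            if (n != 3 || f[1] == "" || f[2] == "" || f[3] == "") {
                print "non-PanSN name: " name > "/dev/stderr"
                bad++
                next
            }
            haps[f[1] "#" f[2]] = 1       # one entry per sample#haplotype
        }
        END {
            for (h in haps) count++
            print count + 0               # number of distinct haplotypes
            exit (bad > 0)                # non-zero status if any name failed
        }
    ' "$1"
}
```

Running check_pansn input.fasta on the concatenated file should print 21 for the data set described above if every name fits the pattern. The suggested --n-mappings 20 would then be one less than the haplotype count, though that relationship is only inferred from this thread.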

yeeus commented 5 months ago

Thanks very much, I will try again.