To me it looks like the number of secondary mappings does not fit your input data. Are you sure your sequence names respect the PanSN-spec? In the meantime, please try running pggb with --n-mappings 20.
I wrote a script to edit the names:
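# seq.file lists one sample per line: maternal FASTA path, then paternal FASTA path
while read -r -a line; do
    # Sample name candidate: file name up to the first '.'
    a=$(basename "${line[0]}" | cut -d '.' -f 1)
    if [[ $a == "MF2_mat" || $a == "mat" ]]; then
        # File name is not sample-specific here, so take the third '/'-separated path field instead
        b=$(echo "${line[0]}" | cut -d '/' -f 3)
        fastix -p "${b}#1#" "${line[0]}" > "$a.mat.prefix.fa"
        fastix -p "${b}#2#" "${line[1]}" > "$a.pat.prefix.fa"
    else
        fastix -p "${a}#1#" "${line[0]}" > "$a.mat.prefix.fa"
        fastix -p "${a}#2#" "${line[1]}" > "$a.pat.prefix.fa"
    fi
done < seq.file
# Concatenate only the prefixed files, not any other *.fa in the directory
cat *.prefix.fa > input.fasta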
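For readers without fastix at hand, the effect of the prefixing step can be sketched with awk. This is only an illustration, assuming that `fastix -p` prepends the given prefix to every header name; `add_prefix` is a hypothetical helper, not part of fastix or pggb.

```shell
# Hypothetical helper: prepend a prefix such as "HG002#1#" to every
# FASTA header read from stdin; sequence lines pass through unchanged.
add_prefix() {
    awk -v p="$1" '/^>/ { print ">" p substr($0, 2); next } { print }'
}
```

Under that assumption, piping a record named chr1 through `add_prefix 'HG002#1#'` yields a header named HG002#1#chr1, which has the sample, haplotype, and contig fields that PanSN expects.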
But I found this:
Does this case cause the error?
It does not correspond to the pattern described at https://github.com/pangenome/PanSN-spec?tab=readme-ov-file#the-pattern, which uses # as the delimiter. pggb usually calculates the number of haplotypes and secondary mappings automatically, but if the sequence names do not fit the spec, things will go wrong.
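A quick way to check the names before running pggb is sketched below. It assumes the PanSN layout sample#haplotype#contig with # as the delimiter and flags any header whose name does not split into exactly three non-empty fields. The function names `check_pansn` and `count_haplotypes` are hypothetical helpers, not part of pggb.

```shell
# List header names that do not look like sample#haplotype#contig.
# Only the first whitespace-separated token of each header is checked.
check_pansn() {
    grep '^>' "$1" | sed 's/^>//' | awk '{ print $1 }' \
        | grep -Ev '^[^#]+#[^#]+#[^#]+$'
}

# Count distinct sample#haplotype prefixes, i.e. the number of haplotypes
# that can be derived from the names. Names without '#' are skipped (-s).
count_haplotypes() {
    grep '^>' "$1" | sed 's/^>//' | cut -s -d '#' -f 1,2 \
        | sort -u | wc -l | tr -d ' '
}
```

For example, `check_pansn input.fasta` should print nothing when every name fits the pattern, and `count_haplotypes input.fasta` should report the number of haplotypes you expect in the input.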
Thanks very much, I will try again.
I installed pggb via Singularity and need to construct a pangenome from 21 human haplotypes. Here is the error:
The version of pggb is 6ffe7f9 and wfmash is v0.8.4-5-ge7850f9.
Best wishes!