lh3 / pangene

Constructing a pangenome gene graph

Genes not included in presence absence matrix #11

Closed: malearimond closed this issue 4 weeks ago

malearimond commented 2 months ago

Hello, I am using pangene to determine the presence and absence of genes across various populations of my model species. As recommended, I created a pan-gene library and used miniprot to align the protein sequences against my assemblies.

Afterwards, I ran pangene with the following parameter settings:

pangene -e 0.85 -l 0.80 -p 0 -c 40 *.paf > ${output_name}.gfa

and created a presence/absence matrix using:

pangene.js gfa2matrix ${output_name}.gfa > ${output_name}.Rtab

However, upon checking the PAF files and the corresponding PAV matrix, I noticed that some genes were not included in the matrix, even though their identity in the PAF file was greater than 85% and their aligned fraction was greater than 80%. I have attached an example of one such gene for your reference.
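For anyone who wants to repeat this check, the two values can be recomputed from the PAF columns. The following is only a minimal sketch: it assumes the standard PAF layout (column 2 = query length, columns 3-4 = query start/end, column 10 = matches, column 11 = alignment block length), and the helper names `paf_identity_coverage` and `passes` are hypothetical. pangene may compute identity and coverage differently internally, so this approximates the `-e` and `-l` filters and does not reproduce them.

```python
def paf_identity_coverage(line):
    """Return (protein name, identity, aligned fraction) for one PAF line.

    Assumes standard PAF columns; this is not pangene's own code.
    """
    f = line.rstrip("\n").split("\t")
    qname, qlen, qstart, qend = f[0], int(f[1]), int(f[2]), int(f[3])
    matches, block_len = int(f[9]), int(f[10])
    # Identity: matching positions over the alignment block length.
    identity = matches / block_len if block_len else 0.0
    # Aligned fraction: aligned part of the protein over its full length.
    coverage = (qend - qstart) / qlen if qlen else 0.0
    return qname, identity, coverage


def passes(line, min_identity=0.85, min_coverage=0.80):
    """True if the hit meets both thresholds (defaults mirror -e 0.85 -l 0.80)."""
    _, identity, coverage = paf_identity_coverage(line)
    return identity >= min_identity and coverage >= min_coverage
```

A hit that passes both thresholds here can still be missing from the matrix for other reasons, so this only rules out the identity and coverage filters as the cause.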

[Image: Screenshot 2024-09-09 150850]

(In the screenshot, the first command shows the PAF file and the second shows the matrix for the same accession.)

Could you please help me understand why this might be happening? Have I possibly overlooked something?

Thank you in advance for your assistance!

Best regards, Male

lh3 commented 4 weeks ago

There may be other genes that are mapped better at those loci.
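One way to look for such competing genes is to list, for a given protein, the hits from other proteins that overlap it on the same contig and score higher. This is a rough sketch under stated assumptions: the thread does not describe the exact rule pangene uses to resolve overlapping hits, the `AS:i:` score tag is assumed to be present in the miniprot PAF output (the code falls back to the match count in column 10 otherwise), and `parse_hit` and `better_competitors` are hypothetical helper names.

```python
def parse_hit(line):
    """Parse one PAF line into a small dict with locus and score."""
    f = line.rstrip("\n").split("\t")
    score = int(f[9])  # fallback: number of matches (column 10)
    for tag in f[12:]:
        if tag.startswith("AS:i:"):  # assumed alignment-score tag
            score = int(tag[5:])
    return {"gene": f[0], "ctg": f[5], "start": int(f[7]),
            "end": int(f[8]), "score": score}


def better_competitors(lines, gene):
    """Return (hit of `gene`, overlapping higher-scoring hit of another gene) pairs."""
    hits = [parse_hit(l) for l in lines if l.strip() and not l.startswith("#")]
    pairs = []
    for mine in hits:
        if mine["gene"] != gene:
            continue
        for other in hits:
            if other["gene"] == gene or other["ctg"] != mine["ctg"]:
                continue
            # Positive value means the two target intervals share bases.
            overlap = min(mine["end"], other["end"]) - max(mine["start"], other["start"])
            if overlap > 0 and other["score"] > mine["score"]:
                pairs.append((mine, other))
    return pairs
```

If this returns any pairs for the missing gene in the accession's PAF file, a better-mapped gene at the same locus is a plausible explanation for its absence from the matrix.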