mgalardini / pyseer

SEER, reimplemented in python 🐍🔮
http://pyseer.readthedocs.io
Apache License 2.0

k-mer annotation #270

Open beryl-au opened 1 month ago

beryl-au commented 1 month ago

Hi, we have run annotate_hits_pyseer to annotate k-mers, but we are not sure whether it reports a mapping quality for the annotated k-mers. Do you have any suggestions? Thank you in advance!

johnlees commented 1 month ago

Not in the output file, but they will appear in the bwa output (which you can direct with --tmp-prefix, but I think it will still be removed at the end of the run). You could just run bwa yourself with: bwa mem -v 1 -k 8 ref.fa kmers.fa
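As a rough illustration of where the mapping quality sits in bwa's output: bwa mem writes SAM, and in SAM the fifth tab-separated column is MAPQ. The sketch below assumes the command's output was saved to a SAM file and read in as lines; the function name `read_mapq` and the k-mer names in the example are hypothetical and are not part of pyseer.

```python
def read_mapq(sam_lines):
    """Return {query_name: (ref_name, position, mapq)} for mapped primary alignments."""
    hits = {}
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname = fields[0]
        flag = int(fields[1])
        rname = fields[2]
        pos = int(fields[3])   # 1-based leftmost mapping position
        mapq = int(fields[4])  # mapping quality (SAM column 5)
        if flag & 0x4:         # query is unmapped
            continue
        if flag & 0x900:       # secondary (0x100) or supplementary (0x800) alignment
            continue
        hits[qname] = (rname, pos, mapq)
    return hits


# Hypothetical SAM records, made up for illustration only.
example_sam = [
    "@SQ\tSN:ref1\tLN:5000",
    "kmer_1\t0\tref1\t101\t60\t20M\t*\t0\t0\tACGTACGTACGTACGTACGT\t*\tNM:i:0",
    "kmer_2\t16\tref1\t2500\t0\t20M\t*\t0\t0\tTTTTACGTACGTACGTAAAA\t*\tNM:i:1",
    "kmer_3\t4\t*\t0\t0\t*\t*\t0\t0\tGGGGGGGGGGCCCCCCCCCC\t*",
]

hits = read_mapq(example_sam)
for name, (ref, pos, mapq) in hits.items():
    print(name, ref, pos, mapq)
```

With a real file, the same function could be called as `read_mapq(open("kmers.sam"))`, where `kmers.sam` is an assumed name for the redirected bwa output.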

beryl-au commented 1 month ago

Thanks for your reply. If we use the annotation results (annotated_kmers.txt) of annotate_hits_pyseer directly for analysis, is there a mapping quality for such k-mer annotations?

johnlees commented 1 month ago

I think these are omitted in the output file, sorry. @mgalardini does your new pipeline help with this at all?

mgalardini commented 1 month ago

No, unfortunately; we are mapping k-mers back using a similar approach. @beryl-au, could you expand a bit on what you would like the mapping quality for? It could be interesting to implement.

beryl-au commented 1 month ago

Possibly something similar to what some of the literature does: restricting the analysis to reads with a minimum mapping quality of 10 or 20 to determine confident mappings?
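A minimal sketch of that kind of filter, applied to k-mer alignments instead of reads. It assumes SAM-formatted lines (for example from a separate bwa run) and is not something annotate_hits_pyseer does itself; the function name `confident_kmers`, the default threshold, and the example records are all hypothetical.

```python
def confident_kmers(sam_lines, min_mapq=20):
    """Return the names of mapped queries whose MAPQ is at least min_mapq."""
    kept = []
    for line in sam_lines:
        # Skip SAM header lines and blank lines.
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        mapq = int(fields[4])  # SAM column 5
        if flag & 0x4:         # unmapped query
            continue
        if mapq == 255:        # SAM uses 255 for "mapping quality not available"
            continue           # assumption: treat unknown quality as not confident
        if mapq >= min_mapq:
            kept.append(fields[0])
    return kept


# Hypothetical SAM records, made up for illustration only.
example = [
    "kmer_a\t0\tref1\t10\t60\t20M\t*\t0\t0\tACGTACGTACGTACGTACGT\t*",
    "kmer_b\t0\tref1\t900\t12\t20M\t*\t0\t0\tACGTTTTTACGTACGTACGT\t*",
    "kmer_c\t16\tref1\t4000\t3\t20M\t*\t0\t0\tCCCCACGTACGTACGTACGT\t*",
]

print(confident_kmers(example, min_mapq=20))
print(confident_kmers(example, min_mapq=10))
```

The threshold is left as a parameter because the comment above mentions both 10 and 20 as cutoffs used in practice.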

mgalardini commented 1 month ago

Sure, that seems like an interesting use case, although usually you would be mapping unitigs back to the same genomes they came from, and so I expect most (?) mappings to be exact matches. Worth looking into eventually, though.