mgalardini / pyseer

SEER, reimplemented in python 🐍🔮
http://pyseer.readthedocs.io
Apache License 2.0

annotation criteria when matching reference #240

Closed snackens closed 1 year ago

snackens commented 1 year ago

Hello, I tried to perform unitig annotation using annotate_hits_pyseer.

The pyseer tutorial says that annotations marked ref can match partially between the k-mer and the reference sequence, while annotations marked draft require an exact match.

Are there any criteria for matching against a reference?

In my case, for example, a unitig of length 30 was not matched to the reference even though only 1 of its 30 nucleotides differed from the reference sequence.

Thank you in advance.

mgalardini commented 1 year ago

We use bwa mem and bwa fastmap to run the mapping of k-mers to the genomes, with the following flags:

You could run these commands directly to check whether we have introduced a filter that leads to your unexpected results.

Hope this helps
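
For anyone who wants to try this, here is a minimal Python sketch of running bwa by hand. The flags pyseer passes are not reproduced in this thread, so the sketch uses bwa's default settings as a placeholder; results may differ from pyseer's until the real flags are added. The function names and the file names `ref.fa` and `unitigs.fa` are hypothetical.

```python
import shutil
import subprocess


def bwa_commands(reference_fasta, query_fasta):
    """Build bwa command lines with default settings.

    Assumption: no extra flags are passed; pyseer's own flags are not
    shown in this thread and would need to be added to each list.
    """
    ref = str(reference_fasta)
    query = str(query_fasta)
    return {
        "index": ["bwa", "index", ref],             # build the index once
        "mem": ["bwa", "mem", ref, query],          # local alignment, SAM on stdout
        "fastmap": ["bwa", "fastmap", ref, query],  # exact (super-maximal) matches
    }


def run_bwa(reference_fasta, query_fasta):
    """Run index, mem and fastmap; return the stdout of mem and fastmap."""
    if shutil.which("bwa") is None:
        raise RuntimeError("bwa not found on PATH")
    cmds = bwa_commands(reference_fasta, query_fasta)
    subprocess.run(cmds["index"], check=True, capture_output=True)
    outputs = {}
    for name in ("mem", "fastmap"):
        result = subprocess.run(
            cmds[name], check=True, capture_output=True, text=True
        )
        outputs[name] = result.stdout
    return outputs
```

Calling `run_bwa("ref.fa", "unitigs.fa")` and comparing the two outputs for a single unitig shows whether the aligner itself drops the hit or whether it is filtered later.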

snackens commented 1 year ago

Thank you so much.

snackens commented 1 year ago

Sorry to come back to this after closing. I tried to find the locations of the unitigs after annotation. 44 significant unitigs were annotated to one gene. When I looked up their locations, some of them were on the negative strand. Almost all of the unitigs matched the gene sequence (.ffn), but some matched only in reverse. How should I interpret this?

Thank you in advance.
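
One likely explanation, offered as an assumption since the thread does not state the resolution: unitigs are counted without regard to strand, so a unitig can be reported as the reverse complement of the gene sequence in the .ffn file, which is written along the coding strand. A minimal Python sketch for checking both orientations follows; the function names and the toy sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def locate(unitig, target):
    """Find the first exact match of a unitig in a target sequence.

    Returns (strand, start) with a 0-based start on the target's forward
    strand, or None if neither orientation matches exactly.
    """
    target_upper = target.upper()
    pos = target_upper.find(unitig.upper())
    if pos != -1:
        return ("+", pos)
    pos = target_upper.find(reverse_complement(unitig).upper())
    if pos != -1:
        return ("-", pos)
    return None


gene = "ATGGCCATTGTAATGGGCCGC"      # toy gene sequence, not real data
print(locate("GCCATTG", gene))      # ('+', 3)
print(locate("CATTACAA", gene))     # ('-', 7)
print(locate("AAAAAAAA", gene))     # None
```

A unitig reported on the "-" strand still lies inside the gene; it is simply written in the opposite orientation.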

snackens commented 1 year ago

Oh, I understood! I'll close again.