mgalardini / pyseer

SEER, reimplemented in python 🐍🔮
http://pyseer.readthedocs.io
Apache License 2.0

mapping snps #185

Closed annarerra closed 2 years ago

annarerra commented 2 years ago

Hello,

Is there a way to "map" or "annotate" the SNPs reported by pyseer in terms of their position in the different contigs, as is done for the unitigs?

Thank you in advance :)

johnlees commented 2 years ago

I'm not sure exactly what you mean, but I don't think so. If you are using a VCF as input, the variant ID in the output should already include the contig and position.
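A minimal sketch of pulling the contig and position back out of such an ID. It assumes VCF-derived variant names follow a `CONTIG_POS_REF_ALT` pattern with a single alternative allele; that pattern is an assumption here, so check it against your own output. `parse_variant_id` is a hypothetical helper, not part of pyseer.

```python
def parse_variant_id(variant_id):
    """Split an ID of the assumed form CONTIG_POS_REF_ALT into its parts."""
    # Split from the right so that contig names containing underscores
    # (e.g. "NZ_CP0001") are kept in one piece.
    contig, pos, ref, alt = variant_id.rsplit("_", 3)
    return contig, int(pos), ref, alt


# Example with a made-up ID (not taken from real pyseer output)
example = parse_variant_id("NZ_CP0001_1234_A_T")
print(example)
```

IDs that do not match the assumed pattern (for instance multi-allelic sites) would need separate handling.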

annarerra commented 2 years ago

For the unitigs, for example, after annotation every unitig has a column with this contig:position;upstream;in;downstream information.

But for SNPs there is no such "annotation" step, and we only have this information: [image]

johnlees commented 2 years ago

What data and file are you using as input? What command are you running here?

annarerra commented 2 years ago

input: phenotypes + gene presence/absence from Roary + model

command: pyseer --lmm --phenotypes {input.pheno} --pres {input.presence} --load-lmm {input.model} --cpu {threads} --output-patterns {output.patterns} --min-af 0.02 --max-af 0.98 > {output.results}

johnlees commented 2 years ago

So these are genes then, not SNPs?

annarerra commented 2 years ago

Yeah, you are right, I am sorry. I had genes in mind but was talking about SNPs :/

But my question is still about the position.

annarerra commented 2 years ago

For example, if I want to make a Manhattan plot without using phandango, I would need positions in the genome or in a contig. This is probably not provided by pyseer, so I should do it externally.

johnlees commented 2 years ago

Ah in that case, no, I'm sorry but we don't support these. Gene order varies from genome to genome, so it wouldn't be possible to output a simple summary of nearby genes. I'd recommend panaroo for that kind of analysis.
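For a Manhattan-style plot made outside pyseer, one possible workaround is to borrow coordinates from a single reference annotation. The sketch below is only that: it assumes the results file is tab-separated with `variant` and `lrt-pvalue` columns, that the gene names in it match the `gene` (or `Name`) attribute of CDS features in a GFF3 file for one chosen reference genome, and that positions from that one genome are acceptable despite gene order varying between genomes. The function names and the example data are hypothetical.

```python
import csv
import io
import math


def gene_positions_from_gff(gff_lines):
    """Map gene name -> (contig, start) using CDS features of one reference GFF3."""
    positions = {}
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # sequence section follows; no more features
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "CDS":
            continue
        attrs = dict(item.split("=", 1) for item in fields[8].split(";") if "=" in item)
        gene = attrs.get("gene") or attrs.get("Name")
        if gene:
            # Keep the first occurrence if a gene name appears more than once
            positions.setdefault(gene, (fields[0], int(fields[3])))
    return positions


def manhattan_points(results_file, positions):
    """Return (contig, start, -log10 p) for every gene found in the reference."""
    points = []
    for row in csv.DictReader(results_file, delimiter="\t"):
        hit = positions.get(row["variant"])
        p_value = float(row["lrt-pvalue"])
        if hit is None or p_value <= 0:
            continue  # gene absent from the reference, or p-value not plottable
        points.append((hit[0], hit[1], -math.log10(p_value)))
    return points


# Made-up example data, for illustration only
gff = [
    "##gff-version 3\n",
    "contig1\tprokka\tCDS\t100\t400\t.\t+\t0\tID=abc;gene=gyrA\n",
]
results = io.StringIO("variant\tlrt-pvalue\ngyrA\t0.001\ngroup_1\t0.5\n")
points = manhattan_points(results, gene_positions_from_gff(gff))
print(points)
```

Genes missing from the chosen reference (such as `group_1` in the example) are dropped, so accessory genes may not appear on the plot.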