mgalardini / pyseer

SEER, reimplemented in python 🐍🔮
http://pyseer.readthedocs.io
Apache License 2.0

pyseer Manhattan plot can't be seen in Phandango #218

Closed: rachel1898 closed this issue 1 year ago

rachel1898 commented 1 year ago

Hello! I recently ran an SNP association with the fixed effects model for a resistance phenotype. After running the command to create the .plot file and dropping it into Phandango, no plot is shown: just the two axes, without any data. Phandango does open the reference file in .gff format (the same one I used to run snippy), but not the file with the results. In my study, the SNP variants are on different chromosomes (CHR), and they are all together in the .txt output file from pyseer. Could that be causing the problem? Has anybody had the same problem?

mgalardini commented 1 year ago

Could you share the files you are using in phandango please?

rachel1898 commented 1 year ago

input_phandango_files.zip

Here are the files I used. Thank you very much for your help.

mgalardini commented 1 year ago

There's a mismatch between the chromosome name in your .plot file and the GFF file:

.plot

#CHR    SNP     BP      minLOG10(P)     log10(p)        r^2
6       .       339     0.242604        0.242604        0
6       .       445     0.093665        0.093665        0
6       .       446     0       0       0
6       .       448     0       0       0

Gff file:

##gff-version 3
##sequence-region gnl|Prokka|BOBHOCID_1 1 6264404
gnl|Prokka|BOBHOCID_1   Prodigal:2.6    CDS     483     2027    .       +       0       ID=BOBHOCID_00001;Parent=BOBHOCID_00001_gene;Name=dnaA;db_xref=COG:COG0593;gene=dnaA;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P03004;locus_tag=BOBHOCID_00001;product=Chromosomal replication initiator protein DnaA;protein_id=gnl|Prokka|BOBHOCID_00001
gnl|Prokka|BOBHOCID_1   prokka  gene    483     2027    .       +       .       ID=BOBHOCID_00001_gene;Name=dnaA;gene=dnaA;locus_tag=BOBHOCID_00001
gnl|Prokka|BOBHOCID_1   Prodigal:2.6    CDS     2056    3159    .       +       0       ID=BOBHOCID_00002;Parent=BOBHOCID_00002_gene;Name=dnaN;db_xref=COG:COG0592;gene=dnaN;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:Q9I7C4;locus_tag=BOBHOCID_00002;product=Beta sliding clamp;protein_id=gnl|Prokka|BOBHOCID_00002
gnl|Prokka|BOBHOCID_1   prokka  gene    2056    3159    .       +       .       ID=BOBHOCID_00002_gene;Name=dnaN;gene=dnaN;locus_tag=BOBHOCID_00002
gnl|Prokka|BOBHOCID_1   Prodigal:2.6    CDS     3169    4278    .       +       0       ID=BOBHOCID_00003;Parent=BOBHOCID_00003_gene;Name=recF;db_xref=COG:COG1195;gene=recF;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P0A7H0;locus_tag=BOBHOCID_00003;product=DNA replication and repair protein RecF;protein_id=gnl|Prokka|BOBHOCID_00003

In the .plot file you have numbers as chromosome IDs (e.g. 6), while the GFF file has strings (e.g. gnl|Prokka|BOBHOCID_1). If there's no match, I don't think anything would be displayed.
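
A quick way to spot this kind of mismatch before loading the files into Phandango is to compare the first column of the .plot file with the sequence IDs in the first column of the GFF. The snippet below is only a minimal sketch, not part of pyseer: the function names (`plot_chromosomes`, `gff_seqids`, `unmatched_chromosomes`) are made up for illustration, and it assumes both files are whitespace/tab-delimited with comment lines starting with `#`.

```python
def plot_chromosomes(plot_lines):
    """Collect the distinct values of the first (#CHR) column of a .plot file."""
    names = set()
    for line in plot_lines:
        line = line.strip()
        # Skip blank lines and the '#CHR ...' header line
        if not line or line.startswith("#"):
            continue
        names.add(line.split(None, 1)[0])
    return names


def gff_seqids(gff_lines):
    """Collect the distinct sequence IDs (first column) of a GFF3 file."""
    names = set()
    for line in gff_lines:
        line = line.strip()
        # Stop at the embedded sequence section, if the GFF has one
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives ('##...') and comments ('#...')
        if not line or line.startswith("#"):
            continue
        names.add(line.split(None, 1)[0])
    return names


def unmatched_chromosomes(plot_lines, gff_lines):
    """Return the .plot chromosome names that have no counterpart in the GFF."""
    return sorted(plot_chromosomes(plot_lines) - gff_seqids(gff_lines))


# Example using a few lines taken from the files in this issue
plot = [
    "#CHR\tSNP\tBP\tminLOG10(P)\tlog10(p)\tr^2",
    "6\t.\t339\t0.242604\t0.242604\t0",
    "6\t.\t445\t0.093665\t0.093665\t0",
]
gff = [
    "##gff-version 3",
    "##sequence-region gnl|Prokka|BOBHOCID_1 1 6264404",
    "gnl|Prokka|BOBHOCID_1\tprokka\tgene\t483\t2027\t.\t+\t.\tID=BOBHOCID_00001_gene",
]
print(unmatched_chromosomes(plot, gff))  # ['6']
```

With real files you would pass open file handles instead of lists, e.g. `unmatched_chromosomes(open("results.plot"), open("reference.gff"))` (both file names are placeholders). An empty list means every chromosome name in the .plot file also appears in the GFF.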

rachel1898 commented 1 year ago

You're right, that solved the problem. Thank you very much!!