Closed pavlo888 closed 3 years ago
A few thoughts: 1) This can happen when the phenotype is exactly correlated with the population structure - check and confirm whether that is the case. 2) 22 samples is very small for a GWAS, and it may not be possible to fit the model. It also looks like you don't have all of the 42 samples in the population structure matrix. 3) Try the LMM instead, as detailed in the best practices.
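For point 1, a rough way to check is to see whether any single population-structure component splits cases from controls with no overlap. The sketch below assumes you have already loaded the phenotypes and the projected components into plain Python dicts keyed by sample name. The function name and input layout are made up for illustration and are not part of pyseer. It only detects separation along one component at a time, so an empty result does not rule out separation by a combination of components.

```python
def separating_components(phenotypes, components):
    """Return indices of components that perfectly split cases from controls.

    phenotypes: dict mapping sample name -> 0 or 1 (hypothetical layout)
    components: dict mapping sample name -> list of floats, one per dimension
    """
    # Only samples present in both inputs can be compared.
    shared = sorted(set(phenotypes) & set(components))
    cases = [s for s in shared if phenotypes[s] == 1]
    controls = [s for s in shared if phenotypes[s] == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control among shared samples")

    n_dims = len(components[shared[0]])
    separating = []
    for dim in range(n_dims):
        case_vals = [components[s][dim] for s in cases]
        control_vals = [components[s][dim] for s in controls]
        # No overlap between the two ranges means this component alone
        # predicts the phenotype perfectly.
        if max(case_vals) < min(control_vals) or max(control_vals) < min(case_vals):
            separating.append(dim)
    return separating
```

If this returns a non-empty list, the phenotype is confounded with the structure along that component, which is consistent with the "Perfectly separable data" error.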
Hi @johnlees
I think the issue might be that the phenotype is correlated with the population structure, since the phenotype for each genome is also its genomospecies identity. I was trying to replicate the analysis conducted by Gori et al. 2020 https://mbio.asm.org/content/11/3/e00728-20/article-info, who identified specific genes for each lineage of interest.
I assume then that I cannot use pyseer for this purpose?
Cheers, Pablo
Ah I see, in that case you might want to try with the --no-distances option and remove the structure:
pyseer --phenotypes phenotype-list.pheno --pres pangenome-w-annot-ref/gene_presence_absence.Rtab --no-distances > genomospecies9_COGs.txt
See also https://pyseer.readthedocs.io/en/master/usage.html?#no-population-structure-correction
It seems to work, but then I get a blank file as output. Is there anything else I could try? I have also tried to follow the SNP and k-mer tutorials, but I get errors on those too.
Could you paste the full command and output to the terminal here? Can you also double check that the sample names match in the phenotype file and COG file?
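To check the sample names, something like the sketch below might help. It assumes the phenotype file is tab-separated with a header row and sample names in the first column, and that the Rtab file lists sample names in its header row after a first gene column. The function names are my own for illustration and not part of pyseer.

```python
import csv


def phenotype_samples(path):
    """Sample names from the first column of a tab-separated phenotype file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row
        return {row[0] for row in reader if row}


def rtab_samples(path):
    """Sample names from the header row of an Rtab file (first column is the gene)."""
    with open(path, newline="") as handle:
        header = next(csv.reader(handle, delimiter="\t"))
    return set(header[1:])


def compare_samples(pheno_path, rtab_path):
    """Report which sample names are shared and which appear in only one file."""
    pheno = phenotype_samples(pheno_path)
    rtab = rtab_samples(rtab_path)
    return {
        "shared": pheno & rtab,
        "only_in_phenotypes": pheno - rtab,
        "only_in_rtab": rtab - pheno,
    }
```

Calling compare_samples("phenotype-list.pheno", "pangenome-w-annot-ref/gene_presence_absence.Rtab") and printing the result would show any names that differ, for example because of a file extension or trailing whitespace in one file.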
Closing for lack of follow-up messages
Hi,
I am trying to run the following command:
pyseer --phenotypes phenotype-list.pheno --pres pangenome-w-annot-ref/gene_presence_absence.Rtab --distances mash.tsv --save-m mash_mds --max-dimensions 4 > genomospecies9_COGs.txt
But then I get the following error:
Read 42 phenotypes
Detected binary phenotype
Structure matrix has dimension (25, 25)
Analysing 22 samples found in both phenotype and structure matrix
Perfectly separable data error for null model
Could not fit null model, exiting
Do you have any idea of what could be the problem?