Do all the sample labels match across the unitig file, the phenotype file and the similarity file? In particular, is 51154_2 in all of them?
In theory we should support mismatches, and it looks like we might just need to update our pandas command to reindex, but it would be good to understand how this is failing so we can add the correct fix.
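For anyone wanting to check this quickly, here is a minimal sketch of the comparison, not pyseer code. It assumes the sample labels from each file have already been read into Python sets; the helper name `compare_labels` and every label other than 51154_2 are made up for illustration.

```python
def compare_labels(named_label_sets):
    """Return the labels shared by every file, plus the labels each file lacks.

    named_label_sets maps a file description to the set of sample labels
    found in that file (hypothetical helper, not part of pyseer).
    """
    label_sets = list(named_label_sets.values())
    shared = set.intersection(*label_sets)
    all_labels = set.union(*label_sets)
    # For each file, list the labels seen elsewhere but absent from it
    missing = {
        name: sorted(all_labels - labels)
        for name, labels in named_label_sets.items()
    }
    return sorted(shared), missing


# Illustrative labels only: here 51154_2 is absent from the distance matrix
labels = {
    "phenotypes": {"51154_2", "51154_3", "51154_4"},
    "unitigs": {"51154_2", "51154_3", "51154_4"},
    "distances": {"51154_3", "51154_4"},
}

shared, missing = compare_labels(labels)
for name, absent in missing.items():
    print(name, "is missing:", absent)
```

Any file that reports a non-empty list is the one with the mismatched labels.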
It's actually a bit confusing, because the line before the one that raises the exception defines the shared labels:
https://github.com/mgalardini/pyseer/blob/master/pyseer/input.py#L98
I guess it could mean that the distance matrix is not symmetrical? I will add a further intersection on the columns of m to catch these kinds of mistakes.
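As a rough illustration of that extra intersection (a sketch only, not the actual code in pyseer/input.py): assuming the phenotypes are a pandas Series indexed by sample label and `m` is a DataFrame holding the distance matrix, the shared labels can be restricted to those present in the phenotypes, the rows of `m` and the columns of `m`. The function name `restrict_to_shared` and the sample labels are hypothetical.

```python
import pandas as pd


def restrict_to_shared(phenotypes, m):
    """Keep only samples found in the phenotypes, the rows of m and the columns of m."""
    shared = phenotypes.index.intersection(m.index).intersection(m.columns)
    # Selecting with the same labels on both axes keeps m square
    return phenotypes.loc[shared], m.loc[shared, shared]


# Made-up example: sample "s3" has a row in m but no column
phenotypes = pd.Series([1, 0, 1], index=["s1", "s2", "s3"])
m = pd.DataFrame(
    [[0.0, 0.1], [0.1, 0.0], [0.2, 0.3]],
    index=["s1", "s2", "s3"],
    columns=["s1", "s2"],
)

p_shared, m_shared = restrict_to_shared(phenotypes, m)
print(list(p_shared.index))
print(m_shared.shape)
```

Without the second intersection, a label that appears in the rows but not the columns would survive the label check and then fail when the matrix is indexed by column.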
We've run a GWAS using unitigs generated from whole genomes using pyseer. We would like to repeat the same experiment using unitigs generated from short-read data instead of whole-genome sequences.
The original command used was:
pyseer --wg enet --phenotypes phenotypes.txt --kmers Unitigs.txt --uncompressed --distances phylogeny_distances.tsv --alpha 1 --cpu 30 --output-patterns patterns.txt > selected.txt
We've tried to repeat the experiment using unitigs generated from short-read data and with a modified phenotype file and similarity matrix, but we get the following error:
Could you please help us understand why this isn't working?
BW, Billy