Open julibeg opened 3 years ago
I think this is due to the fact that we do not really expect missing values in an .Rtab file, whereas they can be quite common in vcf files. I think we could implement your proposed change to be more consistent.
Just out of curiosity, did the .Rtab file you were using come out of panaroo/roary? If so, I was not aware that it could contain missing values.
Makes sense.
No, it was a custom .Rtab file.
Ok, that makes sense. If you would like to open a PR, we could merge this change. If you know how to add unit tests, that would also be great. If not, I can do it once the change is merged.
will do
Missing genotypes in variant files are ignored: https://github.com/mgalardini/pyseer/blob/2e27979568ee34f02d000ca3011002b9d399fb38/pyseer/input.py#L485-L486
However, in .Rtab files they are treated as missing data and the fit fails later on: https://github.com/mgalardini/pyseer/blob/2e27979568ee34f02d000ca3011002b9d399fb38/pyseer/input.py#L423-L424
Is this intended? For now I have replaced
d[sample] = np.nan
with
continue
to also get a fit for genes with a few missing entries.
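To make the difference between the two behaviours concrete, here is a minimal standalone sketch. It is not pyseer's actual code: `parse_rtab_row` and its `skip_missing` flag are hypothetical names, and it assumes that a "missing" entry is any field that cannot be parsed as a number (an empty field in this example). It uses `math.nan` in place of `np.nan` so that it runs without NumPy.

```python
import math


def parse_rtab_row(samples, values, skip_missing=True):
    """Hypothetical helper: map sample names to presence/absence values.

    Assumption: a missing entry is any field that float() cannot parse.
    """
    d = {}
    for sample, value in zip(samples, values):
        try:
            d[sample] = float(value)
        except ValueError:
            if skip_missing:
                # Proposed behaviour: drop the sample, as for variant files.
                continue
            # Behaviour described in the issue: keep the sample as NaN.
            d[sample] = math.nan
    return d


samples = ["s1", "s2", "s3"]
values = ["1", "", "0"]  # s2 has a missing entry

skipped = parse_rtab_row(samples, values, skip_missing=True)
kept = parse_rtab_row(samples, values, skip_missing=False)
```

With `skip_missing=True` the sample with the missing entry is simply absent from the result, so a downstream fit only sees samples with observed values. With `skip_missing=False` the sample stays in the result with a NaN value, which is the case that makes the fit fail later on.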