Open julibeg opened 3 years ago
Right, I guess this can happen if the frequency filtering is turned off. I added the 'no observations' warning because when the sample labels are mismatched, every variant triggers it, which makes it obvious that something is wrong. Without the warning, and with typical MAF filters, the run would just go through and ignore every variant, leaving you with an empty output.
We should probably skip these all-0/all-1 variants rather than leaving the allele filtering to sort them out, though. I think we do that in the elastic net anyway.
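For illustration, skipping variants that are all 0 or all 1 could look roughly like the sketch below. This is not pyseer's code: `is_invariant`, `drop_invariant_variants` and the dictionary layout are hypothetical names made up for the example.

```python
def is_invariant(calls):
    """Return True if a 0/1 presence vector is all 0 or all 1 (or empty)."""
    return len(set(calls)) <= 1


def drop_invariant_variants(variants):
    """Keep only variants observed in some, but not all, samples.

    `variants` maps a variant name to its list of 0/1 calls, one per sample.
    This layout is an assumption for the sketch, not pyseer's internal format.
    """
    return {name: calls for name, calls in variants.items()
            if not is_invariant(calls)}


variants = {
    "geneA": [0, 0, 0, 0],  # never observed: skipped
    "geneB": [1, 0, 1, 0],  # varies between samples: kept
    "geneC": [1, 1, 1, 1],  # observed in every sample: skipped
}
kept = drop_invariant_variants(variants)
print(sorted(kept))
```

A check like this does not depend on the MAF thresholds, so it would still apply when frequency filtering is turned off.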
Ah, I didn't realise that this usually wouldn't happen due to the MAF filters. Thanks for clarifying!
When fitting an elastic net on .Rtab files that have columns with only zeros, pyseer gives the warning `No observations of [variant] in selected samples`, but the program completes. Therefore, I assumed it would simply ignore those variants. However, when running SEER, `scipy.stats.chi2_contingency()` throws `ValueError: The internally computed table of expected frequencies has a zero element at (0, 0)`.

Even though it is relatively obvious in hindsight, at first glance it was not clear to me that those variants caused the error and that I should remove them from the input. In case this behaviour (i.e. throwing an error and not filtering those variants internally) is intended, could the warning be changed to state more explicitly that such variants are not allowed?
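To make the cause of the error concrete, here is a minimal sketch of how the expected frequencies of a contingency table come out as zero when a variant is never observed. The table layout (rows: variant present/absent, columns: phenotype 1/0) and the counts are assumptions for illustration, and `expected_frequencies` is a hypothetical helper, not a pyseer or SciPy function.

```python
def expected_frequencies(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Assumed layout: rows = variant present / absent, columns = phenotype 1 / 0.
# The variant is absent from all 20 samples, so the first row is all zeros.
never_observed = [[0, 0],
                  [12, 8]]

expected = expected_frequencies(never_observed)
print(expected)  # [[0.0, 0.0], [12.0, 8.0]]
```

Because the "variant present" row sums to zero, every expected count in that row is zero, including the one at (0, 0). The chi-squared statistic divides by the expected counts, which is why the test cannot be computed for such a variant.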