Open wvictor14 opened 11 months ago
Hey Iciar, so the "ambiguous" class is for samples with uncertain predictions, where "uncertain" is defined by a probability cutoff (75% by default). In my paper I show that samples below this threshold correlate well with mixed genetic ancestry across the three reference ancestries. Because of this, I think "ambiguous" would be better renamed to something like "mixed".
The ethnicity predictor has no way of telling whether the queried data comes from an ancestry outside the 3 used in the training data, so I think calling it "other" would assume too much and would sometimes just be wrong.
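To make the cutoff logic concrete, here is a rough sketch in Python. It is not the package's actual code: the function name `label_prediction`, the `fallback` argument, and the placeholder class names are all made up for illustration, and it assumes a sample exactly at the cutoff keeps its predicted class.

```python
def label_prediction(probs, threshold=0.75, fallback="mixed"):
    """Return the most probable class, or `fallback` when the top
    probability is below `threshold`.

    `probs` maps each reference ancestry (hypothetical names here) to its
    predicted probability. The three probabilities always describe the
    reference ancestries only, so a low maximum cannot distinguish
    "mixture of the three" from "none of the three".
    """
    best = max(probs, key=probs.get)  # class with the highest probability
    return best if probs[best] >= threshold else fallback


# Confident call: the top probability clears the 75% cutoff.
confident = label_prediction(
    {"ancestry_A": 0.90, "ancestry_B": 0.06, "ancestry_C": 0.04}
)

# Uncertain call: no class reaches the cutoff, so the fallback label is used.
uncertain = label_prediction(
    {"ancestry_A": 0.50, "ancestry_B": 0.40, "ancestry_C": 0.10}
)
```

With these inputs, `confident` is `"ancestry_A"` and `uncertain` is `"mixed"`. Renaming the class only changes the `fallback` string; the cutoff itself stays the same.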
That makes sense to me! I think "mixed" is an improvement over "ambiguous" anyway.
Originally posted by @iciarfernandez in https://github.com/wvictor14/planet/issues/19#issuecomment-1671927657