This does also highlight a general weakness with setting max_missing_an=0 as default for all datasets. In real datasets, as the number of samples gets larger, so the chance that you'll find variants with no missingness at all get smaller.
Rather it would be better to set a fractional value, i.e., to allow something like max_missing_an=0.01 which would mean keep all variants with at most 1% missing genotype calls.
E.g. with regards to
pca()
Re: PR https://github.com/malariagen/malariagen-data-python/pull/569