Open hyanwong opened 5 months ago
E.g. `ds['variant_allele']` can list the same allele twice at a single site (such as two "C" values):
    import sgkit as sg
    import numpy as np

    ds = sg.simulate_genotype_call_dataset(n_variant=10, n_sample=4, missing_pct=0, phased=True, seed=1)
    for i, alleles in enumerate(ds['variant_allele'].values):
        print(f"Site {i}: {alleles}")
        assert len(np.unique(alleles)) == len(alleles)
Fails on site 6:
    Site 6: [b'T' b'T']
    ---------------------------------------------------------------------------
    AssertionError
This can cause much confusion in downstream analysis. See https://github.com/tskit-dev/tsinfer/issues/927
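In the meantime, a minimal sketch for listing the affected sites rather than stopping at the first failure. It assumes `variant_allele` is a 2-D array of byte strings (variants x alleles) with no padding entries, as in the output above. `find_duplicate_allele_sites` is a hypothetical helper written for illustration, not part of sgkit, and the example array is made up:

```python
import numpy as np


def find_duplicate_allele_sites(variant_allele):
    """Return indices of sites whose allele list contains a repeated value.

    Hypothetical helper (not an sgkit function). Assumes a 2-D array-like
    of shape (variants, alleles) with no empty padding alleles.
    """
    alleles = np.asarray(variant_allele)
    return [
        i
        for i, site in enumerate(alleles)
        # A site is affected when it has fewer distinct alleles than entries.
        if len(np.unique(site)) < len(site)
    ]


# Made-up example data: only the middle site repeats an allele.
example = np.array([[b"A", b"C"], [b"T", b"T"], [b"G", b"A"]])
duplicate_sites = find_duplicate_allele_sites(example)
print(duplicate_sites)
```

On the simulated dataset from the reproduction, this would be called as `find_duplicate_allele_sites(ds['variant_allele'].values)`.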