Before we switched to fixed-length string dtypes for sample/variant metadata, None was an appropriate sentinel for missing values. That no longer works with fixed-length types, so `read_plink` should use empty strings instead (the None values are currently being coerced to the string "None").
I would rather not alter the values in the PLINK `fam`/`bim` files at all, but the string "0" as a missing-value sentinel is not a convention we use anywhere else in sgkit, so it is worth coercing these to empty strings so that users can expect a uniform representation of missing values in all string arrays.
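To make the problem and the proposed coercion concrete, here is a minimal NumPy-only sketch. It is not the actual `read_plink` code: `normalize_missing` is a hypothetical helper, and the assumption that "0" is the only PLINK sentinel to map is mine.

```python
import numpy as np

# Casting an object array to a fixed-length string dtype stringifies None.
raw = np.array(["A", None, "0"], dtype=object)
coerced = raw.astype(str)  # the None entry becomes the literal string "None"


def normalize_missing(values, missing=("0",)):
    """Hypothetical helper: map None and PLINK-style sentinels to ""."""
    # Copy so the caller's array is never mutated.
    arr = np.asarray(values, dtype=object).copy()
    mask = np.array([v is None or v in missing for v in arr], dtype=bool)
    arr[mask] = ""
    # Cast only after the replacement, so None never reaches str().
    return arr.astype(str)


cleaned = normalize_missing(raw)
```

The important detail is the ordering: the sentinels are replaced while the data is still an object array, and the cast to a fixed-length dtype happens last.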