Open eatyourpeas opened 1 week ago
@eatyourpeas do you know how I can recreate this? Tried using:
python manage.py create_csv \
--pts=30 \
--visits="CDCD DHPC ACDC CDCD" \
--hb_target=T \
--age_range=11_15 \
--build \
&& python manage.py create_csv \
--pts=30 \
--visits="CDCCD DDCC CACC" \
--hb_target=A \
--age_range=16_19 \
--build \
&& python manage.py create_csv \
--pts=30 \
--visits="CDC ACDC CDCD" \
--hb_target=T \
--age_range=0_4 \
--build \
&& python manage.py create_csv \
--coalesce
And then the distribution of the Stated gender column is:
df[['Stated gender']].value_counts()
Stated gender
2 431
0 347
9 326
1 296
Name: count, dtype: int64
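A rough way to check whether these counts fit uniform allocation is a Pearson chi-square statistic against equal expected counts. This is a minimal sketch using only the four counts printed above. It treats every row as an independent draw, which may not hold: if sex is assigned once per patient and repeated on each visit row, the row counts are clustered, and it would be fairer to deduplicate by a patient identifier column first (the name of that column is not known here).

```python
# Counts copied from the value_counts() output above, keyed by code.
observed = {2: 431, 0: 347, 9: 326, 1: 296}


def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected counts."""
    total = sum(counts.values())
    expected = total / len(counts)
    return sum((n - expected) ** 2 / expected for n in counts.values())


stat = chi_square_uniform(observed)

# 7.815 is the 5% critical value of the chi-square distribution
# with 3 degrees of freedom (4 categories minus 1).
CRITICAL_5PCT_DF3 = 7.815

print(f"chi-square = {stat:.2f}, exceeds 5% critical value: {stat > CRITICAL_5PCT_DF3}")
```

With 1400 rows the expected count per code is 350, so the surplus of code 2 dominates the statistic. Under the independence assumption above, the statistic is well past the critical value.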
The csv creation function is great and generates large dummy csvs very quickly. It does seem as though sex is not being randomly allocated across the SEX_TYPE choices. This is complicated by the fact that this is also an optional parameter to the function.
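For illustration only, here is one way an optional sex argument could fall back to a uniform random choice. This is a sketch under stated assumptions, not the project's code: `pick_sex` is a hypothetical helper, and the `SEX_TYPE` tuple below is a stand-in that uses the codes seen in the output above with assumed labels.

```python
import random

# Hypothetical stand-in for the project's SEX_TYPE choices: the codes are
# the ones seen in the value_counts() output, the labels are assumed.
SEX_TYPE = ((0, "Not known"), (1, "Male"), (2, "Female"), (9, "Not specified"))


def pick_sex(sex=None, rng=random):
    """Return the caller's value if one was given, else a uniformly random code."""
    # Compare against None rather than relying on truthiness:
    # 0 is a valid code and must not be treated as "not provided".
    if sex is not None:
        return sex
    return rng.choice([code for code, _label in SEX_TYPE])


# Usage: an explicit value is kept, an omitted value is drawn at random.
print(pick_sex(0))
print(pick_sex(rng=random.Random(42)))
```

The `is not None` check matters because one of the codes is 0. A truthiness test such as `if sex:` would silently discard an explicit 0 and draw a random value instead.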