zalando / expan

Open-source Python library for statistical analysis of randomised control trials (A/B tests)
MIT License

Possible to clarify what should be included in "data"? #257

Open pjatx opened 2 years ago

pjatx commented 2 years ago

Hey There,

I'm currently trying to push some Google Optimize data through this library and am struggling a bit with how to format the dataframe.

Below is what a CSV export looks like. Is this data sufficient to create a dataframe that expan will accept? If so, can you provide any guidance on how best to format the dataframe?

Day Index,Experiment Sessions (Original),Experiment Conversions (Original),Experiment Sessions (Variant),Experiment Conversions (Variant)
"Jan 14, 2022","2,738",0,"2,646",3
"Jan 15, 2022","11,406",3,"11,555",4
"Jan 16, 2022","15,192",3,"15,289",4
"Jan 17, 2022","15,416",6,"15,534",12
"Jan 18, 2022","13,499",9,"13,661",4
"Jan 19, 2022","14,200",5,"14,098",9
"Jan 20, 2022","15,503",5,"15,637",12
"Jan 21, 2022","12,532",4,"12,401",9
"Jan 22, 2022","13,534",5,"13,679",5
"Jan 23, 2022","14,405",3,"14,770",5
"Jan 24, 2022","14,800",3,"14,916",7
"Jan 25, 2022","15,785",4,"16,179",12
"Jan 26, 2022","15,060",8,"14,907",10
"Jan 27, 2022","14,891",6,"15,322",8
"Jan 28, 2022","14,060",12,"13,970",9
"Jan 29, 2022","3,613",0,"3,489",2
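
For reference, here is a minimal sketch of how I could reshape this export into one record per day and variant, using only the Python standard library. It assumes the export is comma-separated with quoted fields (my reading of the file), and the names `to_long_records` and `totals_by_variant` are made up for illustration. It does not call expan at all; whether expan accepts aggregated counts like these, or needs one row per entity, is exactly what I'm unsure about.

```python
import csv
import io
from datetime import datetime

# First two days of the export, written as comma-separated text (assumed layout).
RAW = '''Day Index,Experiment Sessions (Original),Experiment Conversions (Original),Experiment Sessions (Variant),Experiment Conversions (Variant)
"Jan 14, 2022","2,738",0,"2,646",3
"Jan 15, 2022","11,406",3,"11,555",4
'''


def to_int(text):
    """Convert a count such as '2,738' to an int by dropping thousands separators."""
    return int(text.replace(",", ""))


def to_long_records(raw):
    """Turn the wide export into one record per (day, variant)."""
    records = []
    for row in csv.DictReader(io.StringIO(raw)):
        day = datetime.strptime(row["Day Index"], "%b %d, %Y").date()
        for variant in ("Original", "Variant"):
            records.append({
                "day": day,
                "variant": variant,
                "sessions": to_int(row[f"Experiment Sessions ({variant})"]),
                "conversions": to_int(row[f"Experiment Conversions ({variant})"]),
            })
    return records


def totals_by_variant(records):
    """Sum sessions and conversions over all days for each variant."""
    totals = {}
    for rec in records:
        bucket = totals.setdefault(rec["variant"], {"sessions": 0, "conversions": 0})
        bucket["sessions"] += rec["sessions"]
        bucket["conversions"] += rec["conversions"]
    return totals


records = to_long_records(RAW)
totals = totals_by_variant(records)
```

The resulting list of dicts can be passed straight to `pandas.DataFrame(records)` to get a long-format dataframe with `day`, `variant`, `sessions`, and `conversions` columns.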