10/22: Berk Can Deniz - Githubissues

ehuppert commented 3 years ago

Comment below with questions or thoughts about the reading for this week's workshop.

Please make your comments by Wednesday 11:59 PM, and upvote at least five of your peers' comments on Thursday prior to the workshop. You need to use 'thumbs-up' for your reactions to count towards 'top comments,' but you can use other emojis on top of the thumbs up.

chentian418 commented 3 years ago

Thanks for the well-organized and interesting paper! As a data science lover, I am glad to see the finding that adopting A/B testing decreases the likelihood of radical change and even makes websites more likely to change incrementally. Beyond industry, I also find that areas like social science apply A/B testing or similar approaches quite frequently. For example, when designing survey questions, researchers run similar A/B tests to compare outcomes. I was therefore wondering how to extrapolate the influence that A/B testing has on the tech industry to the social science domain, and how we could apply it and evaluate its effects in a more scientific way.

xzmerry commented 3 years ago

Thanks for your interesting paper! If you wanted to extend the analysis to newspapers in different countries, what challenges might you face? More precisely, would the procedures differ when analyzing newspapers in languages other than English? Thanks!

YijingZhang-98 commented 3 years ago

This paper focuses on an interesting issue and presents an impressive analysis and results. I am surprised to see that adopting A/B testing does not necessarily bring as much benefit to newspaper companies as one might expect. The paper mainly focuses on newspaper companies, perhaps because their data is more accessible. My question is: could this research design be improved so that the conclusion generalizes to other industries? I know many retail companies also apply A/B testing to improve sales, for example by changing advertisement content and website design. Is there anything that needs particular attention when researching the retail industry?

j2401 commented 3 years ago

I have a similar question to @anqi-hu's. How would you go about measuring changes in cases where magnitudes are measured through different components, such as wording? Thank you, and I look forward to your presentation tomorrow.

luyingjiang commented 3 years ago

Thank you for sharing! I have a more general question about A/B testing. After the experiment, if I find that the p-value is less than 0.05, suggesting that the new algorithm performs statistically significantly better than the old one, does that mean the new algorithm is ready to launch?
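
To make the question concrete, here is a minimal sketch, assuming a pooled two-proportion z-test on conversion counts. The function name `two_proportion_z_test` and the traffic numbers are made up for illustration and do not come from the paper. It shows that a p-value below 0.05 says the difference is unlikely to be noise, but says nothing about whether the difference is large enough to matter.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (absolute lift of B over A, two-sided p-value).

    Hypothetical helper: a pooled two-proportion z-test under the
    normal approximation, which assumes large samples.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, p_value

# Made-up example: one million users per arm, 10.0% vs 10.1% conversion.
lift, p_value = two_proportion_z_test(100_000, 1_000_000, 101_000, 1_000_000)
print(f"absolute lift: {lift:.4f}, p-value: {p_value:.4f}")
```

With samples this large, the p-value falls below 0.05 even though the lift is only about a tenth of a percentage point, so a launch decision would presumably also weigh effect size, cost, and other metrics.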

qishenfu1 commented 3 years ago

Thanks for the wonderful paper! While reading it, I was curious about how improving a website's functionality can encourage users to use it.

Anqi-Zhou commented 3 years ago

Thanks for sharing! Could you explain more about the mechanism behind it? Which of the potential mechanisms do you find most convincing?

jamesallenevans commented 3 years ago

Loved this paper. Incrementalism is not inherent to experimentation, though; it more likely stems from the optimization approach and the distribution of (design) starting points. For example, consider simulated annealing with a high temperature parameter, which allows it to (initially) search broadly, or the following optimization approach in a recent Nature paper: "Full exploration of such a space is unfeasible, so we developed an algorithm that performs Bayesian optimization based on Gaussian process regression and parallel search strategy35 (see Methods). To generate a new batch, we build a surrogate model predicting the HER of potential formulations based on the measurements performed so far and quantify the uncertainty of prediction. Subsequent sampling points are chosen using a capitalist acquisition strategy, where a portfolio of upper confidence bound functions is generated on an exponential distribution of greed to create markets of varying risk aversion, which are searched for global maxima. Each market is given an agent that searches to return a global maximum, or batch of k-best maxima. The uneven distribution of greed allows some suggested points to be highly exploitative, some to be highly explorative, and most to be balanced, thus making the strongest use of the parallel batch experiments."

Using these types of exploration strategies, and/or a wider space of starting points, could actually push the system to explore further. Any ideas on what ASPECT of A/B testing is driving conservatism, since the institution of the experiment itself would not seem to?
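
As a rough illustration of the explore/exploit mix described in the quoted passage, here is a toy sketch. It is not the Nature paper's implementation: the function `propose_batch`, the mapping of "greed" to an exploration weight `kappa`, and the hard-coded predicted means and uncertainties are all assumptions made for illustration, and there is no Gaussian process here. Each simulated agent draws its own `kappa` from an exponential distribution and picks the candidate with the highest upper confidence bound, `mean + kappa * std`.

```python
import random

def propose_batch(means, stds, n_agents=8, rate=1.0, seed=0):
    """Pick one candidate index per agent using upper confidence bounds.

    Hypothetical sketch: `means` and `stds` stand in for a surrogate
    model's predictions. Each agent's exploration weight `kappa` is
    drawn from an exponential distribution, so some agents exploit
    (small kappa) and some explore (large kappa).
    """
    rng = random.Random(seed)
    picks = []
    for _ in range(n_agents):
        kappa = rng.expovariate(rate)
        scores = [m + kappa * s for m, s in zip(means, stds)]
        picks.append(max(range(len(scores)), key=scores.__getitem__))
    return picks

# Made-up candidates: index 0 is a safe incremental tweak,
# index 2 is a radical redesign with a low mean but high uncertainty.
means = [0.9, 0.5, 0.2]
stds = [0.05, 0.3, 1.0]
batch = propose_batch(means, stds)
print(batch)
```

In this toy setup, agents with a small `kappa` choose the safe option and agents with a large `kappa` choose the uncertain one, so the spread of `kappa` values, rather than experimentation as such, determines how radical the tested variants are.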
