04/08 Eytan Bakshy - Githubissues

ehuppert commented 3 years ago

Comment below with questions or thoughts about the reading for this week's workshop.

Please make your comments by Wednesday 11:59 PM, and upvote at least five of your peers' comments on Thursday prior to the workshop. You need to use 'thumbs-up' for your reactions to count towards 'top comments,' but you can use other emojis on top of the thumbs up.

nwrim commented 3 years ago

Thank you so much for visiting our workshop, Dr. Bakshy! I have been playing with the Facebook Ax/BoTorch platform for a while and it is amazing (and it has a lot of great developers who respond so effectively to my trivial questions on the issue page!)

In your case study with fetching parameters for the Facebook android app, you mentioned that the full factorial design will be infeasibly large (8^5), so the domain expertise was used to limit the search space initially. Your results suggest that BayesOpt could come to the rescue by treating it as 5-d continuous space so that maybe we can even start with the 8^5 space without pruning it with domain expertise in the first place. Building on this, what if there was no continuous space to convert to (the categories are not ordered)? Would BayesOpt (or just optimization) still be possible to work on these spaces effectively?
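To put the size of that search space in perspective, here is a minimal sketch. It assumes the 8^5 figure means five parameters with eight candidate levels each, and the traffic number is purely hypothetical; none of this comes from the paper.

```python
from itertools import product

# Assumption: the "8^5" design means 5 parameters with 8 levels each.
N_PARAMS = 5
N_LEVELS = 8


def full_factorial_size(n_params: int, n_levels: int) -> int:
    """Count the configurations in a full factorial design."""
    return n_levels ** n_params


def full_factorial(n_params: int, n_levels: int):
    """Lazily enumerate every configuration as a tuple of level indices."""
    return product(range(n_levels), repeat=n_params)


def users_per_arm(total_users: int, n_arms: int) -> int:
    """Users available per configuration if traffic is split evenly."""
    return total_users // n_arms


n_arms = full_factorial_size(N_PARAMS, N_LEVELS)
print(n_arms)  # 32768

# Hypothetical traffic figure, chosen only for illustration.
print(users_per_arm(1_000_000, n_arms))  # 30
```

Even with a million users, an even split leaves roughly 30 users per configuration, which is why testing every cell of the grid looks impractical.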

Another question I have is whether BayesOpt + A/B Testing will be plausible when the budget is severely restricted (e.g., Academia). By having humans in the loop, every measurement gets insanely noisy. I feel like A/B testing with practical purposes in mind (e.g., better Facebook user experience) can handle a lot of ns for each parameter measurement, but maybe a relatively underfunded academic project might not be able to get such large ns. Do you think this could be a problem, or are there advances in the BayesOpt field that deal with this problem?
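As a rough illustration of the budget concern, the sketch below uses the textbook standard error for a difference in means between two equal-sized arms. It assumes independent observations with a common standard deviation `sigma`, and it says nothing about how BayesOpt itself handles noise; the function names are made up for this example.

```python
import math


def se_of_difference(sigma: float, n_per_arm: int) -> float:
    """Standard error of the difference in means for two equal-sized arms.

    Assumes independent observations with common standard deviation sigma.
    """
    return sigma * math.sqrt(2.0 / n_per_arm)


def n_per_arm_for_se(sigma: float, target_se: float) -> int:
    """Smallest per-arm sample size whose standard error is at most target_se."""
    return math.ceil(2.0 * sigma ** 2 / target_se ** 2)


# The standard error shrinks with the square root of n, so a 100x larger
# sample only buys a 10x more precise comparison.
for n in (50, 500, 5000):
    print(n, round(se_of_difference(1.0, n), 3))
```

Under these assumptions, halving the noise in a single comparison costs four times the participants, which is the squeeze a small academic budget would feel.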

Thanks again for coming to our workshop! Really looking forward to the talk.

JadeBenson commented 3 years ago

My question is perhaps a bit outside the scope of this research, as you primarily focus on how A/B testing can be improved and best applied to various case studies. I've heard some complaints about how A/B testing has become overused for even minuscule changes, which adds a lot of development time without huge impacts and is relatively tedious for teams to execute. Do you have any similar concerns with the frequency of A/B testing? What do the conversations look like behind-the-scenes about whether and what other methodologies should be applied?

SoyBison commented 3 years ago

I would mainly like to reinforce @JadeBenson's question. I have heard similar things, including a study from our workshop two quarters ago. The situation is often described as using Bayesian optimization to figure out problems that there is no theory on, but perhaps there is no theory on these situations because it doesn't matter, and the effect shown by Bayesian optimization is just enhancing random noise.

Thanks for coming to our workshop!

vinsonyz commented 3 years ago

Thank you for your presentation, Dr. Bakshy. I really enjoyed reading your research because I am a marketing researcher. Do you have any suggestions on experimental design for consumers?

LFShan commented 3 years ago

Thank you for the presentation. As a student with a concentration in Economics, do you have any suggestions on how A/B tests could be applied in the field of economics? Thank you.

FranciscoRMendes commented 3 years ago

Thank you so much for coming, I simply love your work and I think this is where experimental econ is going. I am in List's graduate class in Experimental Econ and was just reading Athey's paper on ANNs to improve credibility of MC simulations. So to see something like this in action at Facebook albeit behind the scenes is pretty motivating to me.

Using BayesOpt based on experimental results is a great idea; I think that it allows you to simulate what it would have been like to conduct several more A/B tests than you actually did. But I am not sure whether you selected a subset of the parameters and then went back and did an A/B test just as a sanity check. For example, when you optimized over the 5d space using samples from the 8^5 factorial space, did you discretize, let's say, the top 10 parameter configs and then go back and do a natural experiment, just as a sanity check?
In the case of evaluating a treatment over a longer period, did you consider that some combinations of parameters need time for users to "learn" them, meaning that a low value in A/B testing may not be reflective of their true value?
Finally, going from a Bandit to BayesOpt seems "natural" to me, more than going from contextual policy to BayesOpt. There is just so much that can happen in the C x d space that I am not sure throwing it into a blackbox and optimizing over it made sense to me, even if you lowered the dimensionality (I probably didn't understand this section; I look forward to a more detailed technical discussion).

I do have many more questions but let me stop here. Thanks again!

I mean, I think this one line sort of says it all to me: "humans behavior on facebook is influenced by a host of complex issues including past behavior". So even in section 3.4, where you've created these offline estimates (very neat solution btw), it would be great if you went back and did an experimental observation to check at least some of those (since some of those violated constraints too).

alevi98 commented 3 years ago

Thanks for coming to our workshop. It'll be really interesting to hear how A/B testing is impacting the field. Can't wait for the talk!

FranciscoRMendes commented 3 years ago

> Thank you for the presentation. As a student with a concentration in Economics, do you have any suggestions on how A/B test could be applied in the field of economics? Thank you.

Esther Duflo and Abhijit Banerjee use them a LOT for things as simple as distributing mosquito nets in developing countries. https://www.poverty-action.org/study/free-distribution-or-cost-sharing-evidence-malaria-prevention-experiment-kenya Dev Econ is moving in this direction. Uber does a lot of this in their pricing strategies; they'll sort of test out a price over a range of $10-$15 and then use customer usage data to check if that works. Susan Athey gave a talk on this; you can find it here (https://www.forbes.com/sites/quora/2016/04/05/how-do-academic-economists-use-ab-testing/?sh=304140361dec).

I look forward to Eytan's answer, just thought I'd nuke you with some additional info!!

Lynx-jr commented 3 years ago

I don't really have a grand question, but as Coen @SoyBison said, this week's paper reminds me of the fall quarter's Stanford A/B test paper. Bayes Opt seems useful in applications of A/B tests in general; could you give a few other examples outside the paper where we can use Bayes Opt and NEI? Plus, can anyone suggest a course that teaches A/B testing? I don't have prior experience and I'm really curious to learn more, thanks! Oh, and thanks for coming to our workshop!

Yutong0828 commented 3 years ago

Since I am not really familiar with Bayesian Optimization, I was wondering what its specific advantages are compared with other approaches you are used to adopting when tuning models. Besides, I am also interested in the A/B tests used in industry. Do you think there are any different concerns for doing such experiments for practical use in a company, compared with doing that in an academic study at school? Thank you very much!

ginxzheng commented 3 years ago

Thank you so much for coming! I am very curious about the recommendation systems. In the paper, you cited Eckles et al.: an individual's behavior will be influenced by complex patterns of how they interact with others in previous sessions. Would you elaborate more on how the value model is balanced in terms of business logic, and give any examples of the "key metrics" you mentioned?

bowen-w-zheng commented 3 years ago

Thank you for presenting your work! I am curious about whether MAB could be applied to the field of social science, especially those areas that place a heavy emphasis on inference. In a broader context, I am curious about your thoughts on whether scientists should use statistical tools whose properties are not well understood but seem to work well in practice.

mikepackard415 commented 3 years ago

Thanks very much for taking the time to share your work with us. I found your description of the slow adoption of Bayesian Optimization at Facebook really interesting, and I would love to hear more about how you convince your colleagues that this is a methodological improvement. You write "teams had difficulty expressing their goals in terms of specific tradeoffs ... and so we settled on framing the task as a constrained optimization problem." This ties in to some discussions we have had in our classes here about interpretable vs. "black box" methods. Do you see Facebook moving toward more "black box" methods, and thereby "optimizing" with less and less input from humans? Thanks again!

chiayunc commented 3 years ago

Thank you for coming to the workshop. My question is related to the implications of A/B tests. If Facebook is able to find better ad performance using A/B testing, do you think the end products, i.e. the ads, would ultimately converge to, or lead us to, something that reinforces and perpetuates prejudice or bias? Do you think that there could be any downsides to this end?

MengChenC commented 3 years ago

Hi Dr. Bakshy, thank you for sharing your work. It's really fascinating to see the application of Bayesian optimization in combination with A/B tests. You mentioned some extensions and variations regarding how this technique can be used in other fields; would you be able to elaborate more on the potential applications and developments? Thank you.

ChivLiu commented 3 years ago

Thank you for sharing this presentation! It is very interesting to find out about this application of Bayesian Optimization. I wonder whether the algorithm might lead to some biased results based on the testing samples. On social media, we sometimes receive trending messages that we are not interested in only because we have connected to some sites or links via the browser. Do you think that the algorithm is generally sending ads and making profits?

Qiuyu-Li commented 3 years ago

Thank you so much for coming! My question is related to the use of the A/B test. I'm wondering how it is typically used in research, and furthermore, is there any situation where it can be applied in the real industry. Thank you!

jinfei1125 commented 3 years ago

Dear Dr. Bakshy,

Thank you so much for visiting our workshop! Your paper is really precise and reader-friendly ;D I really enjoy reading it! It's interesting to know that under the smooth operation of the Facebook app's News Feed feature there are so many tests and optimizations going.

I am a little confused by the concept of 'policy dimension'. Does this mean different policies? Is this the same thing as the parameters you tuned in the paper?

Second, I am interested in the News Feed Recommendation algorithm, since we all know it's this recommendation algorithm that makes us so addicted to social media--do you use different algorithms for video, image, and texts? I also found the multi-armed bandit method interesting. Do you use Bayesian optimization for all recommendation scenarios or a hybrid of BayesOpt with others?

I am also curious, that, forgive me if this question sounds so unprofessional and ignorant, after fine-tuning these parameters, how can we measure the increased profit for Facebook? Or just click rate is enough?

Btw, I think there are some text overlaps on pages 6 and 12. It's not a big deal and it doesn't impede our reading--I just want to mention it here in case you don't notice it! Overall, excellent and interesting introduction! Thank you again!

Anqi-Zhou commented 3 years ago

Thank you so much for the inspiring sharing! I've learned a lot of A/B test examples in consumer behavior and pricing strategies courses. I wonder if A/B tests can be applied in the field of economics? What are the potential considerations then?

kthomas14 commented 3 years ago

Thank you so much for sharing your research with us, Dr. Bakshy. The application of Bayes Optimization to A/B testing seems like a slowly growing practice within User Experience Research and the other areas in which A/B testing is regularly used. Are there any other research methods similar to A/B testing that may lend themselves to the application of Bayes Optimization, and have you considered testing these applications?

qishenfu1 commented 3 years ago

Hi Eytan, thank you very much for sharing your wonderful work! I am wondering will the usage of the optimization methods you mentioned help reduce the number of treatments in an experiment, for example, by choosing a limited number of most effective treatments, and thus reduce costs? Thank you!

lulululugagaga commented 3 years ago

Thanks for sharing! AB testing is widely applied in industry, but to me the obstacle to using AB testing in research is the scale and the format of the data. Although some tech companies open part of their databases to researchers, do you think we can only rely on the data from tech companies, or is there any other way to do the testing as a researcher?

mingtao-gao commented 3 years ago

Thank you for sharing your work and your presentation! It is a very interesting paper that reveals how we can apply machine learning models in user experience design. As you have presented cases using A/B testing for system tuning, I wonder how A/B testing with Bayesian Optimization can be applied in research?

YuxinNg commented 3 years ago

Thank you for sharing your work! I really enjoyed reading your research and hopefully I can hear more real application cases of A/B testing during the lecture time. Thank you!

TwoCentimetre commented 3 years ago

Thanks for sharing. I want to know if there are any similar methods that can be used in non-experimental research. If I already have a survey dataset, can I apply this technique to this data?

hesongrun commented 3 years ago

Thank you for the wonderful presentation! This is really very important work. As the number of treatment options grows exponentially, it is critical that we find some systematic way to experiment with all the options. How do you think about interpreting the models? Are there any ways to safeguard against type-I error and improve the efficiency of treatment tests? Thanks!

caibengbu commented 3 years ago

Thank you so much for the inspiring sharing! I've learned a lot of A/B test examples in consumer behavior and pricing strategies courses. I wonder if A/B tests can be applied in the field of economics? What are the potential considerations then?

a-bosko commented 3 years ago

Dr. Bakshy,

Thank you for coming to our workshop! It was very interesting to learn more about the optimization of A/B testing, as well as how Bayesian optimization can be applied. Since online field experiments have become more and more popular, it is important to know how we can accurately and efficiently conduct these experiments.

Other than Facebook, where else can we see BayesOpt applied with A/B tests? Do you see this platform being extended to many other applications?

yutianlai commented 3 years ago

Thanks for sharing! Could you please elaborate more on the application of A/B testing?

Yilun0221 commented 3 years ago

Thank you for the presentation! I am fascinated by the topic, and I just wonder whether there is a possibility to combine computer simulation into your project so that bigger data size can be expected?

anqi-hu commented 3 years ago

Thank you for sharing your work with us. Other than the difficulty with scheduling and stopping batches, are there other shortcomings of AE?

linghui-wu commented 3 years ago

Thank you for sharing this exciting work. I am also interested in the application of the A/B test in scholarly research.

Bin-ary-Li commented 3 years ago

It is interesting how Bayesian experimental design can help in the internet industry where data is abundant. But can we use that in a research setting, where experiments are likely more costly and fewer data available?

XinSu6 commented 3 years ago

Thanks so much for sharing your work. I am wondering if you can go a little deeper into A/B testing? Thank you so much and see you tomorrow!

ydeng117 commented 3 years ago

Thank you for sharing your work. I wonder how current big data technology can help the data collection process for A/B testing?

jsoll1 commented 3 years ago

Thanks for sharing your work! I'm interested in what kinds of specific things that seem like they'd be good for A/B testing that we should actually watch out for before implementing?

WMhYang commented 3 years ago

Thank you very much for sharing your work. I was wondering if the challenges in A/B tests could be avoided to some extent by design of the experiment, instead of by algorithms? I am looking forward to your presentation tomorrow.

hhx2207061197 commented 3 years ago

Thank you for your presentation, Dr. Bakshy. Do you have any suggestions for consumer experiment design?

wanitchayap commented 3 years ago

Thank you in advance for your presentation! I would already be satisfied if you answer Nak Won's and Coen's questions :)

xxicheng commented 3 years ago

Thank you for sharing your work with us. I am looking forward to tomorrow's talk.

NaiyuJ commented 3 years ago

Thanks for sharing your work! Since you're working at Facebook, I am wondering how this finding can help improve the operation in practice.

chrismaurice0 commented 3 years ago

Thank you for coming and speaking with us! A/B testing has been around since the early days of the internet. How much has this process for testing new software changed and how do you see the field of A/B testing advancing? Which companies are leading the way?

afchao commented 3 years ago

Thank you for presenting to our group! Last fall we heard a presentation from Dr. Berk Can Deniz about the general impact of A/B testing in promoting incrementalism such that the "pursuit of novelty" ultimately suffers. Do you share this concern? If so, would you consider the application of Bayesian optimization (or indeed any expansion on the basic A/B format) as holding some promise for expanding the scope of A/B testing?

weijiexu-charlie commented 3 years ago

Thanks for your presentation. What are your suggestions on applying A/B testing in other fields?

RuoyunTan commented 3 years ago

Thank you for sharing your work. As some of my peers have pointed out potential concerns for the use of A/B testing, could you elaborate on why this approach has evolved to be one of the most popular methods that companies have been implementing?

MegicLF commented 3 years ago

Thank you for sharing your research! I am interested in the impact of big data on A/B testing and what limitations may arise when using A/B testing with big data.

NikkiTing commented 3 years ago

Thank you for sharing your work! With criticism of A/B testing, I wanted to ask if you think this method will be replaced in the near future. What do you think will be the way forward for methods to improve the user experience for digital platforms?

ghost commented 3 years ago

Could you please explain Bayesian optimization in more detail?

shenyc16 commented 3 years ago

Thank you so much for sharing this interesting research with us. The result is inspiring and the whole paper gives a clear description on how A/B tests work on Facebook News Feed. I am still confused about the relationship between Bayesopt and those specific techniques you mention in the paper. Could you explain more on it?

yierrr commented 3 years ago

Thanks so much for sharing! Besides the great questions listed above, I’m curious about how the framework may be applied to within-subjects designs. Thank you!

uchicago-computation-workshop / Spring2021

04/08 Eytan Bakshy #2