1/27: Ur - Githubissues

uchicago-computation-workshop / Winter2022

Repository for the Winter 2022 Computational Social Science Workshop

5 stars 0 forks source link

1/27: Ur #3

Open ehuppert opened 2 years ago

ehuppert commented 2 years ago

Comment below with a well-developed group question about the reading for this week's workshop. Please collaborate with your groups on Hypothesis (via the Canvas page) to develop your question.

One person can submit on the group's behalf and put the Group Name in the submission for credit. Your group only needs to post on assigned week (rotating every other week).

Please post your question by Wednesday 11:59 PM, and upvote at least three of your peers' comments on Thursday prior to the workshop. You need to use 'thumbs-up' for your reactions to count towards 'top comments,' but you can use other emojis on top of the thumbs up.

jinfei1125 commented 2 years ago

Group 2K: Baotong Zhang, Senling Shu, Jinfei Zhu, Koichi Onogi

Hi Blase (yes I have a look at your personal website and find your preferred name!), thanks for coming and presenting this interesting and amazing work! I think it’s very relevant to our daily life and everyone has probably felt those targeted advertisements are either useful or creepy (or both). A clarification question is that are those targeting types/user attributes can be found on the user data downloaded from Twitter or are they defined by researchers? Will this be a commercial secrete of Twitter? (Because Twitter is free to use, I guess its main income source is advertising?) I believe Twitter must also have a research team, do you think they are trying to improve their users' perception?

PS: There is the link to request your data from Twitter: Twitter Data Download I don’t use Twitter a lot but if you use it intensively, you may find it interesting to look at your data

For most of us, this Instagram data download link might be more relevant: Instagram Data Download I received Data like in 10 minutes after I submitted my request. I think this could be another research topic. I never posted on Instagram that I am a data scientist and study computational social science, but in the file advertisers_using_your_activity_or_information.json, I found some pretty familiar names like

"advertiser_name": "Tableau Software",
"has_data_file_custom_audience": true,
"has_remarketing_custom_audience": false,
"has_in_person_store_visit": false
},
{
"advertiser_name": "DataCamp",
"has_data_file_custom_audience": true,
"has_remarketing_custom_audience": false,
"has_in_person_store_visit": false
},
{
"advertiser_name": "Coursera",
"has_data_file_custom_audience": true,
"has_remarketing_custom_audience": false,
"has_in_person_store_visit": false
},

Isn't this interesting? I would love that Instagram pushes me some discount information on these websites, but I would definitely also find it a little creepy :)

Thiyaghessan commented 2 years ago

Group 2A: Thiyaghessan, Eliot Weinstein, Linhui Wu, Sushan Zhao

Hi Blase,

Thank you for coming to speak with us today, we enjoyed reading your paper. Much of the discourse about privacy today centres around providing users with more information. And indeed, your paper shows how increasingly detailed descriptions of advertising techniques can empower users to make better choices for themselves. However, this approach doesn’t scale well across the ever-increasing number of apps/platforms all of us will continue to have.

Sure, you could say that these apps are optional, and we don’t have to use them but that’s because we are all old enough to remember a time when these apps didn’t exist. For future generations, these platforms will come to constitute a necessary part of their social lives and they will have to participate in them or be locked out of certain aspects of human interaction. Is it reasonable to have everyone continuously read and update themselves on all the developments in the advertising space to maximise privacy protections? Many of the tech companies would frame it as a matter of personal choice but there are upper bounds to how much information we can process when deciding.

At the same time, I think understanding informative descriptions, like the interventions in the study, presupposes a certain level of digital literacy which certain groups may not have (e.g., children who will definitely use social media regardless of how hard parents try and regulate their use) and these are the groups which are most vulnerable to the ill-effects of excessive targeted advertising (e.g., kids using their parents' credit cards to buy stuff online). It seems that the solution in that context would be stricter regulation, but do you have any other ideas on how we can bridge this gap in knowledge/protections?

xzmerry commented 2 years ago

Hi Blase,

Our group (Group 2I: Daniela Vadillo, Lingfeng Shan, William Zhu, Zimei Xia) has the following questions:

And for the research methods/design, we have questions as follows:

Generalizability: What is the advantage of a survey-based experiment on Twitter over a hypothetical scenario? Admittedly, this experiment provides us with data for quantitative research, however, it also suffers from limited generalizability compared with a hypothetical scenario. How to generalize this research to patterns other than platforms like Twitter (but also other platforms/apps. Because different types of apps may have different ad patterns, so whether the research result is generalized enough here)?

Besides, we have questions about the possible future development of related research topics and the implication of this research:

It seems that personalized ads face a privacy–efficiency trade-off: On the one hand, when ads are too generalized (not personalized enough), users feel less concerned about privacy intrusiveness, but the efficiency of the ads is low (it is not efficient at recommending users products or services that they may find valuable). On the other hand, when ads are too personalized, users are more likely to see and click through useful ads, but the risk of feeling concerned about privacy also increases. What metrics can we use to find the optimal balance, where the personalized ads are efficient but not too privacy-intrusive? (e.g. ad click-through rate/complaint rate.)

(Some possible answers we are thinking/discussing: companies could, theoretically, “personalize” the percentage of personalized ads one user sees to mitigate users’ privacy concerns. As for advertising transparency, maybe each user should be given the option to exclude their data from companies’ analytics as well as being marketed as a commodity on the data market.)

Is there a more specific/systematic categorization of targeting types? Such as personal attributes vs personal social networks.
Considering different perceptions regarding different targeting types, maybe the importance of advertising transparency varies in terms of various targeting patterns. So in research, whether it is worthwhile to prioritize some types of advertising transparency over others?
Whether discussing different types of targeting types will be an important research topic in the future?
What are some incentives on the side of social media companies or companies with an online presence to be more open about their ad targeting practices? With the information given, it seems like they are just following regulations and trying to be deceitful to avoid a negative public reaction. Also, we don't think it was covered in the intro of the paper, but do we know why so many other targeting techniques have been excluded from academic research? Is it because there was not enough information available to conduct a study or was the research question not too appealing?

chuqingzhao commented 2 years ago

Group 2D: Yijing Zhang, Chuqing Zhao, Mike Packard, Alex Williamson

Thank you for sharing your work with us! Our group has the following questions:

Is it possible that a follow-up experimental design could bring additional accuracy or confidence to the measurement of how people like different ad explanations? This could potentially illuminate the causal relationships more specifically between explanations and users’ experience.
Should explanations for advertisements depend to some extent on the kind of advertisement in question? For example, it might make sense to have different kinds of explanations (in content and quantity) on ads for products or simple purchases than for political advertisements or propaganda.
How did you define fairness? If fairness is loosely defined, when you ask the survey participants whether they think the given targeting type is unfair, then the definition of fairness might differ person by person and the results might be hardly interpretable.

TwoCentimetre commented 2 years ago

Group 2M: Chenming Zhang, Chris Maurice, Xin Su, Yujing Sun

We have questions related to the data provided by twitter. First, do these data files only include user behaviors on twitter? We notice that apps on our phone also scratch our data from other channels and other apps. For example, apps could grab the search history from browsers, they can also monitor certain kind of behaviors when we use other apps. So, we are curious if the data provided by Twitter only include users' behaviors on twitter.

Second, do these data files only include text data? We find some people share their experience that sometimes apps seem to monitor users' chats and conversations. Some people say that sometimes apps will push some ads about things they just talk about, which is creepy. So, do such information also provided?

Third, we all have different privacy settings on our phones which could block certain behaviors of apps. So, we wonder if such information is provided by those participants and if such information is taken into consideration by this research.

Sirius2713 commented 2 years ago

Group 2F: Wenqian Zhang, Gabriel Nicholson, Xin Tang, Sophie Wang

Our group has two questions about this week's paper:

In the paper, the authors compared their own explanations with the ones of Twitter, and they found participants Twitter's explanations are not useful and comprehensive enough. So why don't Twitter provide better explanations? Does it try to make advantage of the ambiguous language? Is there a gray area or loophole here that Twitter can use more sensitive information for targeted ads without explaining to users?
The paper mentions there are different targeting types or classes of attributes ranging from age, gender etc to audience lookalikes, advertiser-uploaded lists of specific users etc. Which ones are more important in different scenarios? How do people decide it?

helyap commented 2 years ago

Hi Blase,

Thank you for sharing your research with us! It was an interesting read and shed more light on ad targeting techniques that are typically opaque. Furthermore, while concerns of generalizability and self-selection were raised from your data, the ethics of your data collection leveraging rights of access was appreciated especially as it made use of personalized ad data directly provided by compensated participants.

Our group has the following questions:

The applicability of the findings are of concern as the sample size was small with only data collected from Twitter. Since the conclusions are not necessarily generalizable, could the study be repeated on other social media platforms to expand the generalizability of the findings?
Also, more than 80 percent of those surveyed said they had a college education, and about 20 percent said they had an IT background. Will these proportions interfere with putting the experimental model to work in the real world, since the level of education among Twitter's large user base may not be as high as in the sample.
Did you find any effects on trust and participant reactions to ad techniques from company brands? Since the ads were personalized, there is the assumption that it is difficult to control for participants prior knowledge and perception of the brands they see. Or do consumers already disentangle their perception of ad techniques/advertisers from the brands that are marketed in the ads?
With perhaps greater possibility for experimental control, is it allowable to create mock user accounts on the platforms to test and describe ad targeting techniques?

Many thanks and we are looking forward to your talk tomorrow!

Best, Group 2J (Kuitai Wang, Zhe Zhang, Emily Yeh, Helen Yap)

jsoll1 commented 2 years ago

Group 2B: Justin Soll, Hongkai Mao, Coco Yu, Wanxi Zhou

We wonder if there's any conflicts of interests between advertisers and users regarding the transparency mechanisms such as accuracy. If so, how can we find a balance between the interests of both groups? Also, although it's argued in the paper that accuracy can be leveraged to justify invasion of privacy, can't it be problematic as well if targeting instances are inaccurate?

Additionally, we are quite interested in the result that participants found explanations of Ad from researchers containing more detail to be more useful to Twitter's. As the provider of Ad space and creator of algorithms for targeting, how come Twitter's explanations are less explicit than that of researchers' who can only do reverse engineering? What does this result imply?

hshi420 commented 2 years ago

Group 2C: Lu Zhang, Fengyi Zheng, Haohan Shi, Taichi Tsujikawa

A part of the research focused on the users' reactions to different kinds of ads explanations. Although the social media companies' explanations seem to be "imperfect" for the users, they might be "perfect" for the companies. Are the users' reactions expected by the companies? Also, how were the companies' ads explanations developed? I think it would be good to interview some insiders about the mechanisms behind ads explanation development.

nijingwen commented 2 years ago

Group 2L: SiRui Zhou, Jingwen Ni, David Xu, Allison Towey, Alex Przybycin

Question1: we had questions about the FOLLOWER LOOKALIKES targeting which targets users who share similarities in interests with followers of a certain account. Could this be leveraged to create a digital twin of an average follower and target users who fit the attributes of a typical follower? Potentially allowing for designing of potential followers?

Question2: we found it interesting that tailored audience (lists) which match Twitter users to lists uploaded by advertisers and tailored webb were seen by so many of the participants as not being to their specific want or need. we would like to know more about this tailored advertising and why it seems to be ineffective in appropriately reaching their intended audience.

Question3: we are also interested in learning more about the survey results within heterogeneous groups, of different gender, race, educational background, income, etc. to see if these targeted ads have the potential risks of increasing inequality and discrimination.

egemenpamukcu commented 2 years ago

Thank you for sharing your work Professor Ur,

Our group would like to ask some questions about your paper.

Firstly, from the business and incentives standpoint, do you think there is much difference between showing users engaging, non-ad, content (Tweets, posts etc.) that is personalized based on their available personal information versus showing commercial ads using the same kind of information? After all, from the platform's point of view, the purpose of that content is also to prolong the user's engagement with the app, therefore, to show as many ads to the user as possible. If personalized content and personalized ads both serve the same purpose, albeit the former less directly, would only mitigating personalized ads free these platforms from all ethical blame? Couldn't we make the same argument for personalized content, which is the very core of their products?

Secondly, do you think existing data access rights and transparency policies of these platforms should be taken to the next level? Your study reminded me of this ad by Signal (who are positioning themselves as a pro-privacy app in the messaging space) on Facebook, which was consequently removed by the platform: download For instance, do you think making platforms provide such information with each ad they display could make these platforms safer in the eyes of their user base?

Finally, do you think the recruitment process for your study might have created a sort of selection bias? We understand that users had to share their personal ad targeting information with you for recruitment (understandably). However, we were wondering if you think this led to your study oversampling people who are more open to, and less concerned with sharing their private information. Perhaps, if a completely random selection were possible, some of your results could have indicated more concern towards perceived intrusion of privacy.

Kindly,

Group 2H: Egemen Pamukcu, Ning Tang, Taize Yu, Gin Zheng

kthomas14 commented 2 years ago

Dear Professor Ur, thank you for coming. In your essay, you've mentioned that "Finally, our results suggest it is insufficient to simply require data processing companies to make information available."

Would simply providing more information be sufficient in and of itself? I wonder if users also need to have the choice to block (certain) ads and advertisers from accessing their data? We wonder if you are able to comment on implementations such as Apple's privacy policies, where it asks users if they want to share information with certain applications.

Additionally, while we definitely agree that it is beneficial to provide increased transparency to the data collection, what will we ultimately gain from this? We know what companies are collecting, but we do not know how to stop them from collecting it without eliminating online presence (and even then it seems clear from the study that companies have ways of obtaining offline data). It does seem "creepy" that companies are able to obtain data such as cloud registration or starting a credit card application. We wonder what you might propose as far as the fairness of collecting data such as this in the first place, as well as what you think should be done to possibly regulate the blackbox information that we are not even remotely aware that we are sharing.

Thank you!

Group 2G: Kaylah Thomas, Shengwenxin Ni, Awaid Yasin, Yao Yao

wu-yt commented 2 years ago

Thank you so much for completing such an interesting research work! In your research, you mentioned that the data is combined using both online and offline data. And some data is obtained using PII to link to their profiles and provided by third-parties data brokers. In the section 2 of Twitter ad explanation, what would participants respond if you include the specific details of the sources of data to the ad explanation? For example, “You are seeing this ad because… your information is matched using your phone number with your criminal records”. This would make a “creepier ad explanation”. Group 2E: Franco Mendes, Nikki Ting, Brenda Wu, Juno Wu

FranciscoRomaldoMendes commented 2 years ago

Nikki Ting, Juno wu, Franco mendes. if researchers were trying to find when and how accuracy crossed the line from useful to creepy, how will personal preferences being taken into account. Since people from different cultural groups or with other different features might have different levels.

The participants of the study appear to be generally more educated and have more background in computer science or IT than the general population. How do you think the overall data (e.g., the number and types of ads and targeting) as well as the results (i.e., on perceptions) would differ for the wider population? Were ads involving sensitive targeting included in the study? If they were, were users’ reactions and perceptions more negative than for other targeting? Also, were participants informed that these possibly violated Twitter’s policy prohibiting targeting on sensitive attributes?