Open JunsolKim opened 2 years ago
The authors of the OpenAI blogpost mention possible implications for society, both positive (e.g., AI writing assistants, better speech recognition systems) and negative (e.g., generating fake news and impersonating others online). Is there a debate among scholars working on text generation about the responsibilities one carries when developing new methods that could contribute to the demise of liberal democracy? Are there better or worse ways to build such AIs, and is there some standard by which scholars hold one another accountable?
I'm a bit skeptical of the dangers AI text generation poses regarding its potential to generate fake news and brainwash the masses. There is already a ton of fake news out there that people can readily acquire; the only difference with this algorithm is that production could be scaled up. But in my mind, I don't see a difference between reading one fake news story and reading ten, especially when the logic underlying the stories involves an impossible conspiracy theory that I can't believe no matter how many times people show it to me (e.g., vaccines are dangerous, the government is out to hurt us, etc.).
Hugo Mercier has written a book on this topic, "Not Born Yesterday," that outlines research supporting the general conclusion that people aren't very susceptible to mass-media fake news stories. Given this, it seems logical to me to release the algorithm, since it can do more good than bad. Just my thoughts.
Based on my previous understanding of conversation analysis, the Cornell ConvoKit package doesn't seem to provide the structure for the kind of detailed transcripts required in standard CA research. For example, standard CA transcripts typically include silences (in seconds), sound stretches, overlaps, and filler words, among other details that are important for analyzing social interaction.
Granted, conversation analysts usually do not consider online comments a type of conversation. Still, I'm wondering whether there are computational implementations of conversation analysis that, at least in terms of data structure, are closer to this qualitative, detail-oriented tradition?
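For what it's worth, one could probably shoehorn some CA-style details into ConvoKit's free-form metadata fields, even though the toolkit doesn't model them natively. A rough sketch of what I mean, based on my reading of the ConvoKit docs; the pause/overlap/stretch metadata keys are conventions I made up, not anything ConvoKit defines:

```python
# Rough sketch: storing CA-style annotations (pauses, sound stretches, overlaps)
# as utterance metadata in ConvoKit. The meta keys below are my own invention,
# not built-in ConvoKit fields.
from convokit import Corpus, Speaker, Utterance

alice, bob = Speaker(id="alice"), Speaker(id="bob")

u1 = Utterance(id="u1", speaker=alice, conversation_id="u1",
               reply_to=None, timestamp=0, text="so you're going?")
u1.meta["sound_stretches"] = [("so", 0.4)]   # (word, stretch length in seconds)

u2 = Utterance(id="u2", speaker=bob, conversation_id="u1",
               reply_to="u1", timestamp=1, text="yeah I guess")
u2.meta["pause_before"] = 1.2                # silence before the turn, in seconds
u2.meta["overlaps_with"] = "u1"              # marker for overlapping talk

corpus = Corpus(utterances=[u1, u2])
print(corpus.get_utterance("u2").meta)
```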
Adding on to Konratp's and Halifaxi's comments, I agree that text generation models like GPT are doing more social good than harm. Even though text generation algorithms can speed up the production of misinformation and probably scale up its propagation, the problem of misinformation already existed with or without them. On the other hand, such algorithms can also encourage research on misinformation detection, which will help us distinguish and (ideally) eliminate misinformation online.
Just playing devil's advocate in the discussion: beyond a utilitarian "more good than bad" framework, I imagine we should also interrogate whether the bad is qualitatively an acceptable risk, despite the good. I find myself always looking back on Foucault's theories of discourse and, with that, holding a more cautious/suspicious attitude toward technologies that could exert that type of discursive power at scale. While fake news and misinformation are more readily debatable/obvious to sift out, I immediately thought about the more latent, insidious, and discreet biases that could be propagated (either accidentally or purposefully, both bad): "Discourse transmits and produces power; it reinforces it, but also undermines and exposes it, renders it fragile and makes it possible to thwart it." If we believe that power relations are produced, then I'd be more hesitant! Just my two cents.
I think the discussions above demonstrate the urgency of accelerating human-centered AI: how to leverage human knowledge and supervision in these technologies and keep humans in the loop, as opposed to letting the systems learn from and replace them.
I think the danger of persuasive AI is a function of AI capability plus human trust in it. Right now, most people have little trust in text generated by AI. But if AI becomes increasingly capable of producing useful information and smart ideas, people could trust it a lot more. Then persuasion could be dangerous, either through malicious actors manipulating people's beliefs or through locking in suboptimal values and degrading collective rationality.
As I understand it, a GPT-2 model need not be trained on a specific context, but it performs poorly on highly esoteric contexts, and the number of tries it takes to generate a meaningful result depends on its training context. While this seems intuitive, I wonder how one should think about this trade-off for something as computationally intensive as GPT-2?
I remember the talk from Google I/O last year where researchers emphasised how intelligent 'LaMDA', their conversational AI, was at generating dialogue. They also talked about real-life conversation, how it is open-ended and a difficult problem to solve, and how this model outperformed other models like GPT-2 at conversation (their older conversation model, Meena (2020), was already outperforming GPT-2). I don't understand what the fuss about 'conversation' is. While it has some sense of open-endedness to it, is the task that different from context-based text generation? Why do attention-based models with huge numbers of parameters (GPT-3) perform incredibly well at text generation (writing poems, articles, songs) but falter at conversation?
Regarding GPT-2 and GPT-3, I actually don't see many social science papers using these advanced methods for text analysis. Most of the articles I've read use BERT or word2vec. Are there specific reasons? Is it because GPT-2 and GPT-3 are too cutting-edge, or are there inefficiencies when applying them to social science questions?
While the evaluation table in the zero-shot section provides encouraging results on GPT-2's performance, I'm wondering what the metrics (i.e., "perplexity" and "bits per character") actually measure, and what the rationale is for choosing them.
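As I understand it, both are transformations of the model's average negative log-likelihood on held-out text: perplexity is the exponentiated per-token cross-entropy, while bits per character expresses the same cross-entropy in bits and normalizes by characters rather than tokens, which makes models with different tokenizations comparable. A quick sketch of the arithmetic (the token probabilities below are made up for illustration):

```python
# Sketch of how perplexity and bits-per-character fall out of a language
# model's token probabilities. Numbers are invented, not real GPT-2 outputs.
import math

token_probs = [0.20, 0.05, 0.50, 0.10, 0.30]  # hypothetical p(token | context)
num_chars = 24                                 # characters in the same text span

# Average negative log-likelihood per token (cross-entropy, in nats).
nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(nll)                     # lower is better

total_bits = -sum(math.log2(p) for p in token_probs)
bits_per_char = total_bits / num_chars         # lower is better

print(f"perplexity: {perplexity:.2f}, bits per character: {bits_per_char:.2f}")
```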
I'm fascinated by how this team understands the algorithm's political and societal implications. I wonder how OpenAI collaborates with industry and legislators to build algorithms that are beneficial to society, as well as how to decipher this policy framework.
I second Naiyu's question. I would appreciate more explanation of possible social science applications of GPT-2/3, and it would be great if we could be directed to some recent social science papers adopting these models.
I'm quite interested in the performance of neuro-symbolic AI in language model applications.
I am thinking of using GPT-2's goodness to combat its badness. If GPT-2 is powerful enough to generate text that reads almost naturally (so that it may be maliciously used to generate fake news), can we use one GPT-2 to generate fake news and pit another GPT-2 against it to identify the fake news it generates?
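Something along these lines has actually been done: OpenAI released a RoBERTa-based classifier fine-tuned to flag GPT-2 outputs. A rough sketch of the generate-then-detect loop using Hugging Face pipelines; the model names and labels are my assumptions about what is currently on the Hub, not anything from the reading:

```python
# Sketch of the generator-vs-detector idea: one model generates text, a second
# model scores how likely the text is machine-generated. Model names are
# assumptions about what is available on the Hugging Face Hub.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
detector = pipeline("text-classification", model="roberta-base-openai-detector")

prompt = "Scientists announced today that"
generated = generator(prompt, max_length=60, num_return_sequences=1)[0]["generated_text"]

print(generated)
print(detector(generated))  # e.g. a "Real"/"Fake" label with a confidence score
```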
I enjoyed reading the comments up top about the ethical implications of something like GPT-2. Maybe I'm a bit of a luddite at heart, but I guess the question I think should be central to the debate is: how do these kinds of models make life better for humans? Make us happier, healthier, more sustainable. I don't always see that connection, and as social scientists I think it is incumbent on us to make that connection clear.
I've seen GitHub projects that can automatically generate pages of a novel given a few lines of text. I wonder whether GPT models will be able to perform simple creative work in the future? If so, what would the applications or influence on social science be?
Great discussion! However, I don't think that trust in AI is just a matter of people receiving useful information. For example, some studies show that people transfer their opinions from one framework to another: if the system is designed by the government, and/or if you are part of a marginalized community that has historically suffered oppression from a particular branch of power, the reasons for trusting/distrusting those systems could be very different. At the same time, people distrusting AI systems could be a good signal too (for instance, when talking about the dissemination of misinformation). If people keep this alertness and agency toward technological systems and institutions, it could prove positive in terms of accountability for the companies and governments that design those algorithms.
Adding on to Konrat's comments, I agree that AIs can be dangerous, just like the dystopian stories in some sci-fi. However, I think, at least for now, the problems are still about human ethics. In next week's readings, there's a paper about using AI to identify homosexual people; it was an example of an ethically contentious paper in my first quarter's course. It actually shows how AIs work as researchers' "extensions," or "enlargers," of egoism and prejudice.
I was wondering: how large should the conversation dataset be when fine-tuning the GPT-2 model? I have also noted one caveat of this method: GPT-2 models' robustness and worst-case behaviors are not well understood. As with any machine-learned model, we should carefully evaluate GPT-2 for each use case, especially if it is used without fine-tuning or in safety-critical applications where reliability is important. So I was wondering how we can improve the robustness of GPT-2 models?
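On the dataset-size question, I haven't seen a hard rule; tutorials fine-tune GPT-2 on anything from a few hundred to a few hundred thousand lines, and smaller corpora mostly just mean more overfitting and more cherry-picking of outputs. For concreteness, here is a minimal fine-tuning sketch with the Hugging Face libraries; the file path and hyperparameters are placeholders, not recommendations:

```python
# Minimal sketch of fine-tuning GPT-2 on a plain-text conversation corpus with
# Hugging Face. The file path and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token       # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# One utterance per line in a plain-text file (hypothetical path).
dataset = load_dataset("text", data_files={"train": "conversations.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM objective

args = TrainingArguments(output_dir="gpt2-conversations",
                         num_train_epochs=3,
                         per_device_train_batch_size=4)

trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"], data_collator=collator)
trainer.train()
```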
I wonder to what extent traditionally all-human decision-making fields, such as legislation or policy-making, will embrace OpenAI's models or other forms of artificial intelligence in the future, and how they will weigh the possible risks and benefits of this trend?
In "Language Models are Unsupervised Multitask Learners," I am a bit unsure how the GPT-2 and GPT-3 discussed in the article differ from these other ways of analyzing text. How is the unsupervised learning the authors refer to fundamentally different from the topic modeling, word embedding, and deep learning models we have used?
I've seen projects that create a song from the text data of a few songs. I understand that AIs have no way to be innovative and creative, since they just learn the patterns underlying lyrics and music and replicate them. However, I wonder whether AIs could learn the patterns of innovation and creativity in music history and be "innovative".
Post questions here for this week's fundamental readings: Cornell Conversational Analysis Toolkit (ConvoKit) Documentation: Introductory Tutorial; Core Concepts; the Corpus Model.
OpenAI. 2019. “Better Language Models and Their Implications”. Blogpost.