Thinking-with-Deep-Learning-Spring-2022 / Readings-Responses

You can post your reading responses in this repository.

Week 4 - Possible Readings #10

Open lkcao opened 2 years ago

lkcao commented 2 years ago

Post your questions below about: “The moral machine experiment”; “The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings.”; “Show and Tell: A Neural Image Caption Generator” OR “BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization”.

thaophuongtran commented 2 years ago

Question/comment on “The moral machine experiment.” 2018. Awad, Edmond, Sohan Dsouza, Richard Kim, Jonathan Schulz, Joseph Henrich, Azim Shariff, Jean-François Bonnefon, and Iyad Rahwan. Nature 563, no. 7729: 59-64: The results of the moral machine experiment are fascinating, as the authors collected millions of responses to moral dilemmas faced by autonomous vehicles across countries and territories. The paper documents variation in moral decisions across demographics, cross-cultural ethics, countries, modern institutions, and cultural traits. In addition, the data is publicly available, which I'm excited to explore. I wonder if there are other dimensions shaping moral decisions about artificial intelligence that the paper should have touched on but did not. For example, I can see the urban-rural divide and exposure to AI being important factors to consider.

pranathiiyer commented 2 years ago

The paper on the moral machine experiment discusses the idea of consensus when it comes to morality. I can see why that might make sense for this specific case, but how does the idea of aggregating such subjective concepts fare in the ML ecosystem? Is that the only way to develop robust models of human judgement?

sabinahartnett commented 2 years ago

I'm curious what the discussion involved when building the dataset and methods for The Geometry of Culture. A few arguments stuck out to me as suggesting that the use of books and formal text data as an indicator of cultural patterns is, in a way, a form of negative sampling ("we propose formal text analysis as a promising avenue for recovering widely-shared understandings of class from historical populations no longer available for direct observation...Text is particularly well-suited to historical-cultural analysis, as it is often the most semantically-rich record a group leaves behind"), which is, of course, backed by sociological theories about text as a cultural object. In the spirit of last week's class, what additional data do you think would complement this study well, either to increase the dimensions along which we can analyze these cultural patterns or to reinforce the findings of these word embeddings?

yujing-syj commented 2 years ago

For the paper The moral machine experiment, my question is whether moral issues could become a big problem for the future development of self-driving cars or other algorithms, since the machine and its designers must make all of these moral decisions in advance. From the results of the moral machine experiment, we know that people hold different views and values on moral issues and that there is no single right answer. How, then, can future self-driving cars deal with these moral dilemmas? Should the algorithm differ across countries? And when an accident occurs, who should be held responsible?

ValAlvernUChic commented 2 years ago

For the paper, Geometry of Culture, I was interested in how we could align samples of multilingual data to get a more robust understanding of these cultural dimensions across different cultures. For example, are the concepts of class and its characteristics similar in the United States and the United Kingdom? If not, how would you recommend we handle data from these two domains so that we can build an aggregated understanding of their views on culture? Is it as simple as training on both domains together?

JadeBenson commented 2 years ago

I think the Moral Machine experiment is so interesting and reveals meaningful cultural differences, but also many similarities. Since this week is all about sampling, I'd love to discuss how we could design our own research to be this viral! The study was so successful and unique because gamifying it allowed a sample size otherwise unheard of. What are the benefits and limitations of this strategy? Could we reproduce this virality? How?

egemenpamukcu commented 2 years ago

I found the Moral Machine experiment truly interesting and meaningful for starting a debate on whose ethics we should feed into the intelligent machines we are building. It raises numerous important questions, such as whether the ethical decision-making of the same device (e.g., self-driving cars) should differ based on culture and geography, and how often it should be updated as the human moral landscape evolves, considering that AI's presence in our lives will only increase with time. Also, it seemed to me that looking for a globally acceptable set of ethical principles to guide our machines takes moral realism (or moral absolutism) as a given. I have zero problems with this, but I assume moral relativists disagree. Would it be fair or practical to accommodate moral relativism in machine ethics?

ShiyangLai commented 2 years ago

I am really impressed by BoTorch. I think it could be a good approach for efficient hyperparameter search. Unlike traditional grid search or random search, BO-based hyperparameter tuning can act "smarter" during the computation. I would like to learn how BoTorch is used in real computational social science research examples. Are there any useful web resources to learn more about it?
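To make the "smarter than grid search" intuition concrete, here is a minimal, dependency-free caricature of the loop that Bayesian optimization packages like BoTorch implement (this is not BoTorch's actual API): fit a cheap surrogate to the points evaluated so far, score fresh candidates with an acquisition rule that trades off exploitation against exploration, and spend the expensive evaluation only on the most promising candidate. The `objective` function is a hypothetical stand-in for an expensive training run.

```python
import random

def objective(lr):
    # Hypothetical validation loss over a hyperparameter in [0, 1];
    # a stand-in for an expensive model-training run. Optimum at 0.3.
    return (lr - 0.3) ** 2

def acquisition(x, history, kappa=0.5):
    # Toy lower-confidence bound: the surrogate "mean" is the loss at the
    # nearest already-evaluated point; the uncertainty bonus grows with
    # distance from evaluated points, encouraging exploration.
    nearest_x, nearest_y = min(history, key=lambda p: abs(p[0] - x))
    return nearest_y - kappa * abs(x - nearest_x)

def bo_loop(n_init=3, n_iter=10, seed=0):
    rng = random.Random(seed)
    history = [(x, objective(x)) for x in (rng.random() for _ in range(n_init))]
    for _ in range(n_iter):
        candidates = [rng.random() for _ in range(50)]
        # Only the best-scoring candidate gets the expensive evaluation.
        x_next = min(candidates, key=lambda x: acquisition(x, history))
        history.append((x_next, objective(x_next)))
    return min(history, key=lambda p: p[1])  # (best x, best loss)

best_x, best_loss = bo_loop()
```

A real Bayesian optimizer replaces the nearest-neighbor surrogate with a Gaussian process and the score with an acquisition function such as expected improvement, but the control flow is the same.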

borlasekn commented 2 years ago

The Geometry of Culture paper "provides suggestive validation for relational approaches to cultural theorizing". I understand the importance of studying these things relationally, as nothing in culture relates to only one other thing; things are deeply embedded. I was just wondering whether we really could study any aspect of culture geometrically. I'd be interested to hear people's thoughts on the limitations of this sort of work (i.e., what aspects of culture could we not study geometrically?).
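For concreteness, the paper's core geometric move can be sketched in a few lines: build a cultural dimension by averaging the difference vectors of antonym pairs (rich minus poor, affluent minus destitute), then measure where other words fall on that dimension via cosine similarity. The 3-d vectors below are made-up toys, not trained embeddings.

```python
import math

# Hypothetical 3-d "embeddings" (invented values, not trained vectors).
vecs = {
    "rich":       [0.9, 0.1, 0.3],
    "poor":       [-0.8, 0.0, 0.3],
    "affluent":   [0.8, 0.2, 0.1],
    "destitute":  [-0.7, 0.1, 0.2],
    "golf":       [0.5, 0.6, 0.1],
    "basketball": [-0.3, 0.7, 0.2],
}

def affluence_axis(pairs):
    # Average the antonym-pair difference vectors into one dimension.
    diffs = [[a - b for a, b in zip(vecs[hi], vecs[lo])] for hi, lo in pairs]
    return [sum(d) / len(diffs) for d in zip(*diffs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

axis = affluence_axis([("rich", "poor"), ("affluent", "destitute")])
golf_score = cosine(vecs["golf"], axis)              # toward the "rich" pole
basketball_score = cosine(vecs["basketball"], axis)  # toward the "poor" pole
```

In the paper this is done with trained embeddings and much larger antonym wordlists; projecting words like "golf" and "basketball" onto the affluence axis is the kind of comparison that yields their class associations.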

javad-e commented 2 years ago

Question about Awad et al. (2018): One concern I have is about what the results represent. The sample is not random. I understand the researchers control for six personal attributes, but there are many more important characteristics to consider. For example, people with backgrounds in computer science, law, and other fields related to moral values and machines are likely overrepresented in the data, because they are likelier to learn of the survey and be willing to complete it. So can we still generalize the results to these cultures? It could be even more problematic if we believe that in some nations the gap between the moral priorities of the elite and the average citizen is greater than in others.

Emily-fyeh commented 2 years ago

The article "Show and Tell: A Neural Image Caption Generator" proposes a high-accuracy CNN+LSTM model that creates an informative caption for an image input, a solid example of translating images into a textual representation. (The paper illustrates the model structure and fine-tuning parts quite clearly!) I naturally think of applications of this model:

  1. Word context could be added to the input of caption generation, since in most cases images need captions when surrounded by text, whether in a book, news article, or social media post. The exemplary results in the paper show that the model may produce captions based on different aspects of the image, so the surrounding textual content could help the model select the correct/consistent/optimal caption angle.
  2. Another application would be image description for the visually impaired. In this case, captions could focus more on the subject of the picture and provide more detail. (I would be interested in knowing more about this possible application.)

isaduan commented 2 years ago

The paper on Geometry of Culture relies on externally constructed wordlists as evaluative dimensions. Are there methods to automatically detect ideological differences that do not rely on external wordlists? Can we reliably measure the variance of ideologies/opinions along dimensions that would surprise us?

BaotongZh commented 2 years ago

I am really interested in the paper "Show and Tell: A Neural Image Caption Generator"; it actually answers the question I posted last week. Regarding this paper, the assumption behind combining a CNN and an RNN seems to be that we are dealing with a task involving both images and text. Can we extrapolate this into a general principle: that we should combine the models best suited to their respective domains in order to solve complex questions involving diverse data sources and interdisciplinary purposes?
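The combination pattern the paper uses can be sketched abstractly: one module encodes its modality into a fixed-length feature vector, and the other decodes that vector into a sequence. The functions below are made-up stand-ins (no real CNN or LSTM), just to show how the two halves plug together through a single shared vector.

```python
# Toy stand-in for a CNN encoder: compress an "image" (a list of pixel
# intensities) into a fixed-length feature vector.
def encode_image(pixels):
    return [sum(pixels) / len(pixels), max(pixels), min(pixels)]

# Hypothetical word "embeddings" scored against the image features.
VOCAB = {
    "dog":   [0.0, 1.0, 0.0],
    "a":     [1.0, 0.0, 0.0],
    "runs":  [0.0, 0.0, 1.0],
    "<end>": [0.1, 0.1, 0.1],
}

def decode_caption(features, max_len=5):
    # Toy stand-in for an RNN decoder: greedily emit the unused word that
    # scores highest against the image features, until "<end>" wins.
    caption, used = [], set()
    for _ in range(max_len):
        word = max((w for w in VOCAB if w not in used),
                   key=lambda w: sum(f * x for f, x in zip(features, VOCAB[w])))
        if word == "<end>":
            break
        caption.append(word)
        used.add(word)
    return caption

caption = decode_caption(encode_image([0.2, 0.8, 0.5]))
```

In the actual paper the encoder is a GoogLeNet-style CNN, the decoder is an LSTM conditioned on the image features, and decoding uses beam search rather than this greedy loop, but the interface between the two modules is the same single feature vector.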

mdvadillo commented 2 years ago

I found the paper on the Geometry of Culture very interesting. I am curious whether the use of binary classifications to train the model affects the results of the paper. Is a lot of information lost by focusing on two clusters, one at each end of the spectrum, that could be recovered if we looked at, say, three categories instead of two? That is, instead of rich vs. poor, would having rich vs. middle class vs. poor affect the results significantly?

zihe-yan commented 2 years ago

I am also interested in the paper on Geometry of Culture, especially how the authors connect the computational method with social theory. Adding to previous questions, what would happen if I adopted this method in a comparative analysis, for example to explore differences in the meaning of the word "class" between a liberal country and a non-liberal country? Since I would likely be handling different languages and cultural contexts at the same time, what should we do to make such research methodologically robust?
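One standard trick for comparing embeddings trained on different corpora or languages is to align the two spaces with an orthogonal (Procrustes-style) transformation estimated from a small set of shared anchor words, and only then compare projections. Below is a toy 2-d version, where the second space happens to be a pure rotation of the first and the anchor vectors are invented; real pipelines use SVD-based Procrustes on high-dimensional vectors.

```python
import math

def rotate(v, theta):
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def fit_rotation(anchors_a, anchors_b):
    # 2-d Kabsch/Procrustes: find theta so that rotating space B by theta
    # best matches space A on the anchor words (least squares).
    num = sum(b[0] * a[1] - b[1] * a[0] for a, b in zip(anchors_a, anchors_b))
    den = sum(b[0] * a[0] + b[1] * a[1] for a, b in zip(anchors_a, anchors_b))
    return math.atan2(num, den)

# Hypothetical anchor words present in both corpora; corpus B's space is
# corpus A's space rotated by 0.6 radians.
anchors_a = [(1.0, 0.0), (-1.0, 0.2), (0.3, 0.9)]
anchors_b = [rotate(v, 0.6) for v in anchors_a]

theta = fit_rotation(anchors_a, anchors_b)        # recovers -0.6
aligned = rotate(rotate((0.5, 0.5), 0.6), theta)  # a non-anchor word mapped back
```

Once the spaces are aligned, a class dimension built from each language's own antonym wordlist can be compared on a common basis.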

yhchou0904 commented 2 years ago

I am especially interested in the paper "The moral machine experiment." The paper discusses the social expectations that the world as a whole, and different regions, hold of autonomous vehicles, which makes me think of the objective function in reinforcement learning. We often hear that choosing an inappropriate objective function leads to results we don't want. The "social expectation" the paper addresses is one way for manufacturers and designers to decide the objective function of an autonomous vehicle's algorithm. When dealing with these kinds of models, how should we balance how well we meet these social expectations against the feasibility of the model's objective function?
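One simple way to frame that balance is as a scalarized objective: score each candidate behavior on both criteria and combine them with an explicit weight, so the trade-off becomes a tunable parameter rather than an implicit design choice. The behavior names and scores below are entirely hypothetical.

```python
# Hypothetical (social-expectation score, feasibility score) per behavior,
# each in [0, 1]; these numbers are invented for illustration.
DECISIONS = {
    "swerve": (0.9, 0.2),  # matches survey preferences well, hard to execute
    "brake":  (0.7, 0.9),  # slightly less preferred, much easier to execute
}

def combined_score(decision, w_social):
    social, feasible = DECISIONS[decision]
    return w_social * social + (1 - w_social) * feasible

def choose(w_social):
    return max(DECISIONS, key=lambda d: combined_score(d, w_social))

# The chosen behavior flips as the weight on social expectations grows.
moderate = choose(0.70)
strict = choose(0.95)
```

The point of the sketch is that the "how much should social expectation count" question becomes a single explicit parameter that designers, and regulators, can argue about directly.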

y8script commented 2 years ago

'The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings' reminds me of the interpretability of word embeddings, which I always care about. I wonder whether there could be a universal approach to exploring the meaning of dimensions, or combinations of dimensions, through careful social science inspection like this one, one that could transfer to any use case of the model and reveal a piece of the puzzle about the actual content of the embeddings. If there is enough exploration on this front, could we eventually interpret a widely used pre-trained embedding model effectively?

min-tae1 commented 2 years ago

I am interested to see that the researchers of "The Moral Machine Experiment" employed the World Values Survey, which has faced some criticism from scholars, to classify countries. I am curious whether applying deep learning to the data would lead to a categorization similar to that of the World Values Survey.

Yaweili19 commented 2 years ago

For the paper The moral machine experiment: This is a great paper that sheds light on international differences in ethical expectations of machine learning. However, as some have pointed out, I would suggest drawing on other surveys, instead of just one, to support its conclusions. It would also be a good idea to discuss some shortcomings of the WVS and the methods used to compensate for them.

hsinkengling commented 2 years ago

In the Show and Tell paper, I find the sentence structure generated by the algorithm quite interesting. It mostly focuses on the object rather than the background, and it follows a complete sentence structure with one noun, one verb, and one object or prepositional phrase. I wonder whether this structure is influenced by the training data (the instructions given to the human labelers) or by the model design. I also wonder whether changing the sentence structure would allow better accuracy, since many of the inaccuracies happen at the verb or prepositional phrase, while the subject noun is usually correct.

Hongkai040 commented 2 years ago

For the moral machine experiment paper, I am thinking of treating the data collection process as a sampling process. However, this is by no means random sampling, and I wonder whether that matters for such an experiment. For example, due to the nature of the experiment, people with higher education may find it easier to access the platform and take the test. The less educated may be systematically excluded, which is a problem if our target group is the whole population of each country.
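One standard correction for exactly this kind of non-random sample is post-stratification: reweight respondents so that each stratum (here, education level) counts in proportion to its population share. The sample below is fabricated just to show the mechanics.

```python
# Fabricated respondents: (education stratum, 1 if they prefer sparing
# pedestrians, else 0). College graduates are overrepresented: 6 of 8.
sample = [("college", 1), ("college", 1), ("college", 1), ("college", 1),
          ("college", 0), ("college", 1), ("no_college", 0), ("no_college", 1)]

# Hypothetical population shares the sample should be weighted toward.
population_shares = {"college": 0.4, "no_college": 0.6}

def poststratified_mean(sample, population_shares):
    n = len(sample)
    counts = {}
    for stratum, _ in sample:
        counts[stratum] = counts.get(stratum, 0) + 1
    # Each respondent's weight: population share / sample share of their stratum.
    weights = {s: population_shares[s] / (counts[s] / n) for s in counts}
    total = sum(weights[s] for s, _ in sample)
    return sum(weights[s] * y for s, y in sample) / total

raw_mean = sum(y for _, y in sample) / len(sample)
weighted_mean = poststratified_mean(sample, population_shares)
```

Reweighting only helps for strata that appear in the sample at all; groups with no respondents, the deeper worry raised above, cannot be recovered this way.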

linhui1020 commented 2 years ago

The moral machine is really an amazing paper. Although representativeness is a concern, as the authors indicate, the scale of this research reveals different clusters of cultural preference, which can further inspire the algorithm design of self-driving cars. My question is: if we want to actually adopt such an algorithm in the ethical decision-making of self-driving cars, should we rely on the universal results from all countries, or implement different models for regions with different social preferences?