UChicago-Computational-Content-Analysis / Readings-Responses-2024-Winter


4. Word Embeddings to Explore Meaning Spaces- [E4] Murray, Dakota, Jisung Yoon, Sadamori Kojaku, Rodrigo Costas, Woo-Sung Jung, Staša Milojević, and Yong-Yeol Ahn. #37

Open lkcao opened 8 months ago

lkcao commented 8 months ago

Post questions here for this week's exemplary readings:

  1. Murray, Dakota, Jisung Yoon, Sadamori Kojaku, Rodrigo Costas, Woo-Sung Jung, Staša Milojević, and Yong-Yeol Ahn. 2023. “Unsupervised embedding of trajectories captures the latent structure of scientific migration.” PNAS 120 (52): e2305414120.

sborislo commented 7 months ago

The use of Word2Vec seemed to capture a wide array of reasons for international migration quite well. However, many of the "findings" seemed to be validations of Word2Vec's ability to pick up on already-known phenomena. Although it performs well here, how can we know whether a method like Word2Vec will work well in other domains? If a strange relationship turns up (as the one between Iran and Ireland would be, if we lacked the background knowledge to explain it), how do we know whether it is a genuine relationship worth exploring or just a meaningless artifact of the method, like the ones we've seen before?

Twilight233333 commented 7 months ago

The new tools used by the authors are impressive and show some good results. What I'm curious about is the extent to which we can use this approach to test whether cultural influences are significant in a particular area. For example, if the authors' research shows that culture is important in scientific migration, can we transfer the approach to, say, the United Nations? Could we test whether culture influences the voting behavior of similar countries by analyzing votes in the United Nations General Assembly? If the results are significant, does it follow that culture matters in UN votes? And if the results are not significant, does that necessarily mean that culture does not matter in UN votes?

cty20010831 commented 7 months ago

I personally think the use of word2vec to examine scientific migration, particularly connecting it to the gravity model to study migration beyond the purely geographical sense, was cool. But I am having trouble understanding how the authors deduce the prestige of universities using SemAxis. What is the statistical rationale behind it?
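For anyone else puzzling over the SemAxis step, here is a minimal sketch of the general idea as I understand it: build an axis as the difference between the mean vector of a "high" pole and the mean vector of a "low" pole, then score every other item by its cosine similarity to that axis. The 2-d vectors, the institution names, and the choice of pole members below are all made up for illustration; this is not the authors' code or data.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def semaxis_score(vec, pos_pole, neg_pole):
    """Project vec onto the axis running from the negative pole to the positive pole."""
    axis = [p - n for p, n in zip(mean_vector(pos_pole), mean_vector(neg_pole))]
    return cosine(vec, axis)

# Hypothetical 2-d embeddings; real ones would come from a trained model.
embeddings = {
    "top_a": [1.0, 0.2],
    "top_b": [0.9, 0.1],
    "low_a": [-1.0, 0.1],
    "low_b": [-0.8, 0.3],
    "mid_x": [0.3, 0.9],
    "mid_y": [-0.4, 0.8],
}

# Assumed pole sets: institutions we already believe sit at each extreme.
pos_pole = [embeddings["top_a"], embeddings["top_b"]]
neg_pole = [embeddings["low_a"], embeddings["low_b"]]

scores = {name: semaxis_score(vec, pos_pole, neg_pole)
          for name, vec in embeddings.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:+.3f}")
```

So the "rationale" is geometric rather than a statistical test: the score only says how closely an item's direction aligns with the line between the two poles, and it depends entirely on which items are chosen as poles.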

floriatea commented 7 months ago

The paper mentions that the institutions ranked higher in the embedding-based ranking tend to be relatively small universities near major urban areas, such as the University of San Francisco and the University of Maryland Baltimore County, possibly reflecting exchanges of scholars with nearby highly ranked institutions at these locations. This analysis is not limited to the United States. However, among the ten countries with the most universities represented in the Leiden rankings, why would every country except China have a Spearman’s ρ ≥ 0.5 between its prestige axis and the relative rankings of its universities?
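To make the Spearman comparison concrete, here is a small sketch of how a rank correlation between embedding-based scores and an external ranking could be computed. The five scores and ranks are invented toy values, not numbers from the paper, and the helper names are my own. Because rank 1 is "best" while a higher score means "more prestigious", the ranks are negated before correlating.

```python
import math

def average_ranks(values):
    """Rank values in ascending order from 1; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy values (assumptions): embedding-axis scores and an external ranking.
axis_scores = [0.9, 0.7, 0.4, 0.1, -0.3]
external_rank = [1, 3, 2, 4, 5]  # 1 = best

rho = spearman_rho(axis_scores, [-r for r in external_rank])
print(f"Spearman's rho = {rho:.2f}")
```

Since ρ only compares orderings, a low value for one country means the two orderings disagree there; it does not say which ordering is the "wrong" one.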