Thinking-with-Deep-Learning-Spring-2024 / Readings-Responses

You can post your reading responses in this repository.

Week 6. Apr. 26: Auto-encoders, Network & Table Learning - Possibilities #12

Open JunsolKim opened 2 months ago

JunsolKim commented 2 months ago

Pose a question about one of the following articles:

“DeepWalk: Online Learning of Social Representations,” 2014. B. Perozzi, R. Al-Rfou, S. Skiena. KDD.

“Quantifying social organization and political polarization in online platforms,” 2021. Waller, Isaac, and Ashton Anderson. Nature.

“Social centralization and semantic collapse: Hyperbolic embeddings of networks and text,” 2020. Linzhuo Li, Lingfei Wu, James Evans. Poetics.

“Disrupted routines anticipate musical exploration,” 2024. Kim, Khwan, Noah Askin, and James A. Evans. PNAS.

maddiehealy commented 1 month ago

300-400 Word Summary/Reflection: This week I read "Hyperbolic Graph Neural Networks at Scale: A Meta Learning Approach," an article that introduces a method called Hyperbolic GRAph Meta Learner (H-GRAM), designed to address the limitations of hyperbolic neural networks (HNNs) in scaling to large graph datasets. Traditional HNNs are well suited to capturing hierarchical data structures, but they are limited when it comes to generalizing to new tasks and scaling. The proposed H-GRAM addresses these limitations with meta-learning techniques, which allow the network to learn from small, localized subgraphs and then apply this knowledge to larger (often disjoint) graphs. With H-GRAM, the potential applications of HNNs increase significantly. For social science research, H-GRAM opens new possibilities for analyzing complex networked data (social networks, citation networks, hierarchies and dynamics within large organizational structures, etc.): essentially, any area that focuses on understanding hierarchical interactions and tracking the spread of influence. For example, I can see this being applicable to analyzing Twitter data (and any data that records interactions). Practically, interactions would form a graph where users = nodes and interactions = edges, a setup conducive to H-GRAM. Through this setup, we could analyze key influencers via subgraphs or pivotal tweets. Ultimately, we would better understand how certain topics gain attention over time, perhaps even tracking how they polarize. Moreover, H-GRAM suggests a shift toward more dynamic and adaptable neural network models that can handle real-world data without needing complete datasets. This adaptability could transform data science strategies used in predictive modeling and real-time data analysis, offering tools that are not only more efficient but also more reflective of complex real-world data structures.
Overall, H-GRAM's approach to learning from local graph structures and generalizing to broader contexts could revolutionize how we understand and predict patterns within large and complex datasets.
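
To make the "users = nodes, interactions = edges" setup concrete, here is a minimal sketch in plain Python. It is not H-GRAM itself; it only shows how interaction records could become a graph and how a small local subgraph around one user could be extracted. The records, user names, and the helper names `build_graph` and `k_hop_subgraph` are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical interaction records: (source_user, target_user, interaction_type).
interactions = [
    ("alice", "bob", "reply"),
    ("bob", "carol", "retweet"),
    ("carol", "dave", "mention"),
    ("dave", "erin", "reply"),
    ("alice", "carol", "retweet"),
]

def build_graph(records):
    """Undirected adjacency sets: users are nodes, interactions are edges."""
    adj = defaultdict(set)
    for src, dst, _kind in records:
        adj[src].add(dst)
        adj[dst].add(src)
    return adj

def k_hop_subgraph(adj, center, k):
    """Return the set of nodes within k hops of `center` (breadth-first search)."""
    seen = {center}
    frontier = deque([(center, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand past the k-hop boundary
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen

graph = build_graph(interactions)
local = k_hop_subgraph(graph, "alice", 2)
print(sorted(local))  # ['alice', 'bob', 'carol', 'dave']
```

Here "erin" is three hops from "alice" and so falls outside the 2-hop neighborhood. A local-subgraph learner would train on many such neighborhoods rather than on the full graph.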

Question: This article comes from the perspective of promoting H-GRAM as an improved extension of HNNs. I would be interested to see specific performance metrics used to compare H-GRAM and HNNs. I'm also interested in H-GRAM's adaptability – it clearly allows for an extension of studies, specifically to more "social" data through the meta-learning approach, but how flexible is it to rapidly changing data, like a social media feed, in practice?

kceeyang commented 1 month ago

300-400 word reflection:

After reading Chapter 12, I was eager to learn more about the difference between the GNN inductive and transductive learning approaches and how they can be applied to real-world situations. In the article “Inductive-transductive learning for very sparse fashion graphs”, the authors introduced a mixed GNN learning strategy to deal with a large, sparse graph. They demonstrated an industrial use case where a fashion item dataset was being analyzed for a link prediction task. Their detailed explanation of the difference between inductive and transductive has answered my previous question in the orienting reading post. Since the predictor in inductive learning can generalize to new nodes that were not present during the training process, the use of inductive learning methods would have a greater effect than transductive learning on continuously evolving graphs with new nodes, like new posts/comments on social media, and would generate embeddings of unseen nodes quicker for more real-world applications.
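
A toy contrast may help here. The sketch below assumes the simplest possible setup: a transductive model is treated as a lookup table of trained vectors, and an inductive model as a function of a node's neighbors (plain mean aggregation, a stand-in for the learned aggregators in real GNNs, not the method from the fashion-graph article). All node names and vectors are made up.

```python
# Transductive: one learned vector per node seen in training (a lookup table).
trained_embeddings = {
    "post_a": [1.0, 0.0],
    "post_b": [0.0, 1.0],
    "post_c": [1.0, 1.0],
}

def transductive_embed(node):
    """Look up a stored vector; raises KeyError for nodes absent during training."""
    return trained_embeddings[node]

def inductive_embed(neighbors, features):
    """Mean-aggregate neighbor vectors, so any node with known neighbors gets a vector."""
    dims = len(next(iter(features.values())))
    total = [0.0] * dims
    for n in neighbors:
        for i, value in enumerate(features[n]):
            total[i] += value
    return [t / len(neighbors) for t in total]

# A new post arrives after training, linked to post_a and post_c.
new_post_neighbors = ["post_a", "post_c"]
new_vec = inductive_embed(new_post_neighbors, trained_embeddings)
print(new_vec)  # [1.0, 0.5]
```

Calling `transductive_embed("post_new")` would fail because the table has no entry for it, which is the practical reason inductive methods suit graphs that keep gaining nodes.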

The authors then put forward a combined inductive-transductive DGN (deep graph network) approach to address the challenge of link prediction on large, sparse graphs. In their application scenario, the graphs represent various fashion item categories, with the nodes and edges denoting the products and the relation between the product and the associated assortment. Their proposed framework first trained an inductive model to generate new local structures for the disconnected or sparsely connected nodes. This was followed by an inductive enrichment step to enhance the performance of the later transductive link prediction tasks. The innovative aspect of this approach lies in how it leverages the inductive learning step to boost the quality of the node representations learned by a transductive learning process, thereby significantly improving the accuracy of the resulting prediction of compatibility between fashion items compared to the standard model without enrichment.

The potential of this mixed inductive-transductive approach is not limited to fashion item datasets. It could plausibly be applied to link prediction tasks in very large, sparse social networks. Consider a dataset of follower-followee relationships, where the networks can be vast and sparse. For instance, a social media platform can have an enormous number of active users, yet any single account is typically followed by only a few of them. In such cases, an inductive-transductive approach could be a viable solution for a link prediction problem.

Pei0504 commented 1 month ago

Considering the ability of DeepWalk to generate social representations from network vertices using truncated random walks, how effective would DeepWalk be in adapting to rapidly evolving social networks where vertex connections frequently change? Specifically, could the latent representations generated by DeepWalk be efficiently updated without complete retraining, to reflect these changes dynamically, thereby maintaining accuracy in social network analysis tasks like community detection or anomaly detection over time?
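
One way to think about this question is to ask which random walks go stale when an edge changes. The sketch below is only a heuristic under my own assumptions, not something taken from the DeepWalk paper: after adding an edge, it re-samples walks for every start vertex whose stored walks touch either endpoint and leaves the rest alone. The refreshed walks could then be fed to further skip-gram updates; whether that keeps the embeddings accurate is exactly the empirical question. The function names and toy graph are hypothetical.

```python
import random

def random_walk(adj, start, length, rng):
    """Truncated random walk: up to `length` vertices, uniform choice among neighbors."""
    walk = [start]
    while len(walk) < length:
        nbrs = sorted(adj.get(walk[-1], ()))
        if not nbrs:
            break  # dead end: stop early
        walk.append(rng.choice(nbrs))
    return walk

def sample_walks(adj, starts, walks_per_vertex, length, rng):
    """Sample several walks from each start vertex."""
    return {v: [random_walk(adj, v, length, rng) for _ in range(walks_per_vertex)]
            for v in starts}

def add_edge_and_refresh(adj, walks, u, v, walks_per_vertex, length, rng):
    """Add edge (u, v); re-sample walks only for start vertices whose walks touch u or v."""
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
    stale = {s for s, ws in walks.items() if any(u in w or v in w for w in ws)}
    stale |= {u, v}
    walks.update(sample_walks(adj, stale, walks_per_vertex, length, rng))
    return stale

# Three disconnected pairs; then an edge b-c joins the first two.
adj = {"a": {"b"}, "b": {"a"}, "c": {"d"}, "d": {"c"}, "e": {"f"}, "f": {"e"}}
rng = random.Random(0)
walks = sample_walks(adj, list(adj), walks_per_vertex=2, length=4, rng=rng)
stale = add_edge_and_refresh(adj, walks, "b", "c", 2, 4, rng)
print(sorted(stale))  # ['a', 'b', 'c', 'd']
```

Walks starting at "e" and "f" are untouched, so only part of the corpus is regenerated. In a dense, fast-changing network most walks might be affected, and the saving would shrink.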

guanhongliu2000 commented 1 month ago

I would recommend the article "Auto-Encoders in Deep Learning—A Review with New Perspectives" by Shuangshuang Chen and Wei Guo, which presents a comprehensive review of, and new insights into, auto-encoders (AEs), a pivotal component of deep learning. The paper captures the essence of AEs and their evolving role across a spectrum of applications, making it a significant contribution to the literature on the subject.

Firstly, the article is commendable for its structured approach in explaining the fundamentals of auto-encoders, starting from the basic concepts and progressively delving into more complex variants. This systematic exposition is particularly beneficial for both newcomers to the field as well as seasoned researchers, as it provides a clear pathway through the potentially dense topic of neural network architectures and their applications. The authors begin by defining AEs and then explore various types such as denoising auto-encoders (DAEs), sparse auto-encoders (SAEs), and variational auto-encoders (VAEs), among others. This segmentation not only elucidates the operational mechanisms of each variant but also highlights their specific utility in different scenarios.

Another strength of the article lies in its extensive review of the applications of AEs across diverse fields such as pattern recognition, computer vision, data generation, and more. By providing examples of practical implementations, the authors successfully demonstrate the versatility and robustness of AEs. This not only reinforces the relevance of AEs in current research and industry applications but also stimulates ideas for future innovations.

Moreover, the paper stands out for its forward-looking perspective, particularly in the sections discussing future trends and challenges. Here, the authors not only summarize the current state of AE technology but also identify key areas for further research, such as integration with other deep learning models, improving training efficiency, and enhancing unsupervised learning capabilities. Such insights are invaluable for guiding ongoing research and development efforts, ensuring that the academic and practical pursuits within the AI community remain aligned with the evolving technological landscape.

The inclusion of a detailed analysis of the relationship between AEs and other deep learning models like deep belief networks (DBNs) and convolutional neural networks (CNNs) is particularly enlightening. This discussion not only helps in understanding the comparative strengths and limitations of AEs but also situates them within the broader framework of neural network research, emphasizing the interconnectedness of these technologies.

Critically, the article does well to address the challenges and limitations inherent to AEs, such as the difficulty in training deep models and the risk of overfitting. By not shying away from these less favorable aspects, the review provides a balanced view that enhances its utility as a reference. It encourages researchers to not only leverage the strengths of AEs but also to innovate on their shortcomings.

In summary, this article by Chen and Guo offers a thorough and insightful examination of auto-encoders within the field of deep learning. Its comprehensive coverage of both foundational concepts and advanced topics, combined with a forward-looking perspective on challenges and innovations, makes it an exemplary piece on the subject. It serves both as an educational tool for those new to the field and as a springboard for future research, highlighting the dynamic and evolving nature of auto-encoders in artificial intelligence.

00ikaros commented 1 month ago

Regarding "DeepWalk: Online Learning of Social Representations," would the distance between adjacent vertices impact the random walk?
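
A small sketch of the two cases, for reference. In the DeepWalk paper each step samples uniformly from the current vertex's neighbors, so on an unweighted graph a "distance" on the edge plays no role. The weighted variant below is an assumption for illustration (the step probability is proportional to edge weight); if distance means dissimilarity, one would first need to convert it to a weight, for example by inverting it. The toy graph and function names are hypothetical.

```python
def uniform_step(adj, vertex):
    """Unweighted walk: every neighbor is equally likely as the next vertex."""
    nbrs = adj[vertex]
    return {n: 1.0 / len(nbrs) for n in nbrs}

def weighted_step(weighted_adj, vertex):
    """Weighted variant (an assumption): probability proportional to edge weight."""
    nbrs = weighted_adj[vertex]
    total = sum(nbrs.values())
    return {n: w / total for n, w in nbrs.items()}

adj = {"u": ["v", "w"]}
weighted_adj = {"u": {"v": 3.0, "w": 1.0}}

print(uniform_step(adj, "u"))            # {'v': 0.5, 'w': 0.5}
print(weighted_step(weighted_adj, "u"))  # {'v': 0.75, 'w': 0.25}
```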

Xtzj2333 commented 1 month ago

'AI-Augmented Surveys: Leveraging Large Language Models and Surveys for Opinion Prediction'

The LLM model in this paper shows impressive accuracy in predicting missing data in large-scale surveys that have binary choices. Would it show similar or much lower accuracy if the choice options were more than two, e.g. a 7-point likert scale?

HongzhangXie commented 1 month ago

In the study "Quantifying social organization and political polarization in online platforms," the authors employ the word2vecf model to treat communities as words and instances of user comments within these communities as word-context pairs, thereby validating the communities' scores and identifying those with partisan political tendencies.

In this analysis, we define whether a community is partisan based on the activities of its users. Subsequently, we analyze the political polarization of users based on their activities within these partisan communities. That is, the nature of the community is defined by the activities of its users, and in turn, the nature of the community defines the users' participation in partisan activities. Could this circular definition introduce unexpected issues into the results?
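
To make the circularity concrete, here is a toy sketch. It assumes a community is scored by projecting its embedding onto an axis between two seed communities, and a user is scored by the activity-weighted mean of the communities they comment in. This is my simplification for discussion and may not match the paper's exact procedure; the 2-d vectors, community names, and comment counts are invented.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 2-d community embeddings (real ones are learned from user activity).
community_vecs = {
    "left_seed": [-1.0, 0.0],
    "right_seed": [1.0, 0.0],
    "hobby": [0.0, 1.0],
    "leaning_right": [1.0, 1.0],
}

# Partisan axis: difference between the two seed communities, here [2.0, 0.0].
axis = [r - l for l, r in zip(community_vecs["left_seed"], community_vecs["right_seed"])]

community_score = {name: cosine(vec, axis) for name, vec in community_vecs.items()}

def user_score(comment_counts):
    """Activity-weighted mean of community scores for one user."""
    total = sum(comment_counts.values())
    return sum(community_score[c] * n for c, n in comment_counts.items()) / total

print(round(community_score["hobby"], 3))                   # 0.0
print(round(user_score({"hobby": 3, "right_seed": 1}), 3))  # 0.25
```

Both the community vectors and the user scores are derived from the same activity data, which is where the dependence in the question enters. One possible check would be to fit the community scores on one set of users and score a held-out set.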

uc-diamon commented 1 month ago

In regards to Disrupted Routines Anticipate Musical Exploration, how can there be more positive disruptions (versus negative disruptions like COVID) to foster more openness to new experiences?

risakogit commented 1 month ago

"Disrupted routines anticipate musical exploration" How generalizable are the findings, given that the research utilized data from only nine countries? Would a broader dataset, with less data from each country but including more countries, lead to a more robust analysis?

anzhichen1999 commented 1 month ago

Question on "DeepWalk: Online Learning of Social Representations": is it still widely used in 2024?

CYL24 commented 1 month ago

I would like to recommend the article "A Comprehensive Survey on Deep Graph Representation Learning." This article provides a comprehensive review of up-to-date deep graph representation learning algorithms. It could help readers gain a basic understanding of each algorithm (detailed description, characteristics, and limitations), along with real-world applications and potential research directions in domains such as social analysis, drug discovery, recommender systems, and traffic analysis. This information would also be useful when a reader wants an overview comparison.

First, the article explains why we need deep graph representation learning by briefly reviewing and comparing three traditional graph embedding methods (matrix factorization-based methods, random walk-based methods, and other non-GNN deep methods). Given that graph-structured data is usually high-dimensional, complex, and irregular, deep graph representation learning is significant because of its ability to capture topological properties and feature attributes of graphs while ensuring scalability across diverse datasets.

Then the article summarizes GNN architectures, including graph convolutions, graph kernel neural networks, graph pooling, and graph transformers. Based on their training objectives, the article presents three of the most recent learning paradigms: supervised/semi-supervised learning on graphs (which aims for better performance under label scarcity, but remains unsatisfactory for graph-level representation learning, unbalanced datasets, and potential domain shifts), graph self-supervised learning (better generalization and robustness with less reliance on labels than semi-supervised learning, but without a solid theoretical basis), and graph structure learning (used for graph representations that are more robust to adversarial attacks, but hard to optimize and not yet satisfying in performance).

After that, the article presents several promising applications in diverse domains to demonstrate the effectiveness of deep graph representation learning. For social analysis, the article summarizes three kinds of social networks: academic social networks (applications: classification/clustering, relationship prediction, recommendation), social media networks (applications: anomaly detection, sentiment analysis, influence analysis), and location-based social networks (applications: POI recommendation, urban computing).

In summary, this article could serve as a recap of recent advancements in deep graph representation learning, offering an overview comparison of methods along with several practical applications. It might help someone find a clearer direction for the final project.

Marugannwg commented 1 month ago

Thanks for sharing a lot of interesting ideas beyond the recommended readings in the course material~ As I skimmed through those papers, I found that networks can be much more flexible and versatile than I thought -- as long as we have some structured data (vertices) at scale and some links/traffic between them, we seem to be able to apply some deep neural learner to the structure.

I wonder if any space we have constructed (e.g. the embeddings we've seen of research papers, researchers, music, areas...) can be thought of as a network, especially if we are interested in the relationships between vertices?

mingxuan-he commented 1 month ago

Regarding "Quantifying social organization and political polarization in online platforms": Since many social platforms introduce some level of randomness in their recommendation/content ranking algorithms, is it possible to quantify the effect of random sampling on "breaking barriers" across political organizations? How does the content structure, e.g. community-based like Reddit vs. freeform like TikTok, impact polarization?

erikaz1 commented 1 month ago

Inspired by the DeepWalk article by Perozzi et al. (2014): Would DeepWalk models perform as well if some proportion of the random walks had varying lengths at various starting vertices, so that some walks can traverse farther while the majority remain at some consistent length t? Also, are most network structures created by non-random walks and data?
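
The mixed-length idea is easy to prototype. Below is a minimal sketch, assuming a mixture where each walk is long with probability `p_long` and otherwise has the standard length t. The parameter names, the ring graph, and the values (t = 4, long length 10, 20% long walks) are all hypothetical; whether this helps downstream performance would need an experiment.

```python
import random

def sample_walk_length(base_t, long_t, p_long, rng):
    """Return long_t with probability p_long, otherwise the standard length base_t."""
    return long_t if rng.random() < p_long else base_t

def random_walk(adj, start, length, rng):
    """Uniform random walk of exactly `length` vertices (graph has no dead ends)."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(adj[walk[-1]]))
    return walk

# Hypothetical ring graph of 6 vertices, so walks never dead-end.
adj = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
rng = random.Random(0)

# Five walks per start vertex, each with its own sampled length.
walks = [random_walk(adj, v, sample_walk_length(4, 10, 0.2, rng), rng)
         for v in adj for _ in range(5)]
lengths = sorted({len(w) for w in walks})
print(lengths)
```

The longer walks put more distant vertices into the same skip-gram context, so they act a bit like a larger window for a subset of the corpus.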

hantaoxiao commented 1 month ago

The DeepWalk model is particularly notable for its adaptability and scalability, attributes crucial for handling dynamic and large-scale networks like social media platforms. Its ability to incrementally incorporate changes in the network without a complete re-training makes it a promising tool for real-time network analysis. How does the DeepWalk approach to learning social representations compare with other machine learning methods in terms of scalability and adaptability to changes in network topology?

HamsterradYC commented 1 month ago

I’d like to recommend "Quantifying the spatial homogeneity of urban road networks via graph neural networks." This article by Jiawei Xue and colleagues presents a novel approach to analyzing urban road networks using graph neural networks (GNNs). The authors introduce a method to quantify the spatial homogeneity of urban road networks, which measures the similarity of topological signatures within and between cities. By applying this method to over 11,000 urban road networks across 30 global cities, they uncover correlations between network homogeneity and socioeconomic indicators such as GDP and population growth. The findings suggest that more homogeneously planned cities are often those with higher GDP and more controlled population growth.

This use of GNNs can be particularly effective in analyzing large-scale spatial data to uncover patterns that are not apparent through traditional statistical methods. For example, this approach could be adapted to study how urban planning influences social outcomes like economic mobility, health disparities, and access to services.

To pilot a study using this method, one could focus on the relationship between road network homogeneity and public health outcomes. The hypothesis would be that cities with higher spatial homogeneity have better public health outcomes due to more efficient access to healthcare services and recreational areas, contributing to lower stress and healthier lifestyles. Social data for this study could include health metrics such as rates of chronic diseases, hospital access, and average health expenditures, juxtaposed against measures of road network homogeneity.

The necessary data for this pilot could be sourced from several databases:

Road Network Data: OpenStreetMap provides extensive global road network data that can be used to calculate spatial homogeneity.

Health Data: World Health Organization (WHO) and Centers for Disease Control and Prevention (CDC) websites offer comprehensive health-related data by city or region.

Socioeconomic Data: The World Bank and national statistics databases often contain GDP, population growth, and other economic indicators.

“Quantifying social organization and political polarization in online platforms”: The paper employed neural embedding techniques to analyze community structures and political polarization on Reddit. While this approach, based on aggregated behavioral patterns, reveals relationships and trends among communities, it might have limitations in capturing individual behavior changes and micro-dynamics. How did you address and measure the potential oversight of individual user behavior diversity and complexity in constructing your model? Additionally, how robust is this method in handling data noise and non-standard user behaviors?

XueweiLi1027 commented 1 month ago

For this week's reading, I recommend The Evolution of Political Memes: Detecting and Characterizing Internet Memes with Multi-modal Deep Learning

Reflection: The article by Beskow, Kumar, and Carley presents a pioneering study on the detection and evolution of internet memes, particularly within the context of political discourse. The authors introduce "Meme-Hunter," a multi-modal deep learning model designed to classify images as memes or non-memes. The research leverages image similarity, meme-specific optical character recognition (OCR), and face detection to analyze meme families shared on Twitter during the 2018 US Mid-term elections and the 2018 Swedish National Elections. The study confirms Richard Dawkins' concept of meme evolution, demonstrating how memes propagate and evolve through a process of mutation and inheritance, often influencing cultural and societal biases.

The methodology employed in this research can extend social science analysis in understanding the role of digital artifacts in shaping public opinion and political behavior. By using deep learning to classify and cluster memes, social scientists can gain insights into the diffusion of ideas and cultural phenomena in the digital age. This approach can be used to analyze the spread of misinformation, the impact of political campaigns, and the formation of echo chambers online. Furthermore, the detection of memes can help in studying the dynamics of social movements and the role of humor in political communication.

To pilot the use of Meme-Hunter in a social science context, I could imagine researchers focusing on a specific case study, such as political elections in any country. The social data for this pilot could include:

Twitter Data: Collect tweets from the election period that mention political figures or hashtags related to the election. This data can be obtained through Twitter's API, ensuring compliance with data usage policies.

Reddit Data: Gather posts from subreddits dedicated to political discussions, as Reddit often serves as a source for political memes that later appear on other platforms.

Image and Text Data: Extract images and associated text from the collected tweets and Reddit posts. The text data will be used for OCR and to train the text classifier, while the images will be used to train the CNN-based model.

The implementation would involve training the Meme-Hunter model on a subset of this data, classifying the collected images, and then using graph learning techniques to map the evolutionary tree of meme families. This pilot could reveal the most popular and influential memes, how they spread, and their potential impact on public opinion during the election period.

Question: How did the authors ensure the robustness of their multi-modal deep learning model, Meme-Hunter, in classifying memes across various political contexts and cultural nuances, given the inherent complexity and subjectivity involved in meme interpretation? The paper discusses a model that classifies internet memes, which are not only culturally diverse but also subject to individual interpretation. Memes can have different meanings in different cultural and political contexts, and what is considered a meme in one setting might not be universally recognized as such. It is not hard to see that the authors' approach to training and validating their model, including how they handled the potential for bias and the diversity of meme types, is critical for the model's effectiveness and generalizability.

hantaoxiao commented 1 month ago

Auto-Encoders and Network Learning: How does the ability of H-GRAM to adapt to changes in data structures compare to traditional auto-encoder approaches? Could H-GRAM potentially offer better performance metrics or more flexibility in practical scenarios such as real-time social media analytics?

Impact of Learning Approaches on Social Data: Considering the inductive-transductive learning method discussed, how might this approach enhance the accuracy and efficiency of real-time link prediction in social media platforms, where the data is continually evolving and expanding?

00ikaros commented 2 weeks ago

What is DeepWalk, and how does it learn latent representations of vertices in a network? How does DeepWalk generalize advancements in language modeling to graphs, and what role do truncated random walks play in this process? Additionally, how does DeepWalk perform in multi-label network classification tasks for social networks like BlogCatalog, Flickr, and YouTube, particularly in comparison to challenging baselines with a global view of the network? What are the advantages of DeepWalk in terms of handling sparse labeled data and scalability for real-world applications such as network classification and anomaly detection?

beilrz commented 1 week ago

For DeepWalk and graph embedding in general, one question/concern I have is how such an approach generalizes to new nodes. For NLP tasks, as long as the dataset is big enough, we should rarely observe out-of-vocabulary tokens, and we can even transform these novel tokens into known tokens using sub-words. However, for graph embedding, where each node is treated as unique, it would be difficult to generalize to a novel graph. Is there some way to improve this by leveraging NLP techniques?

icarlous commented 1 week ago

In “Quantifying Social Organization and Political Polarization in Online Platforms”: Can we measure how randomness in recommendation algorithms influences political barrier-breaking? How do structured communities (Reddit) vs. freeform platforms (TikTok) affect polarization?

Carolineyx commented 1 week ago

For this week, I would like to recommend "Rethink data-driven human behavior prediction: A Psychology-powered Explainable Neural Network".

Summary:

The article introduces the Psychology-powered Explainable Neural Network (PEN). This innovative framework integrates psychological factors into data-driven models to enhance the prediction and explanation of human behavior. Traditional data-driven models often overlook deeper psychological aspects, limiting their accuracy and explainability. PEN addresses this gap by recovering latent psychological features from historical behavior data and using multi-task optimization to improve model performance. The study demonstrates PEN's superiority over existing models in terms of accuracy and generalizability across various evaluation protocols, highlighting its potential in scenarios where psychological data is typically unavailable.

Extending Social Science Analysis:

The PEN framework described in the article can significantly extend social science analysis, particularly in understanding complex social interactions and behaviors. By integrating psychological factors into predictive models, PEN can be used to study the dynamics of social networks more accurately. For instance, in embedding social networks, PEN can enhance tasks such as link prediction and community detection by incorporating latent psychological states. This approach aligns with the exploration of network properties and the enhancement of association predictions through deep neural networks, as discussed in the weekly topics.

Moreover, PEN's ability to explain the underlying psychological mechanisms driving behavior can be particularly useful in studying phenomena like political polarization, social influence, and communication patterns on online platforms. By applying PEN to social media data, researchers can gain deeper insights into the psychological drivers of online interactions, contributing to a more comprehensive understanding of social dynamics.

Pilot Use of Social Data:

To pilot the use of PEN in extending social science analysis, I would propose a study focusing on predicting user engagement and behavior on social media platforms, such as Twitter and Reddit. The social data required would include:

User Demographics: Age, gender, location, and occupation of users.

Interaction Data: Posts, comments, likes, retweets, and replies.

Content Data: Textual content of posts, including hashtags, keywords, and sentiment analysis.

Network Data: Connections between users, such as follower-following relationships and engagement metrics.

Psychological Data: User attitudes, emotions expressed in posts, and interaction patterns (collected through text analysis and sentiment analysis).

By inputting this data into the PEN framework, we can generate predictions on user engagement, such as the likelihood of users participating in discussions, sharing content, or forming new connections. These predictions can help identify key psychological factors influencing user behavior, providing valuable insights for improving user experience and platform management. Additionally, the explainable nature of PEN allows for a better understanding of how psychological factors interact with social and contextual variables, offering a more holistic view of online social behavior.

Brian-W00 commented 1 week ago

The article "DeepWalk: Online Learning of Social Representations" explores how to use neural network methods to embed social graphs. What specific technical challenges and limitations does DeepWalk face when processing large-scale dynamic social network data?

MarkValadez commented 1 week ago

As I read about and worked with the DeepWalk algorithm, it seemed that, while it is able to create some random ego-subgraph from the main graph G, it does not truly capture the structural components of the graph which are often of interest. I found label propagation or the Levenshtain community algorithm better at yielding interpretable components of the graph. On another note, however, the DeepWalk algorithm seems useful for block model analysis, in which we consider cliques separated by some structural barrier. At the same time, the idea of simulating sampling error through DeepWalk allows one to potentially analyze the invariance of a network graph and its statistical properties under different sampling methods.