Thinking-with-Deep-Learning-Spring-2024 / Readings-Responses

You can post your reading responses in this repository.

Week 3. Apr. 5: Sampling, Bias, and Causal Inference with Deep Learning - Orienting #5

Open JunsolKim opened 3 months ago

JunsolKim commented 3 months ago

Post your questions here about: “The Datome - Finding, Wrangling and Encoding Everything as Data” and “When Big Data is Too Small - Sampling, Crowd-Sourcing and Bias” (Thinking with Deep Learning, chapters 5 and 7); and “Deep Learning for Causal Inference” by Bernard Koch, Tim Sainburg, Pablo Geraldo, Jiang Song, Yizhou Sun, and Jacob G. Foster.

maddiehealy commented 3 months ago

We have done some work on reducing the dimensionality of neural networks before, so this part of chapter 5 caught my attention: "text data are high-dimensional in that the meaning underlying each word, as a string of characters, is the same or different largely not based on components within the word itself (e.g., the distribution or ordering of letters), but the underlying definition."

On a practical level, what is happening when we reduce the dimensionality of text data? Are we essentially losing entire alternative definitions of these words?
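
In practice, reducing the dimensionality of text usually means replacing huge, mutually orthogonal one-hot vectors with dense learned embeddings. A minimal sketch (the projection below is random, whereas in a real model it is learned; the tiny vocabulary is invented for illustration):

```python
import numpy as np

# Toy vocabulary: one-hot vectors are V-dimensional and mutually orthogonal,
# so they encode no notion of similarity between words.
vocab = ["cat", "dog", "bank", "river"]
V = len(vocab)
one_hot = np.eye(V)

# A projection (random here, learned in practice) maps each word to a dense
# low-dimensional embedding; training places words used in similar contexts
# near each other in this space.
rng = np.random.default_rng(0)
d = 2
W = rng.normal(size=(V, d))
embeddings = one_hot @ W   # shape (V, d): one dense vector per word

print(embeddings.shape)  # (4, 2)
```

On the "alternative definitions" worry: a single vector per word type does collapse distinct senses (e.g. "bank") into one point, which is one reason contextual models assign each occurrence of a word its own vector rather than reusing a fixed embedding.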

guanhongliu2000 commented 3 months ago

In “The Datome - Finding, Wrangling and Encoding Everything as Data”, I would like to ask about the application of deep learning models in social network analysis. Specifically, given the complexity and dynamic nature of social networks, what are the limitations of using graph neural networks for analyzing social structures, and how might these limitations affect the interpretation of social ties, influence, and community detection?

kceeyang commented 3 months ago

In Chapter 7, a data augmentation approach called Mixup was introduced. This technique reduces overfitting by combining random pairs of training features along with their associated labels. I was wondering: can this method be applied to any type of structured prediction problem? Might it work better on some types of learning problems than on others? In addition, is any aspect of this technique also helpful for training in an unlabeled setting?
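
For reference, Mixup itself is only a few lines: draw a mixing weight from a Beta distribution and take convex combinations of example pairs and their labels. A minimal numpy sketch (the alpha value and toy batch are arbitrary choices for illustration):

```python
import numpy as np

def mixup(x, y, alpha=0.2, rng=None):
    """Mix each example with a randomly chosen partner from the same batch."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)           # mixing weight in [0, 1]
    perm = rng.permutation(len(x))         # random pairing of examples
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]  # labels must be one-hot/continuous
    return x_mix, y_mix

# Toy batch: 4 examples with 2 features, one-hot labels for 2 classes
x = np.arange(8, dtype=float).reshape(4, 2)
y = np.eye(2)[[0, 0, 1, 1]]
x_mix, y_mix = mixup(x, y, rng=np.random.default_rng(0))
```

Because it forms convex combinations of raw inputs, Mixup fits most naturally where features live in a continuous space (images, audio, embeddings); this is one reason applying it to discrete structured data such as text requires extra care.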

XueweiLi1027 commented 3 months ago

I was still very confused about the "human-in-the-loop" concept after reading the relevant section in Chapter 7. How is annotated data different from any other regular data? What kind of annotation could help "compensate for biases and direct the ongoing human annotation process" during model building? I would like to know more about what annotators do to a dataset, at which stage of the model-building pipeline they do it, and what exactly they annotate, before I can get a grasp on the subsequent human annotator sampling process.

Pei0504 commented 3 months ago

Chapter 5, "The Datome - Finding, Wrangling and Encoding Everything as Data," talks about how deep learning sees and changes all types of data. It tells us how these models can take simple data, like words or pictures, and make it into a form that's easier to use. This helps the model make sense of big, complex sets of data.

The chapter also explores how changing the way we show data to the model can affect what it learns. It introduces an idea called transfer learning, where what a model learned from one task can help it do better on a different task. Considering the main points of Chapter 5, how do deep learning models convert sparse data into a more compact form, and why is this important for understanding complex information? And how does the way we represent data (using simple raw data or more processed features) influence what a deep learning model can learn from that data?
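
On the sparse-to-compact question: the standard mechanism is an embedding lookup, where a sparse categorical input is just an index into a table of dense vectors. A minimal sketch (the table is random here but learned end-to-end in a trained model, and all sizes and ids are hypothetical):

```python
import numpy as np

# A sparse categorical feature (e.g. a word or user id) is represented by
# an integer index; the embedding table turns it into a dense vector.
num_categories, dim = 10_000, 16
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(num_categories, dim))

ids = np.array([3, 42, 3, 9999])   # sparse inputs: just integer indices
dense = embedding_table[ids]        # compact (4, 16) dense representation
print(dense.shape)
```

The compactness matters because a 10,000-dimensional one-hot input becomes a 16-dimensional vector whose coordinates the model can shape so that similar categories end up with similar representations.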

hantaoxiao commented 3 months ago

The chapter emphasizes how neural networks and deep learning frameworks push us to view virtually all forms of information—text, images, audio, social networks, and more—as data. This perspective raises profound interdisciplinary questions about the implications for fields beyond computer science, such as ethics, sociology, and the humanities. How does the quantification and algorithmic processing of diverse forms of human expression and social interaction affect our understanding of human experience and societal structures? Are there aspects of human culture and social life that resist being fully captured or understood through the lens of data, and what might we lose in the attempt to encode everything as data?

chenhuifei01 commented 3 months ago

The document, "The Datome - Finding, Wrangling and Encoding Everything as Data", highlights how neural networks encourage us to view everything as data, encompassing a wide array of domains from text and images to audio, graphs, and beyond. This approach necessitates abstracting complex entities into data representations, which inherently involves choosing what aspects to emphasize and what to omit. Consider the balance between the utility of simplification in making data computable and the risk of losing critical nuances of the original entities. How should data scientists navigate these trade-offs to ensure responsible and meaningful analysis?

Yu-TingWeng commented 3 months ago

"The Datome - Finding, Wrangling and Encoding Everything as Data," suggests different approaches for inputting sparse and dense data into models. When evaluating these approaches, how should we consider their impact in terms of computational efficiency and model performance?

Xtzj2333 commented 3 months ago

Chapter 5: The Datome - Finding, Wrangling and Encoding Everything as Data.

This chapter mentions many uses of dimension-reduction techniques. I wonder why autoencoders are not used here. What's the difference between classical dimension-reduction techniques and autoencoders, apart from linearity versus non-linearity?
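
For context, a linear autoencoder trained with squared error recovers the same subspace as PCA, so the two families really do differ mainly in the nonlinearity of the encoder/decoder (and in how they are fit). A small numpy sketch of the PCA half, on synthetic data that is intrinsically 2-D:

```python
import numpy as np

rng = np.random.default_rng(0)
# Data that is intrinsically 2-D but embedded in 10 dimensions
Z = rng.normal(size=(200, 2))
A = rng.normal(size=(2, 10))
X = Z @ A
X -= X.mean(axis=0)

# PCA via SVD: the top-2 right singular vectors act as a linear "encoder"
U, S, Vt = np.linalg.svd(X, full_matrices=False)
codes = X @ Vt[:2].T     # encode: 10-D -> 2-D
X_hat = codes @ Vt[:2]   # decode: 2-D -> 10-D
err = np.linalg.norm(X - X_hat)
print(err)  # ≈ 0: two components reconstruct the data exactly
```

An autoencoder replaces the two matrix multiplications with nonlinear networks, which lets it capture curved low-dimensional structure that PCA cannot.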

uc-diamon commented 3 months ago

Chapter 7: When Big Data is Too Small - Sampling, Crowd-Sourcing and Bias - Will human labeling always be the bottleneck for supervised learning?

risakogit commented 3 months ago

Chapter 5: The Datome - Finding, Wrangling and Encoding Everything as Data.

The chapter talks about a dictionary that is used to, for instance, map a word to its root lemma or map each word from text to its meaning. How is this dictionary created? I believe the quality of the dictionary is vital.
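
As a toy illustration of the kind of lookup the chapter describes (the entries below are invented; real lemma dictionaries, such as those derived from WordNet, are hand-curated by lexicographers and supplemented with morphological rules, which is indeed where their quality comes from):

```python
# A lemma dictionary is just a lookup table from inflected surface forms
# to a canonical root form. This toy version is illustrative only.
lemma_dict = {
    "ran": "run", "running": "run", "runs": "run",
    "mice": "mouse", "better": "good",
}

def lemmatize(token: str) -> str:
    # Fall back to the lowercased surface form when there is no entry.
    return lemma_dict.get(token.lower(), token.lower())

print([lemmatize(t) for t in "The mice ran".split()])
# → ['the', 'mouse', 'run']
```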

Marugannwg commented 3 months ago

What I found most interesting after reading those chapters is that --- it seems like (almost) everything can serve as data for deep learning with proper care... I really want to see some extra examples of how to transform an unusual object into a matrix and feed it into various neural architectures. (e.g., what if I want to study human relationships/networks? As long as we have many sample networks, they can be represented as matrices and tackled, right?)
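
That intuition is right: once ties are recorded as an adjacency matrix, a network becomes ordinary numeric input. A minimal sketch with a hypothetical four-person friendship network:

```python
import numpy as np

# A social network on n people becomes an n x n adjacency matrix:
# entry (i, j) is 1 if i and j are tied (weighted ties work the same way).
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]   # hypothetical friendship ties
n = 4
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1                   # undirected ties are symmetric

# Row sums recover each person's degree; a graph neural network layer is
# essentially a learned transformation of A @ X for node-feature matrix X.
print(A.sum(axis=1))  # degrees: [2. 2. 3. 1.]
```

With many sample networks, each network contributes one such matrix (plus node features), and architectures like GNNs learn from the whole collection.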

HongzhangXie commented 3 months ago

I am interested in the discussion of S-Learners, T-Learners, TARNet, and Dragonnet in the fourth part of "Deep Learning for Causal Inference", titled "Three Different Approaches to Deep Causal Estimation". S-Learners estimate the treatment effect by building a single predictive model, whereas T-Learners do so by constructing two independent predictive models: one predicts the outcome for individuals who received the treatment, and the other predicts the outcome for those who did not. TARNet extends the T-learner by incorporating shared representation layers, and Dragonnet further adds a propensity score head to TARNet. As the complexity of these methods increases incrementally, I am curious whether more complex models would increase the risk of overfitting. Or, considering that in causal inference we often average the estimated conditional average treatment effects (CATE) into an overall effect, does overfitting not significantly affect the conclusions drawn from the models?
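
To make the T-learner logic concrete, here is a minimal sketch on simulated data, with ordinary least squares standing in for the neural outcome models (the data-generating process, the randomized treatment, and the true effect of 2.0 are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
T = rng.integers(0, 2, size=n)                 # randomized 0/1 treatment
tau = 2.0                                       # true (constant) effect
Y = X @ np.array([1.0, -0.5]) + tau * T + rng.normal(size=n)

def fit_ols(X, y):
    Xb = np.column_stack([np.ones(len(X)), X])  # add intercept
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return beta

def predict(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

# T-learner: fit one outcome model per treatment arm,
# then CATE(x) = mu1(x) - mu0(x), and ATE = average of the CATEs.
beta1 = fit_ols(X[T == 1], Y[T == 1])
beta0 = fit_ols(X[T == 0], Y[T == 0])
cate = predict(beta1, X) - predict(beta0, X)
print(cate.mean())   # ≈ 2.0, the true effect
```

TARNet replaces the two separate models with two heads on a shared learned representation of X, and Dragonnet adds a third head predicting treatment assignment; overfitting in the individual outcome models can still bias the averaged estimate when treatment is not randomized, which is part of why Dragonnet's targeted regularization exists.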

erikaz1 commented 3 months ago

What kinds of sampling methods are used, if any, if a model can be fitted with all existing knowledge from a particular source (e.g. wiki pages)?

anzhichen1999 commented 3 months ago

How does the empirical performance of Dragonnet, in terms of bias and variance in estimating ATE, compare to traditional machine learning approaches to causal inference, and what implications does this have for the choice of algorithms in practical applications?

MarkValadez commented 3 months ago

I have questions about the graph representation section and structural analysis in Chapter 5: The Datome - Finding, Wrangling, and Encoding Everything as Data. 1) What are the trade-offs between sparse and dense graph representations regarding memory usage, computational efficiency, and effectiveness in capturing complex network structures? 2) In what ways do higher-order network motifs, such as tetrads and cliques, contribute to our understanding of network dynamics and function? How can we efficiently identify and analyze these motifs in large-scale networks?
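
On question 2, one classical trick: powers of the adjacency matrix count walks, so small motifs such as triangles can be read off algebraically. A sketch on a hypothetical four-node graph, which also shows the sparse-versus-dense memory trade-off in miniature:

```python
import numpy as np

# Dense adjacency matrix of a small undirected graph with one triangle
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# (A^3)[i, i] counts closed walks of length 3 through node i, and each
# triangle contributes 6 such walks, so trace(A^3) / 6 is the triangle count.
triangles = np.trace(np.linalg.matrix_power(A, 3)) / 6
print(triangles)  # 1.0

# Memory trade-off in miniature: a dense n x n matrix always costs O(n^2),
# while an edge list costs O(m) -- here 4 edges versus 16 cells.
edge_list = [(i, j) for i in range(4) for j in range(i + 1, 4) if A[i, j]]
print(len(edge_list))  # 4
```

For larger motifs such as tetrads and cliques, exact counting gets expensive quickly, which is why large-scale motif analysis typically relies on sampling or sparse-matrix methods.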

kangyic commented 3 months ago

I have questions regarding sampling.

  1. In oversampling, we simply repeat the cases in the minority group. When there are problems in the data (say we missed many cases that should have been classified into that group), wouldn't oversampling amplify those problems too? I assume this is quite likely, since the groups that actually need oversampling are exactly the ones with few observations.
  2. In data augmentation, we use rescaled and transformed samples. If this is done for linear regression, the rescaled data carry the same information as the untransformed data, so why would it be any different from simply repeating them?

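
For concreteness, random oversampling is just sampling minority indices with replacement, so any mislabeled or missing-pattern minority case is duplicated along with the genuine ones, which supports the concern in point 1. A minimal sketch with invented data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Imbalanced toy dataset: 95 majority cases, 5 minority cases
y = np.array([0] * 95 + [1] * 5)
X = rng.normal(size=(100, 3))

# Random oversampling: resample minority indices with replacement until
# the classes are balanced. Any noise in those 5 cases is duplicated too.
minority = np.where(y == 1)[0]
extra = rng.choice(minority, size=95 - 5, replace=True)
X_bal = np.vstack([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])
print(np.bincount(y_bal))  # [95 95]
```

On point 2: for a purely linear model a deterministic rescaling adds no information, but augmentations used with neural networks (crops, rotations, noise injection) produce inputs the model has genuinely never seen, which is what makes them different from repetition.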
HamsterradYC commented 3 months ago

In the pursuit of robustness and generalization, to what extent should models be exposed to noisy and augmented data, and could there be a point where models become too generalized, losing their ability to make precise predictions in specific contexts?

La5zY commented 1 month ago

Given the discussion in Chapter 7 about various sampling strategies to address imbalances in datasets, how can these sampling techniques specifically influence the ethical outcomes of deep learning models, particularly in scenarios involving sensitive or biased data?

beilrz commented 1 month ago

After reading this chapter, my question is: if we reduce everything to data, how can we be sure to preserve the meaningful relationships in the data, assuming the conversion process is lossy? For example, if we convert audio data to text, we may lose information about pitch or tone, which could be important for social science questions. I feel this is an area that requires a significant amount of pre-existing knowledge.

00ikaros commented 1 month ago

What are the main advantages of neural networks in transforming sparse data into dense representations, and how does this capability enhance the ability to reconstruct, generalize, and learn intrinsic structures from data? Additionally, how do dimension-reduction techniques assist when computational constraints or data scarcity are issues, and what are some practical applications across various domains?

Carolineyx commented 1 month ago

Chapter 5: What are the comparative advantages and disadvantages of using raw low-level data versus processed high-level data in deep learning models? How do different data representation techniques impact the model's ability to learn and generalize, especially when dealing with complex, high-dimensional data?

Chapter 7: Could you elaborate on the effectiveness of different resampling techniques (e.g., undersampling, oversampling, and negative sampling) in addressing class imbalance? What are the key factors to consider when choosing a resampling strategy to ensure robust model performance and minimize potential biases?

Deep Learning for Causal Inference: How do deep learning models handle the inherent complexity and potential confounding factors in causal inference tasks compared to traditional statistical methods? What advancements in model architecture or training techniques are necessary to improve the reliability and interpretability of causal inferences made by deep learning models?

icarlous commented 1 month ago

The chapter highlights that neural networks and deep learning frameworks encourage viewing all information—text, images, audio, social networks—as data. How does this data-centric approach affect our understanding of human experience and societal structures, and what might we lose in trying to encode everything as data?

Brian-W00 commented 1 month ago

When applying deep learning to predict future events or trends, how is the model's adaptability and sensitivity to future uncertainty handled and evaluated?