UChicago-CCA-2021 / Readings-Responses


Exploring Semantic Spaces - Fundamentals #32

Open HyunkuKwon opened 3 years ago

HyunkuKwon commented 3 years ago

Post questions here for one or more of our fundamentals readings:

Jurafsky, Daniel and James H. Martin. 2015. Speech and Language Processing. Chapters 15-16 (“Vector Semantics”, “Semantics with Dense Vectors”)

Raychanan commented 3 years ago

My question is non-technical. According to the text, human languages have a wide variety of features that are used to convey meaning. And, as the paper included this week shows, Caliskan and colleagues found that semantics derived automatically from language corpora contain human-like biases. So I'm curious whether we have techniques to eliminate social biases when constructing our logical representations of sentence meaning. I'm afraid the social biases embedded in human expression could aggravate social imbalances as automatic techniques become widely used in the near future.

jacyanthis commented 3 years ago

Is there an established way to run word2vec or another embedding algorithm with attention weights, i.e., where the context words are given unequal weights depending on their relevance? I see Sonkar et al. (2020) attempt this, but I don't know how viable their approach is, whether it's coded up so we can use it (e.g., in Gensim), or whether it really makes a difference in the semantic space we build.
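
For concreteness, the unweighted baseline I have in mind is Gensim's standard skip-gram, where every word inside the window counts equally and no attention-style reweighting is exposed. A minimal sketch, with `tokenized_docs` as a hypothetical list of token lists:

```python
# Standard skip-gram in Gensim: every context word inside the window
# contributes equally; any attention-style weighting would need custom code.
from gensim.models import Word2Vec

tokenized_docs = [
    ["semantic", "spaces", "capture", "word", "meaning"],
    ["attention", "weights", "treat", "context", "words", "unequally"],
]

model = Word2Vec(
    sentences=tokenized_docs,
    vector_size=100,  # dimensionality of the embeddings
    window=5,         # symmetric context window
    sg=1,             # 1 = skip-gram, 0 = CBOW
    min_count=1,
    epochs=20,
)

print(model.wv.most_similar("word", topn=3))
```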

xxicheng commented 3 years ago

Can multilayer perceptron networks be used in unsupervised learning?
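
For instance, would an autoencoder count, i.e., an MLP trained to reconstruct its own input so that no labels are needed? A rough sketch with scikit-learn on hypothetical random data, purely for illustration:

```python
# An MLP used without labels: train it to reproduce its input so the
# narrow hidden layer learns a compressed (unsupervised) representation.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))   # hypothetical 20-dimensional data

autoencoder = MLPRegressor(
    hidden_layer_sizes=(5,),     # 5-dimensional bottleneck
    activation="tanh",
    max_iter=2000,
)
autoencoder.fit(X, X)            # target = input: no labels involved
print("reconstruction R^2:", autoencoder.score(X, X))
```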

romanticmonkey commented 3 years ago

I'd like to ask your opinion on semantic parsing here. Many think that semantic parsing is a bulky approach, much less efficient than neural networks, but it is indeed intuitive for human reading. Do you think it's an approach worth investigating?

jinfei1125 commented 3 years ago

How can we choose the best loss function? If I understand your first lecture this week correctly, you mentioned that some are good at understanding the texts, while others are good at prediction. Is there any rule of thumb for choosing loss functions?

k-partha commented 3 years ago

Has any CSS research used dynamic contextual embeddings (from an attention-based transformer model) recently? Given the superior performance of BERT and transformer-based models on 'meaning' related benchmarks, I'm wondering what exciting new CSS research paradigms these dynamic embeddings can power.
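
To make sure I understand what "dynamic" means here: with a transformer, the same surface form gets a different vector in each sentence, along the lines of this sketch with the Hugging Face `transformers` library (the model name and sentences are just placeholders):

```python
# Extract contextual embeddings for the same word in two different sentences.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentences = ["She sat by the river bank.", "He deposited cash at the bank."]
with torch.no_grad():
    for s in sentences:
        inputs = tokenizer(s, return_tensors="pt")
        hidden = model(**inputs).last_hidden_state[0]  # (num_tokens, 768)
        tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
        bank_vector = hidden[tokens.index("bank")]     # context-dependent vector
        print(s, bank_vector[:3])
```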

sabinahartnett commented 3 years ago

I understand the foundations and the sentence-level functionality of semantic parsing as explained in this piece, but I was hoping to hear a bit more about the types of texts in which it functions best and is most often implemented. Could you elaborate on some examples where semantic parsing is most often used and most useful (i.e., what kinds of differences might we see when texts are more or less correlated, or more linguistically consistent)?

ming-cui commented 3 years ago

As the quote from Zhuangzi is originally in Chinese, I am wondering whether the embedding models are applicable to Chinese, given that the concepts introduced in Chapter 6 do not work straightforwardly for Chinese. Chinese characters can express meanings either alone or in combination, and there are usually no spaces within a sentence.
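
My understanding is that the models themselves would still apply once the text is segmented into tokens, e.g., with a segmenter such as `jieba` before training word2vec. A rough sketch on a purely illustrative toy corpus:

```python
# Segment Chinese text into tokens first (no spaces in the raw sentences),
# then train word2vec on the token lists as usual.
import jieba
from gensim.models import Word2Vec

chinese_texts = ["北冥有鱼，其名为鲲。", "鲲之大，不知其几千里也。"]  # toy corpus

tokenized = [jieba.lcut(sentence) for sentence in chinese_texts]
model = Word2Vec(tokenized, vector_size=100, window=5, min_count=1, sg=1)
```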

MOTOKU666 commented 3 years ago

I'm wondering whether tense logic would be easier to deal with in other languages. The chapter notes, for instance, that "temporal expressions in English are frequently expressed in spatial terms, as is illustrated by the various uses of at, in, somewhere, and near in these examples".

Rui-echo-Pan commented 3 years ago

The problem of bias in embeddings appears again; it is an important topic in one of the exemplary readings. What practical problems does it create in analysis, and what techniques can reduce the bias when using embeddings?
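
For example, is projection-based debiasing (in the spirit of Bolukbasi et al. 2016) the kind of technique meant here? A rough numpy sketch, assuming `wv` is a trained Gensim KeyedVectors object:

```python
# Estimate a bias direction from a definitional word pair and project it
# out of another word's vector.
import numpy as np

def remove_component(vector, direction):
    """Remove the component of `vector` that lies along `direction`."""
    d = direction / np.linalg.norm(direction)
    return vector - np.dot(vector, d) * d

bias_direction = wv["she"] - wv["he"]   # definitional pair
debiased_engineer = remove_component(wv["engineer"], bias_direction)
```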

zshibing1 commented 3 years ago

In “Vector Semantics”, why can we assume that "battle", "good", "fool", and "wit" are orthogonal to each other and assign them four distinct dimensions, as in Figure 6.2?
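
My current reading is that the orthogonality is a modeling choice rather than a semantic claim: each word is simply given its own axis by construction, and similarity is then computed from counts along those axes. A small sketch with made-up counts:

```python
# Two documents represented as count vectors over the four word dimensions;
# the counts below are hypothetical, purely for illustration.
import numpy as np

dims = ["battle", "good", "fool", "wit"]   # one axis per word
doc_a = np.array([7, 62, 1, 2])            # hypothetical counts in document A
doc_b = np.array([1, 80, 58, 15])          # hypothetical counts in document B

cosine = doc_a @ doc_b / (np.linalg.norm(doc_a) * np.linalg.norm(doc_b))
print(f"cosine(doc_a, doc_b) = {cosine:.3f}")
```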

Bin-ary-Li commented 3 years ago

Is there any literature that examines the qualitative difference between embeddings generated by, say, LSA with 300 SVD components and the representation layer learned by a feedforward neural network?
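
For the LSA side of that comparison, I have in mind something like tf-idf followed by truncated SVD; a sketch with scikit-learn, where `corpus` is a hypothetical list of raw documents with a vocabulary comfortably larger than 300 terms:

```python
# LSA: reduce a sparse tf-idf document-term matrix to 300 dense components.
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

tfidf = TfidfVectorizer().fit_transform(corpus)   # sparse document-term matrix
svd = TruncatedSVD(n_components=300)              # the "300 SVD components"
lsa_docs = svd.fit_transform(tfidf)               # dense LSA document vectors
```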

hesongrun commented 3 years ago

The word2vec technique is awesome! The co-occurrence information brings in much richer second-order information about the texts. I have a question about the assessment of word embeddings: what are some common metrics for evaluating the effectiveness of the learned embeddings?
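
For instance, are the intrinsic checks bundled with Gensim, i.e., correlation with human word-similarity judgments and accuracy on analogy questions, the standard ones? A sketch, assuming `model` is a trained Word2Vec model:

```python
# Two common intrinsic evaluations shipped with Gensim's test data.
from gensim.test.utils import datapath

# Correlation with human similarity ratings (WordSim-353).
pearson, spearman, oov_ratio = model.wv.evaluate_word_pairs(
    datapath("wordsim353.tsv")
)

# Accuracy on "a : b :: c : ?" analogy questions.
analogy_score, sections = model.wv.evaluate_word_analogies(
    datapath("questions-words.txt")
)
```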

william-wei-zhu commented 3 years ago

Extending from @ming-cui's question, I wonder whether the usefulness of embedding models is stable across other forms of language.

theoevans1 commented 3 years ago

Chapter 6 discusses attempts at debiasing but notes that bias cannot be eliminated. Are there particular forms of bias that are especially difficult to reduce? What kinds of consequences can this have in applications of these methods?

RobertoBarrosoLuque commented 3 years ago

If possible, could we go more in depth into how word embeddings are trained? Say we want to compare how two actors use the same term: should we build a corpus for each, train a word embedding model on each, and then find the closest words to our term of interest (a sketch of what I mean follows below)?

Also, using word embeddings is more memory-efficient than using tf-idf or count vectors since, unlike the other two, embeddings are dense; yet they seem more computationally expensive, since they require learning complex weights through stochastic gradient descent. How should we weigh these trade-offs when deciding which word representations to use?
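
To make the first question concrete, the design I have in mind looks roughly like the sketch below, with `actor_a_docs` and `actor_b_docs` as hypothetical lists of tokenized documents. (I realize that comparing vectors directly across two independently trained models would also require aligning the spaces, e.g., with Procrustes, but nearest-neighbour lists can be compared as they are.)

```python
# Train one embedding model per actor's corpus, then compare the nearest
# neighbours of the same query term in each semantic space.
from gensim.models import Word2Vec

model_a = Word2Vec(actor_a_docs, vector_size=100, window=5, min_count=5, sg=1)
model_b = Word2Vec(actor_b_docs, vector_size=100, window=5, min_count=5, sg=1)

query = "freedom"   # hypothetical term of interest
print("Actor A:", model_a.wv.most_similar(query, topn=10))
print("Actor B:", model_b.wv.most_similar(query, topn=10))
```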

mingtao-gao commented 3 years ago

For this week's reading, I noticed that it is necessary to generate training data for transition-based dependency parsing. I wonder what the "appropriate" size of the training set is for obtaining a reliable model. Would this algorithm still be robust if we were not able to provide enough data?

egemenpamukcu commented 3 years ago

I would like to hear more about unsupervised vs. supervised neural nets and their different uses in text analysis. Also, is there a reason, say for a classification problem, why a researcher would choose logistic regression instead of a neural network, besides interpretability? And it would be great if you could talk more about the word2vec representation of words and how its results are processed by neural networks.
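
On the last point, the pipeline I imagine is averaging each document's word2vec vectors into a fixed-length feature vector and feeding that to either classifier. A rough sketch, where `model` is a trained Gensim Word2Vec model and `docs` and `labels` are hypothetical tokenized documents and class labels:

```python
# Average word vectors per document, then fit both a logistic regression
# and a small neural network on the same features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

def doc_vector(tokens, wv):
    """Average the embeddings of in-vocabulary tokens (zeros if none)."""
    vecs = [wv[t] for t in tokens if t in wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(wv.vector_size)

X = np.vstack([doc_vector(tokens, model.wv) for tokens in docs])
y = np.array(labels)

logit = LogisticRegression(max_iter=1000).fit(X, y)  # interpretable coefficients
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000).fit(X, y)
```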

lilygrier commented 3 years ago

Given the extra computational effort required, in what contexts do word embeddings provide insight above and beyond more straightforward analyses of co-occurrences, such as looking at k-grams? Are there cases where it actually makes more sense not to use a mathematically intensive operation such as word2vec?
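
By "more straightforward analyses" I mean something as simple as a windowed co-occurrence count, which needs no gradient descent at all. A sketch, with `tokenized_docs` as a hypothetical list of token lists (the counts could then be reweighted with PPMI):

```python
# Count how often each pair of words co-occurs within a fixed window.
from collections import Counter

WINDOW = 5
cooccurrence = Counter()
for tokens in tokenized_docs:
    for i, word in enumerate(tokens):
        for context in tokens[i + 1 : i + 1 + WINDOW]:
            cooccurrence[(word, context)] += 1
            cooccurrence[(context, word)] += 1

print(cooccurrence.most_common(10))
```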