Computational-Content-Analysis-2020 / Readings-Responses

Repository for organising "exemplary" readings and posting responses.

Counting Words & Phrases - Jurafsky & Martin 2017 #19

jamesallenevans opened 4 years ago

jamesallenevans commented 4 years ago

Post questions here for:

Jurafsky, Daniel & James H. Martin. 2017 (3rd Edition). Speech and Language Processing. Singapore: Pearson Education, Inc.: Chapter 18 (“Information Extraction”): 739-778.

lkcao commented 4 years ago

On page 14, the authors introduce a method to avoid semantic drift: assigning confidence values to the tuples. I am a little confused, because this method rests on a numerical basis, and I do not see how it can rule out drifting cases. For example, if we mistakenly match <Sydney, Circular Quay> in the documents and this mistake occurs many times, should we measure the number of "hits" by human counting? And if we decide the confidence value is too low, should we give up the pattern p altogether, or just give up the mistaken tuples?
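For reference, here is a minimal sketch of how I read the pattern/tuple confidence interaction (my simplification, not the chapter's exact formulas; the chapter's RlogF metric additionally scales the ratio by log of the match count). The noisy-or combination means a mistaken tuple proposed only by unreliable patterns stays low-confidence:

```python
def pattern_confidence(hits, finds):
    # hits: matches that land in the trusted tuple set; finds: all matches.
    # Kept as a plain ratio so it stays in [0, 1] for the noisy-or below.
    return hits / finds

def tuple_confidence(pattern_confs):
    # Noisy-or: trust a tuple unless *every* pattern proposing it is wrong.
    prob_all_wrong = 1.0
    for conf in pattern_confs:
        prob_all_wrong *= 1.0 - conf
    return 1.0 - prob_all_wrong

print(tuple_confidence([0.6, 0.5]))  # 0.8
```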

heathercchen commented 4 years ago

When extracting relations between entities, the methods described in this chapter focus on finding verbs that link the two entities of interest, and they neglect the adjectives or adverbs that modify what kind of relationship it is. My question is: will ignoring adjectives and adverbs here obscure the direction of the relationship, or the deeper emotional linkages between entities?

sanittawan commented 4 years ago

On pages 16-17, section 18.2.5, the chapter discusses an unsupervised method for extracting relations from texts and gives the ReVerb system as an example. The rule is that a relation is accepted only if it meets both syntactic and lexical constraints, and the lexical constraints are based on a dictionary. What exactly does this dictionary contain, and how do we construct it?
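My understanding, from the ReVerb work the chapter draws on, is that the dictionary is built automatically from a large corpus: a relation phrase is kept only if it occurs with many *distinct* argument pairs, which filters out overly specific phrases. A rough sketch, with a hypothetical min_pairs threshold:

```python
from collections import defaultdict

def build_relation_dictionary(triples, min_pairs=20):
    # triples: candidate (arg1, relation_phrase, arg2) extractions.
    # A phrase enters the dictionary only if it appears with at least
    # min_pairs distinct argument pairs across the corpus.
    args_seen = defaultdict(set)
    for arg1, rel, arg2 in triples:
        args_seen[rel].add((arg1, arg2))
    return {rel for rel, pairs in args_seen.items() if len(pairs) >= min_pairs}
```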

di-Tong commented 4 years ago

This chapter is a very useful toolkit! I wonder whether all relation extraction processes distinguish between the two entities involved. While the Resource Description Framework uses a subject-predicate-object expression, it seems that in many circumstances the direction of the relationship between the two entities is not considered or included in the relation extraction process (maybe I'm wrong about this). Since extracted relations are usually used for knowledge-graph-related tasks, the direction of the relationship seems to be important information that needs to be accurately extracted.

ckoerner648 commented 4 years ago

Jurafsky and Martin 2017 describe how we can train computers to recognize named entities, relations, events, and temporal expressions. The authors give the example of airlines raising their fares: they describe how to extract the different airline names, and the amounts by which they raise their fares, from a corpus of newspaper articles. I'm confident that this method has good results overall. My question concerns the semantic details: I'm curious whether it is possible to categorize the following information correctly: "United Airlines did the opposite of its competitor, who raised the fares by $6."

wunicoleshuhui commented 4 years ago

I'm fascinated by the section on temporal information extraction. Having experienced some of the difficulty of extracting time data and converting it into time-series data that can be used in regression, I'm wondering: are there machine learning approaches that can take in all potential date formats and return a complete set of time-series data?

Lizfeng commented 4 years ago

I think this chapter is a great review of the ML methods we learned in the week 3 homework. To extract relations from texts, we can use supervised, semi-supervised, and unsupervised learning. One concern I have is about the bootstrapping method used in semi-supervised learning. The average bootstrap sample omits 36.8% of the data; in a textual database, this could lead us to omit important relationships.
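For context, the 36.8% figure comes from the statistical bootstrap, where each resample draws n cases with replacement, so any given case is missed with probability

```latex
P(\text{case omitted}) = \left(1 - \frac{1}{n}\right)^{n} \;\longrightarrow\; e^{-1} \approx 0.368 \quad (n \to \infty)
```

Note that the chapter's "bootstrapping" for relation extraction is seed-pattern expansion rather than resampling, so this omission rate may not carry over directly.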

katykoenig commented 4 years ago

The chapter states that NER uses sequence classifiers to determine entities in a text, as many entities span multiple words. I am curious about the mechanics of sequence classifiers: aside from gazetteers, name lists, or the like, do sequence classifiers analyze every n-gram for every n = 1, 2, 3, ... and judge collocations?
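As far as I understand it, the classifier does not enumerate n-grams: it assigns one label per token under the chapter's IOB scheme, and multi-word entities fall out of the label sequence. A minimal decoding sketch (the tokens and labels here are made up for illustration):

```python
tokens = ["United", "Airlines", "raised", "fares", "Tuesday"]
labels = ["B-ORG",  "I-ORG",    "O",      "O",     "B-DATE"]

def decode_entities(tokens, labels):
    # Stitch B-/I- runs back into (type, text) entity spans.
    entities, current = [], None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            current = [lab[2:], [tok]]
            entities.append(current)
        elif lab.startswith("I-") and current and current[0] == lab[2:]:
            current[1].append(tok)
        else:
            current = None
    return [(etype, " ".join(words)) for etype, words in entities]

# decode_entities(tokens, labels) -> [('ORG', 'United Airlines'), ('DATE', 'Tuesday')]
```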

deblnia commented 4 years ago

As @ckoerner648 notes, the methods outlined in this reading seem fool-able. What do we do with the confidence values in cases like this? Do we check with human counting? Or, as @clk16 suggests, do we throw the pattern out altogether?

HaoxuanXu commented 4 years ago

It's interesting to consider how transferable NER is when extracting relations from documents that use different writing styles. I can imagine it working well if the document follows a structured format like WSJ articles.

alakira commented 4 years ago

This text is thorough and very useful for our research! But I wonder about the drawbacks of this step-by-step approach (name -> relation -> event). If we misidentify a word in the first step, then the following inferences on the sentence will be wrong. Is there any method to check the overall certainty of a sentence, taking into account the accuracy of each word?

cindychu commented 4 years ago

In this chapter, many advanced techniques are introduced for 'formatted' information extraction of higher-level knowledge: names, identities, and relations. For example, for named entity recognition (NER), feature-based machine learning and neural network algorithms are introduced; a rule-based method is also mentioned, which is more agile but requires prior knowledge. However, these methods are very complex to train from scratch.

Therefore, I am wondering: in real-world applications, what are the most feasible ways to identify names and identities in our texts, and how can we easily transfer previous models to our specific domain and dataset?

bjcliang-uchi commented 4 years ago

While the article starts by introducing the 5-6 basic emotions, in practice the authors seem to focus more on positive and negative implications. I am wondering how NLP might integrate the fancier, and perhaps more fundamental, Plutchik wheel of emotions into related research.

luisesanmartin commented 4 years ago

The chapter provides a clear summary of feature extraction from texts that goes beyond regular-expression-based approaches. I'm wondering whether this approach could be extended to extracting information from texts other than news articles, like contracts, judgments, academic articles, or others.

sunying2018 commented 4 years ago

I have a question about the evaluation of named entity recognition. As mentioned in this chapter, we can use familiar metrics such as recall, precision, and the F1 score to measure the performance of NER. However, all of these metrics require both true labels and predicted labels. How should we get the true labels if the number of entities of a given type is large and hand coding is very difficult?
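One common setup, sketched here under the assumption that entities are represented as (type, start, end) spans: score at the entity level, counting only exact matches (both boundaries and type) as true positives.

```python
def ner_scores(gold, predicted):
    # gold, predicted: sets of (entity_type, start, end) spans.
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1
```

This still presupposes a gold-labeled set, of course; in practice the gold set is usually a small hand-annotated sample rather than the whole corpus.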

luxin-tian commented 4 years ago

While introducing methods for temporal normalization in tasks that extract time from texts, the author notes that "most current approaches to temporal normalization are rule-based". This chapter gives examples of potential sources of complexity in such tasks. I wonder whether it is possible to use a neural algorithm for temporal normalization. Since rule-based methods already involve using the distance from the anchoring date to the nearest unit, could a neural algorithm be built on similar measures?
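On the rule-based side, a toy sketch of what anchoring looks like; the weekday-resolution rule and the handling of "next" here are my guesses at one reasonable convention, not the chapter's exact rules:

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def normalize_weekday(expression, anchor):
    # Resolve expressions like "next Tuesday" against an anchor date,
    # typically the document's dateline.
    target = WEEKDAYS.index(expression.split()[-1].lower())
    offset = (target - anchor.weekday()) % 7
    if expression.lower().startswith("next") and offset == 0:
        offset = 7
    return anchor + timedelta(days=offset)

# normalize_weekday("next Tuesday", date(2020, 1, 16))  # a Thursday -> 2020-01-21
```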

rachel-ker commented 4 years ago

I found this reading really informative about how to begin extracting useful information to answer specific questions. I was wondering: if we are trying to capture both the relation and the content/description of the entities in a structured manner, would we continue to build classifiers on top of each other, as suggested in the supervised learning methods? Or would this be done independently in practice? How else could we do more complex extractions?

yirouf commented 4 years ago

This chapter of Speech and Language Processing discusses extraction methods that rely on the importance of verbs in languages like English. Verbs connect the subject and the object (two things) and express the relationship between them, but I wonder whether it is appropriate to generalize this to other languages, such as Japanese, whose grammatical structure is quite different from that of English.

YanjieZhou commented 4 years ago

This article is really useful to me, with its detailed elaboration of how to extract information, like emotions, from texts. Positive and negative emotions are convenient and suffice for many analyses, but from a psychological perspective, I am wondering whether it is practical to use algorithms like MLPs to deal with more complex emotions.

laurenjli commented 4 years ago

I'm very interested in the distant supervision method, as it combines a number of methods to create a pretty decent training set. However, the authors mention that it does require a large amount of existing data to begin with. How often is this method used in practice, and what other conditions are necessary for it to be successful?
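For what it's worth, the core labeling step is simple enough to sketch. Assuming naive string matching for entity mentions (real systems would use NER plus entity linking), any sentence containing both entities of a known pair becomes a noisy positive example:

```python
def distant_label(sentences, kb_pairs):
    # kb_pairs: (entity1, entity2, relation) triples from an existing
    # knowledge base such as DBpedia or Freebase.
    examples = []
    for sent in sentences:
        for e1, e2, relation in kb_pairs:
            if e1 in sent and e2 in sent:
                examples.append((sent, e1, e2, relation))
    return examples
```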

rkcatipon commented 4 years ago

This chapter was quite helpful! I used to work for a social media data analytics company, and we offered named entity recognition as a feature, but I never fully understood what was going on beyond base intuition. In practice, we ran into a few problems with the analytic: 1) The NER corpus had to be updated regularly. For example, "President of the United States" shifted meaning from Obama to Trump. Perhaps there are now better ways of continually updating extraction and recognition? 2) As @yirouf alluded to, it was a challenge using NER on other languages. To be honest, we would just use Google Translate to translate everything into English and then run NER. Looking back on it now, I'm not sure that was the best approach. Perhaps others can share some thoughts on better methodologies for handling non-English language processing?

ziwnchen commented 4 years ago

My question is: when extracting relations between entities in practice, especially in a very large dataset, how can we evaluate the precision of the results? It is usually rare to have labeled relations in most corpora, yet relation extraction itself is a very basic, frequently used method.
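As I read it, the chapter's suggestion for exactly this situation is to estimate precision from a human-judged random sample of the system's output (true recall cannot be computed without full labels). A sketch, with human_judge standing in for a hypothetical manual labeling step:

```python
import random

def estimate_precision(extracted, human_judge, sample_size=200, seed=0):
    # Annotators judge only a random sample of the extracted tuples;
    # the sample's precision estimates overall precision.
    rng = random.Random(seed)
    sample = rng.sample(extracted, sample_size)
    correct = sum(1 for tup in sample if human_judge(tup))
    return correct / sample_size
```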

vahuja92 commented 4 years ago

I found the methods described in this text very useful! My question is: how computationally feasible are these techniques compared to each other, and how should those limitations inform the technique we pick?

VivianQian19 commented 4 years ago

I find this chapter on information extraction very useful. The author mentions two common problems in named entity recognition: segmentation ambiguity and type ambiguity. The first refers to the difficulty of finding the boundary between the target named entity and the rest of the text; the second refers to the difficulty of deciding which type a mention denotes, since the same name can refer to different kinds of entities in different contexts. I wonder how these commonly occurring problems can be resolved.

chun-hu commented 4 years ago

This chapter is very informative! I imagine that the extraction of named entities and temporal expressions can be very useful in sentiment analysis, but how should we approach that in a practical way?

cytwill commented 4 years ago

This chapter provides many useful methods for relation extraction from text data, and I would like to employ them in my research later. The basis for relation extraction is NER, and as the authors point out, it is sometimes difficult to judge what label should be attached to a word or a sequence of words. Most of the methods in the chapter use pre-defined lists or rules, and the same is true for detecting relations between entities. So my question is whether there are any explicit approaches we can apply in advance to reduce the demand for annotated data or to enhance the accuracy of the learning results, perhaps something like iterating the optimal feature space for the next step of learning?

kdaej commented 4 years ago

This chapter introduces how to extract time and temporal-order information from texts. One of the features used is the tense of verbs. Although temporal ordering uses the past and present tenses to extract the order of events, I was wondering whether the machine can also make inferences about the relative order of events. One sentence may include more than one event and indicate their order by using tense relatively. For instance, the past perfect can describe an event that occurred even before another past event. In this case, can the machine learn to tell which event is past and which is the even earlier one?