Generate data-set containing parser errors: write oracle function at test time

junekihong commented 11 years ago

At every step at test time, output whether or not the parser has made an error, as well as the features that fired.

karuiwu commented 11 years ago

Some small steps I suggested: In the MaltParser source code, first look at when MaltParser parses each of the sentences at test time (1). Then, search for the line(s) of code at which the parser chooses an action for a specific sentence at test time (2).

You will want to write a function that takes in the test sentence before area (1), and perhaps populates a data structure with the oracle action(s)/sequences at each step. Then at area (2), you will want to check how the parser's actions compare with those of the oracle.
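A minimal sketch of the comparison at area (2), written in Python rather than against MaltParser's Java classes. The names `StepRecord` and `compare_with_oracle` and the action strings are hypothetical, and the oracle is assumed to supply a set of acceptable actions for the configuration the parser is actually in at each step.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Sequence, Tuple


@dataclass
class StepRecord:
    """One parsing step: what the parser did and what the oracle allowed."""
    step: int
    parser_action: str
    oracle_actions: FrozenSet[str]
    features: List[str]

    @property
    def is_error(self) -> bool:
        # An error is any action outside the oracle's set of acceptable actions.
        return self.parser_action not in self.oracle_actions


def compare_with_oracle(
    parser_steps: Sequence[Tuple[str, Iterable[str]]],
    oracle_actions_per_step: Sequence[Iterable[str]],
) -> List[StepRecord]:
    """Pair each (action, fired features) with the oracle's action set."""
    if len(parser_steps) != len(oracle_actions_per_step):
        raise ValueError("parser and oracle must cover the same number of steps")
    records = []
    for i, ((action, feats), allowed) in enumerate(
        zip(parser_steps, oracle_actions_per_step)
    ):
        records.append(StepRecord(i, action, frozenset(allowed), list(feats)))
    return records
```

Each record carries both the error flag and the features that fired, which is the per-step output the issue asks for. Using a set for the oracle side leaves room for steps where more than one action is acceptable.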

jeisner commented 11 years ago

Note that you will need a batch of sentences with known gold parses in order to generate this dataset. There will actually be several such datasets: for each k=1,2,3 (at least), you want a separate dataset for the task of determining whether there's been an error in the last k steps. Once you've generated each dataset, divide it into training, dev, and test.
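One way to read the "last k steps" labelling, as a hedged sketch: the label at step i is true if any of steps i-k+1 through i (inclusive of the current step) was an error. That window convention, the function names, and the 80/10/10 split are assumptions for illustration, not something fixed in this thread.

```python
from typing import Dict, List, Sequence, Tuple, TypeVar

S = TypeVar("S")


def label_last_k(step_errors: Sequence[bool], k: int) -> List[bool]:
    """Label step i True if any of the k steps ending at i was an error."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return [
        any(step_errors[max(0, i - k + 1): i + 1])
        for i in range(len(step_errors))
    ]


def build_datasets(
    step_errors: Sequence[bool], ks: Sequence[int] = (1, 2, 3)
) -> Dict[int, List[bool]]:
    """One label sequence per k, all derived from the same per-step errors."""
    return {k: label_last_k(step_errors, k) for k in ks}


def split_sentences(
    sentences: Sequence[S], train_frac: float = 0.8, dev_frac: float = 0.1
) -> Tuple[List[S], List[S], List[S]]:
    """Split whole sentences (not individual steps) into train, dev, test."""
    n_train = int(len(sentences) * train_frac)
    n_dev = int(len(sentences) * dev_frac)
    items = list(sentences)
    return (
        items[:n_train],
        items[n_train:n_train + n_dev],
        items[n_train + n_dev:],
    )
```

Splitting by sentence rather than by step keeps all steps of one sentence in the same partition, so dev and test never contain steps from a sentence seen in training.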

It's not wholly clear whether we should collect errors that the parser makes when parsing its own training data, or when parsing unseen data: The former case corresponds more closely to what we currently envision for training a revision-based parser, since our idea is to use SEARN to train all of the actions at once, including error detection and revisions. So maybe it's okay.
However, the latter case seems to be more powerful because it could learn to detect and correct generalization errors (where we make a mistake on unseen data due to a novel word, for example) and even out-of-domain errors. The question is how we would modify SEARN to train a revision-based parser in that case.

I think it's probably best to do the latter case. That means that when you are collecting parser errors on a batch of sentences with known gold parses (e.g., 10% of your gold dataset), you should be running a version of the parser that was trained only on the other sentences (e.g., the other 90% of the gold dataset). By jackknifing across 10 batches, you can collect error signals ("error" or "no error") for all positions in all the sentences of your gold corpus.
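The jackknifing scheme can be sketched as follows. `train_parser` and `find_errors` are hypothetical callables standing in for training MaltParser and for the per-step oracle comparison; assigning sentence i to fold i mod 10 is just one simple choice.

```python
from typing import Any, Callable, Iterator, List, Sequence, Tuple


def jackknife_folds(
    n_sentences: int, n_folds: int = 10
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train indices, held-out indices); sentence i is held out in fold i % n_folds."""
    for fold in range(n_folds):
        held_out = [i for i in range(n_sentences) if i % n_folds == fold]
        train = [i for i in range(n_sentences) if i % n_folds != fold]
        yield train, held_out


def collect_error_signals(
    sentences: Sequence[Any],
    train_parser: Callable[[List[Any]], Any],
    find_errors: Callable[[Any, Any], Any],
    n_folds: int = 10,
) -> List[Any]:
    """Get error signals for every sentence from a parser that never saw it."""
    signals: List[Any] = [None] * len(sentences)
    for train_idx, held_idx in jackknife_folds(len(sentences), n_folds):
        # Train only on the other folds (about 90% of the corpus for 10 folds).
        parser = train_parser([sentences[i] for i in train_idx])
        for i in held_idx:
            signals[i] = find_errors(parser, sentences[i])
    return signals
```

After all folds have run, every sentence in the gold corpus has error signals, and each was produced by a parser trained without that sentence.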

junekihong commented 11 years ago

I think I found more CoNLL data. http://conll.cemantix.org/2012/data.html

It has Arabic, English, and Chinese. I noticed, though, that the CoNLL data that Katherine had access to had the actual word in the 4th column. Here, in this data, the 4th column seems to be replaced with "[WORD]".

I think when I find some data to use, I'll put a copy online here: http://juneki.acm.jhu.edu/data/

junekihong commented 11 years ago

Hey guys. Sorry it's been a while since an update.

I spent the weekend reading and stepping through MaltParser, and I think I'm closing in on the areas where MaltParser parses sentences and makes decisions. There was a lot of code and sparse javadocs, so it wasn't as easy as I thought it would be.

I'm hoping that with a little more work tomorrow I'll nail down precisely where the parsing decisions are being made, so that I can write an oracle to place there.

karuiwu commented 11 years ago

We agreed to let me take on this issue after Juneki affirmed that he would be able to pick up the code easily (through comments and an appropriate README). Juneki will continue with the contents of this comment.

junekihong commented 11 years ago

Well. I have found both sections of code that I mentioned earlier in our Thursday meeting.

I had access to the parsing Queue and Stack, and then Katherine reminded me that the sentence tokens I was looking for should be stored there. I basically said "Oh! That makes this really easy!"

I have MaltParser printing out its parsing Stack and Queue and decision at every step. It should be a snap to put in the Oracle when we have it ready.

jeisner commented 11 years ago

Progress!

(In reply to Juneki Hong's comment above, sent Fri, Mar 29, 2013 at 10:21 PM.)

karuiwu commented 11 years ago

Current state and goals for today: Juneki has located the exact place where MaltParser decides which parser action to choose at every step. He's also made the corresponding CoNLL data/sentence available at that point in time. With minor alterations to my existing code base, which, from simple testing/debugging, prints the correct oracle action at each step, we can seamlessly combine these files with MaltParser's to verify whether or not MaltParser chooses the correct oracle action at each step.

That being said, there might be some unforeseen in-between steps required, such as making the goldparse visible in addition to the rest of the CoNLL data (whether or not the goldparse will be cut off from MaltParser is not yet known, since the parser is currently being run on test data). We may also opt to refactor our code so that we use only MaltParser's internal buffer and stack, as opposed to keeping another copy within the OracleTest class. Additionally, we may need to go on one final excursion into MaltParser to find where it stores its feature weights at every step, and make them visible to the OracleTest class (depending on how relevant the features are).

The end goal today is to have MaltParser output whether or not it has made an error at every parsing step of every sentence (essentially this issue!). Given that the rest of this issue is straightforward to complete, we will also aim to build a simple set of features for the MegaM classifier to use. Issue #3 will be our main target tomorrow.
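A rough sketch of turning per-step output into classifier input. It assumes MegaM's implicit-feature text format, in which each line is a class label followed by the names of the features that fired; that reading should be checked against the MegaM documentation before relying on it. The function names and feature strings are hypothetical.

```python
from typing import Iterable, List, Tuple


def to_megam_line(is_error: bool, features: Iterable[str]) -> str:
    """Format one parsing step as '<label> <feature> <feature> ...'."""
    label = "1" if is_error else "0"
    # Feature names are whitespace-separated on the line, so any whitespace
    # inside a name is replaced with underscores.
    cleaned = ["_".join(f.split()) for f in features]
    return " ".join([label] + cleaned)


def to_megam_lines(steps: Iterable[Tuple[bool, Iterable[str]]]) -> List[str]:
    """Format a sequence of (is_error, features) pairs, one line per step."""
    return [to_megam_line(is_error, feats) for is_error, feats in steps]
```

For example, `to_megam_line(True, ["s0=the", "b0 pos=NN"])` gives `1 s0=the b0_pos=NN`.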

jeisner commented 11 years ago

Thanks. Questions:

You say "correct oracle action." Aren't there multiple oracle actions in some cases?
You say the parser is currently being run on test data. Shouldn't the test data be sealed in an envelope till the very end of the project?? Or do you have another sealed test set? You shouldn't look at the test data!
Not sure what you mean by "from simple testing/debugging."
Not sure what you mean by "making the goldparse visible."

karuiwu commented 11 years ago

You say "correct oracle action." Aren't there multiple oracle actions in some cases?
Yes, it should have been "correct oracle action(s)."

You say the parser is currently being run on test data. Shouldn't the test data be sealed in an envelope till the very end of the project?? Or do you have another sealed test set? You shouldn't look at the test data!
Ah, when I wrote "test data," what I meant was data that isn't accompanied by any gold parses, which is what MaltParser was testing on before. Now we are testing on data that is accompanied by gold parses, and we aren't peeking a bit. :)

Not sure what you mean by "from simple testing/debugging."
I didn't test/debug the cost functions/oracle class as thoroughly as I wanted, so I'm not completely confident about the validity of the oracle outputs. At this point I could have debugged more rather than combining the two code bases (MaltParser and the oracle). However, I thought that being able to compare the oracle outputs with what MaltParser outputs would help me debug further, so I'm choosing to wait until then.

Not sure what you mean by "making the goldparse visible."
MaltParser stores the CoNLL data in a hierarchy of classes. I guessed that since the gold parse isn't necessary at test time (for obvious reasons), MaltParser would have no need to store it in its classes, and that's exactly what happened: when I tried getting the CoNLL data from MaltParser's internal class, the gold parse wasn't being stored. So I made MaltParser store the gold parse in those classes, and added a few functions to the classes to be able to get it (definitely didn't want to make a static class/global variable or another workaround; wanted clean code!)
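The idea of storing the gold parse on the sentence data and exposing it through accessors, rather than through a global, might look like the sketch below. It is in Python for brevity and does not mirror MaltParser's actual class hierarchy; `Token`, `Sentence`, and their fields are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Token:
    index: int                         # 1-based position in the sentence
    form: str                          # the word itself
    gold_head: Optional[int] = None    # index of the gold head; 0 for the root
    gold_deprel: Optional[str] = None  # gold dependency label, if known


@dataclass
class Sentence:
    tokens: List[Token] = field(default_factory=list)

    def has_gold_parse(self) -> bool:
        """True only if every token carries a gold head."""
        return bool(self.tokens) and all(
            t.gold_head is not None for t in self.tokens
        )

    def gold_arcs(self) -> Set[Tuple[int, int]]:
        """Return the gold (head, dependent) arcs for use by an oracle."""
        if not self.has_gold_parse():
            raise ValueError("sentence has no gold parse attached")
        return {(t.gold_head, t.index) for t in self.tokens}
```

Keeping the gold fields optional lets the same classes serve ordinary test-time parsing, where no gold parse exists, while `has_gold_parse` guards any code that needs one.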

jeisner commented 11 years ago

Awesome, very clear, thanks. See you tomorrow!
