CogComp / lbjava

Learning Based Java (LBJava)
http://cogcomp.cs.illinois.edu/page/software_view/LBJava
Other
13 stars 17 forks source link

- add the possibility of using a loss augmented score to the learner #61

Closed kordjamshidi closed 8 years ago

christos-c commented 8 years ago

@kordjamshidi I can't view the diff on github. Why are there "1,469 additions, 1,434 deletions"? Were the changes so massive or did you change the indentation?

kordjamshidi commented 8 years ago

Ah, I see, the reason is that my fork had been isolated from this repo and I needed to fork again and copied the whole file that I had changed from my old fork to the new one. But actually I could see diff I think.

christos-c commented 8 years ago

I'm not sure what that implies. Are you going to resubmit? I cannot merge what I cannot see.

kordjamshidi commented 8 years ago

I don't know, the difference is like when you do reformatting, it is only in spacing, I am not sure why it does not show it.

christos-c commented 8 years ago

It doesn't show it because it's too many changes. So I would suggest you undo the formatting changes and push again.

danyaljj commented 8 years ago

It's because the style change; for your convenience I created this diff based on Paris's change: https://www.diffnow.com/?report=4nbwd

The main change is this

    public ScoreSet scores(Object example) {

        Object[] exampleArray = getExampleArray(example, false);
        ScoreSet resultS = scores((int[])exampleArray[0], (double[])exampleArray[1]);
        if (!lossFlag)
            return resultS;
        else
            return scoresAugmented(example,resultS);
    }

    public ScoreSet scoresAugmented(Object example, ScoreSet resultS ) {
        ScoreSet temp= new ScoreSet();
        Lexicon lLexicon = getLabelLexicon();
        String gold = getLabeler().discreteValue(example);
        for (int i = 0; i < lLexicon.size(); i++)
            if ((lLexicon.lookupKey(i).valueEquals(gold)))
                temp.put(lLexicon.lookupKey(i).getStringValue(), ( (resultS.getScore(lLexicon.lookupKey(i).getStringValue()).score - (1 / (double)(candidates)))));
            else
                temp.put(lLexicon.lookupKey(i).getStringValue(), resultS.getScore(lLexicon.lookupKey(i).getStringValue()).score + (1 / (double)(candidates)));

        return temp;
    }

I'm not very clear why what we are doing in scoresAugmented. Could you clarify please?

PS. We should add auto-formatter to lbjava as well, otherwise we will have more of these issues in future.

kordjamshidi commented 8 years ago

Thanks @danyaljj, for the diff. About the method, it is basically to update the score of each binary variable (label) based on the gold value of each example for that variable. Since I use SparseNetwork to keep the model there is a ltu for each lable, if the gold is same as a specific label then its binary value for that lable is 1 and the score for that lable will be oldscore - lossOffset and otherwise it will be zero and the score will beoldscore+lossoffset. It is like this if you do some simple algebraic operations to simplify the objective that include a hmming loss for each label, you can see the type of objective of the loss-augmented inference in my PhD thesis here, page 173: https://www.researchgate.net/publication/269519103_Structured_Machine_Learning_for_Mapping_Natural_Language_to_Spatial_Ontologies

danyaljj commented 8 years ago

@kordjamshidi could you update this with upstream/master?

kordjamshidi commented 8 years ago

Do you want to merge it?

danyaljj commented 8 years ago

potentially, yes.

kordjamshidi commented 8 years ago

I have had difficulties to upgrade, there are too many edits in upstream and conflicts.

cowchipkid commented 8 years ago

We need to make sure we merge our forks with our upstream sources before we issue PRs. I created this same issue for Mark when I made just a few changes on my fork which was badly out of sync with the rapidly changing IllinoisCogComp/illinois-core-nlp. If this is the issue, see https://help.github.com/articles/syncing-a-fork/ for helpful instructions.

kordjamshidi commented 8 years ago

When I issued my PR everything was up to date but it was not merged until now that the upstream have had a lot of modifications. I guess there should be some priorities for merging according to the timing of the PRs as well as the amount of changes.

christos-c commented 8 years ago

Replaced by #75