HazyResearch / dd-genomics

The Genomics DeepDive project
Apache License 2.0
The gp relations extracted should have a high gene expectation #333

Open ThomasPalomares opened 8 years ago

ThomasPalomares commented 8 years ago

Currently, when looking at our final results (for precision computation or other purposes), we only apply the condition gp.expectation > 0.9, yet many of the resulting relations have a low gene.expectation. On the full document set, 400 gp relations have a gp expectation above 0.9 but a gene expectation below 0.9, and 60 of those have a gene expectation below 0.5.
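A minimal sketch of the check described above, assuming each extracted relation carries both expectations as plain floats. The `GpRelation` class, its field names, and the `split_by_gene_expectation` helper are hypothetical and are not taken from the dd-genomics code; the 0.9 defaults simply mirror the numbers quoted in this issue.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GpRelation:
    # Hypothetical record: field names are illustrative, not the project's schema.
    relation_id: str
    gp_expectation: float
    gene_expectation: float


def split_by_gene_expectation(
    relations: List[GpRelation],
    gp_threshold: float = 0.9,
    gene_threshold: float = 0.9,
) -> Tuple[List[GpRelation], List[GpRelation]]:
    """Split confident gp relations by whether their gene mention is also confident."""
    # Current behaviour described in the issue: filter on gp expectation only.
    confident_gp = [r for r in relations if r.gp_expectation > gp_threshold]
    # Proposed extra condition: the gene mention must be confident as well.
    kept = [r for r in confident_gp if r.gene_expectation > gene_threshold]
    # "True gp - false g" candidates: confident relation, unconfident gene.
    suspect = [r for r in confident_gp if r.gene_expectation <= gene_threshold]
    return kept, suspect
```

Counting `len(suspect)` over the real output would correspond to the 400 figure above, under the assumption that the same thresholds are used.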

ThomasPalomares commented 8 years ago

I went through 20 of the 60 "true gp - false g" cases: 19 of them were actually false, and for all of them either the gp relation is extracted (with another true gene mention in the sentence) or there is nothing to extract. More details: https://docs.google.com/spreadsheets/d/1I4oU-PRvsem1Rb4ePWilKOwYRyZtdfswYwTYxkG2A_I/edit?usp=sharing

ThomasPalomares commented 8 years ago

More analysis is needed to find the right expectation threshold to use for gene_mention.

ajratner commented 8 years ago

This might be some evidence that we should experiment with adding the joint factors back in. Maybe start by playing around with that a bit?

chrismre commented 8 years ago

I would not tune the threshold... just add in more factors and training data =) The threshold is meaningful, not something to tune :)

Colossus commented 8 years ago

The joint factors ARE enabled. They have never been disabled. I suggested allowing negative supervised genes in gps again today and there's a separate issue for that

ThomasPalomares commented 8 years ago

@Colossus I think they were talking about the joint factors we discussed last Friday, which are not enabled yet (we only have entity linking in the inference rules). But sure, let's try joint factors first! It was more of an open remark than an issue.