biosemantics / micropie2


USP (Unsupervised Semantic Parsing) cannot handle sentences with parentheses or brackets #2

Open elviscat opened 10 years ago

elviscat commented 10 years ago

For this kind of sentence: Resistant to kanamycin (30 μg), gentamicin (10 μg), neomycin (30 μg) and polymyxin B (300 μg), but sensitive to ampicillin (10 μg), penicillin (10 IU), streptomycin (10 μg) and tetracycline (30 μg).

Because of the parentheses or brackets, USP cannot correctly extract the prep_to objects when it follows our rule (Keyword: resistant, keyword type: J, extract object type: prep_from, extract method: dep).

The following are the sentence's dependencies and parse tree:

= = = = = = = = = = SD (Stanford Dependency, Collapsed):

prep_to(Resistant-1, c4-Antibiotics-3)
num(μg-6, 30-5)
appos(c4-Antibiotics-3, μg-6)
appos(Resistant-1, c4-Antibiotics-9)
num(μg-12, 10-11)
appos(c4-Antibiotics-9, μg-12)
nn(c4-Antibiotics-31, c4-Antibiotics-15)
num(μg-18, 30-17)
appos(c4-Antibiotics-15, μg-18)
nn(B-22, c4-Antibiotics-21)
conj_and(c4-Antibiotics-15, B-22)
nn(c4-Antibiotics-31, B-22)
num(μg-25, 300-24)
appos(B-22, μg-25)
conj_but(c4-Antibiotics-15, sensitive-29)
nn(c4-Antibiotics-31, sensitive-29)
dep(sensitive-29, to-30)
appos(Resistant-1, c4-Antibiotics-31)
conj_and(c4-Antibiotics-9, c4-Antibiotics-31)
num(μg-34, 10-33)
appos(c4-Antibiotics-31, μg-34)
appos(Resistant-1, c4-Antibiotics-37)
conj_and(c4-Antibiotics-9, c4-Antibiotics-37)
num(IU-40, 10-39)
appos(c4-Antibiotics-37, IU-40)
appos(Resistant-1, c4-Antibiotics-43)
conj_and(c4-Antibiotics-9, c4-Antibiotics-43)
num(μg-46, 10-45)
appos(c4-Antibiotics-43, μg-46)
appos(Resistant-1, c4-Antibiotics-49)
conj_and(c4-Antibiotics-9, c4-Antibiotics-49)
num(μg-52, 30-51)
appos(c4-Antibiotics-49, μg-52)

= = = = = = = = = = Parse tree:

(ROOT (NP (NP (JJ Resistant)) (PP (TO to) (NP (NP (NNS c4-Antibiotics)) (PRN (-LRB- -LRB-) (NP (CD 30) (NN μg)) (-RRB- -RRB-)))) (, ,) (NP (NP (NP (NNS c4-Antibiotics)) (PRN (-LRB- -LRB-) (NP (CD 10) (NN μg)) (-RRB- -RRB-))) (, ,) (NP (NP (NP (NP (NP (NP (NNS c4-Antibiotics)) (PRN (-LRB- -LRB-) (NP (CD 30) (NN μg)) (-RRB- -RRB-))) (CC and) (NP (NP (NN c4-Antibiotics) (NN B)) (PRN (-LRB- -LRB-) (NP (CD 300) (NN μg)) (-RRB- -RRB-)))) (, ,) (CC but) (ADJP (JJ sensitive) (TO to))) (NNS c4-Antibiotics)) (PRN (-LRB- -LRB-) (NP (CD 10) (NN μg)) (-RRB- -RRB-))) (, ,) (NP (NP (NNS c4-Antibiotics)) (PRN (-LRB- -LRB-) (NP (CD 10) (NN IU)) (-RRB- -RRB-))) (, ,) (NP (NP (NNS c4-Antibiotics)) (PRN (-LRB- -LRB-) (NP (CD 10) (NN μg)) (-RRB- -RRB-))) (CC and) (NP (NP (NNS c4-Antibiotics)) (PRN (-LRB- -LRB-) (NP (CD 30) (NN μg)) (-RRB- -RRB-)))) (. .)))

= = = = = = = = = =
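To make the failure concrete, here is a minimal Python sketch that reads the first few collapsed dependencies from the SD output above and lists what is attached to `Resistant-1`. It is only an illustration and not part of micropie2 or USP; the names `parse_dependencies` and `dependents` are hypothetical helpers introduced for this example. It suggests that only the first antibiotic is reachable through `prep_to`, while the next one is attached as `appos`, so a rule that looks only at the preposition relation would miss it.

```python
import re

# First few collapsed dependencies, copied from the SD output above.
SD_EXCERPT = (
    "prep_to(Resistant-1, c4-Antibiotics-3) num(μg-6, 30-5) "
    "appos(c4-Antibiotics-3, μg-6) appos(Resistant-1, c4-Antibiotics-9) "
    "num(μg-12, 10-11) appos(c4-Antibiotics-9, μg-12)"
)

# Shape of one entry: relation(governor-index, dependent-index)
DEP_RE = re.compile(r"(\w+)\((\S+)-(\d+), (\S+)-(\d+)\)")


def parse_dependencies(text):
    """Return (relation, governor, gov_index, dependent, dep_index) tuples."""
    return [
        (rel, gov, int(gi), dep, int(di))
        for rel, gov, gi, dep, di in DEP_RE.findall(text)
    ]


def dependents(deps, relation, governor, gov_index):
    """List the dependents attached to one governor token by a given relation."""
    return [
        (dep, di)
        for rel, gov, gi, dep, di in deps
        if rel == relation and gov == governor and gi == gov_index
    ]


deps = parse_dependencies(SD_EXCERPT)
# Objects that a preposition-based rule can see for the keyword.
print(dependents(deps, "prep_to", "Resistant", 1))
# Antibiotics that the parser attached as appositions instead.
print(dependents(deps, "appos", "Resistant", 1))
```

Running the same helpers over the full dependency list would show the remaining antibiotics attached through `appos` and `conj_and` rather than through a preposition relation.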

Current comment: I will add this issue to the to-do list and keep looking for a possible solution.
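One candidate workaround, sketched here under assumptions, is to strip the parenthesized or bracketed spans before the sentence reaches the parser, so that the antibiotics form a plain coordination. This is not confirmed as the fix used in micropie2; the function name `strip_parentheticals` is hypothetical, the pattern handles only non-nested spans, and it discards the dosage values, which would have to be captured separately if they are needed.

```python
import re

# A non-nested (...) or [...] span, together with any whitespace before it.
PARENTHETICAL_RE = re.compile(r"\s*(\([^()]*\)|\[[^\[\]]*\])")


def strip_parentheticals(sentence: str) -> str:
    """Remove (...) and [...] spans so the parser sees a plain coordination."""
    return PARENTHETICAL_RE.sub("", sentence)


sentence = (
    "Resistant to kanamycin (30 μg), gentamicin (10 μg), neomycin (30 μg) "
    "and polymyxin B (300 μg), but sensitive to ampicillin (10 μg), "
    "penicillin (10 IU), streptomycin (10 μg) and tetracycline (30 μg)."
)
cleaned = strip_parentheticals(sentence)
print(cleaned)
```

Whether the parser then produces the expected prep_to and conjunction relations for the cleaned sentence would still need to be checked against the actual pipeline.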

lmoore207 commented 8 years ago

I believe this issue has been taken care of. If so, perhaps it can be removed from the list of issues?