own-pt / rte-sick

RTE Experiment

Missing from FreeLing? #49

Open vcvpaiva opened 7 years ago

vcvpaiva commented 7 years ago

The verb "laze" is in PWN:

02417504-v slug, idle, stagnate, laze
    (be idle; exist in a changeless situation; "The old man sat and stagnated on his porch"; "He slugged in bed all morning")

but the processing doesn't find it for the sentence "A man is lazing."

1 A a DET DT 2 det DT|?|?
2 man man NOUN NN 4 nsubj NN|02472293-n|Hominid=
3 is be VERB VBZ 4 aux VBZ|02604760-v|Entity+
4 lazing lazing VERB VBG 0 ROOT VBG|?|?

vcvpaiva commented 7 years ago

The noun "snowsuit" is in PWN:

04252560-n snowsuit
    (a child's overgarment for cold weather)

but it's not found for the sentences it appears in, e.g.

Two people in snowsuits are lying in the snow and making snow angels.

1 Two 2 NUM CD 2 num Z|?|?
2 people people NOUN NNS 6 nsubj NNS|07942152-n|GroupOfPeople=
3 in in ADP IN 2 prep IN|?|?
4 snowsuits snowsuits NOUN NNS 3 pobj NNS|?|?
5 are be VERB VBP 6 aux VBP|02604760-v|Entity+
6 lying lie VERB VBG 0 ROOT VBG|02690708-v|PhysicalAttribute+
7 in in ADP IN 6 prep IN|?|?
8 the the DET DT 9 det DT|?|?
9 snow snow NOUN NN 7 pobj NN|15043763-n|Snowing=
10 and and CONJ CC 6 cc CC|?|?
11 making make VERB VBG 6 conj VBG|02621395-v|Attribute+
12 snow snow NOUN NN 13 nn NN|15043763-n|Snowing=
13 angels angel NOUN NNS 11 dobj NNS|09538915-n|Angel=
14 . . . . 6 punct Fp|?|?
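
The analyzer rows quoted in this thread look like eight whitespace-separated fields per token: index, form, lemma, coarse POS, tag, head, dependency label, and a `tag|synset|SUMO` triple. That reading is an assumption drawn from the examples here, not from FreeLing documentation. Under that assumption, a minimal Python sketch for listing content words that got no synset (`Token`, `parse_tokens` and `unresolved` are hypothetical names):

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    idx: int
    form: str
    lemma: str
    pos: str
    tag: str
    head: int
    deprel: str
    synset: str  # "?" when no sense was assigned
    sumo: str    # "?" when no SUMO concept was assigned


def parse_tokens(text: str) -> List[Token]:
    """Split analyzer output into tokens, assuming 8 fields per token."""
    fields = text.split()
    if len(fields) % 8 != 0:
        raise ValueError("expected a multiple of 8 fields, got %d" % len(fields))
    tokens = []
    for i in range(0, len(fields), 8):
        idx, form, lemma, pos, tag, head, deprel, extra = fields[i:i + 8]
        # The last field is assumed to be "tag|synset|SUMO".
        _, synset, sumo = extra.split("|")
        tokens.append(Token(int(idx), form, lemma, pos, tag,
                            int(head), deprel, synset, sumo))
    return tokens


CONTENT_POS = {"NOUN", "VERB", "ADJ", "ADV"}


def unresolved(tokens: List[Token]) -> List[Token]:
    """Content words for which no synset was found."""
    return [t for t in tokens if t.pos in CONTENT_POS and t.synset == "?"]


sample = (
    "1 A a DET DT 2 det DT|?|? "
    "2 man man NOUN NN 4 nsubj NN|02472293-n|Hominid= "
    "3 is be VERB VBZ 4 aux VBZ|02604760-v|Entity+ "
    "4 lazing lazing VERB VBG 0 ROOT VBG|?|?"
)
missing = unresolved(parse_tokens(sample))
print([t.form for t in missing])  # ['lazing']
```

Because the parser only counts fields, it works whether the rows are on one line or one per line. In both examples above the unresolved token also has a lemma identical to its surface form ("lazing", "snowsuits"), which may be a useful hint when hunting for similar cases.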

vcvpaiva commented 7 years ago

@fcbr am I right that these are lemmatization issues?

arademaker commented 7 years ago

The verb "laze" is not in the FreeLing (FL) dictionary; I suspect the FL Unknown Word Guesser module was used instead.

$ grep laze *
adjs:ablaze ablaze JJ
noms:blaze blaze NN
noms:blazer blazer NN
noms:blazers blazer NNS
noms:blazes blaze NNS
noms:trailblazers trailblazer NNS
noms:trailblazer trailblazer NN
verbs:blaze blaze VB
verbs:blaze blaze VBP
verbs:blazed blaze VBD
verbs:blazed blaze VBN
verbs:blazes blaze VBZ
verbs:blazing blaze VBG
verbs:glazed glaze VBD
verbs:glazed glaze VBN
verbs:glaze glaze VB
verbs:glaze glaze VBP
verbs:glazes glaze VBZ
verbs:glazing glaze VBG

urca:entries arademaker$ grep lazing *
noms:glazing glazing NN
noms:glazings glazing NNS
verbs:blazing blaze VBG
verbs:glazing glaze VBG
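
The `grep` calls above match substrings, so "laze" also hits "blaze" and "glaze". A small Python sketch of an exact-form lookup, assuming each dictionary entry is a `form lemma TAG` line as the output above suggests (`lookup` is a hypothetical helper, and the entries are copied from the listing rather than read from the real files):

```python
def lookup(lines, form):
    """Return (lemma, tag) pairs for entries whose form matches exactly."""
    hits = []
    for line in lines:
        parts = line.split()
        # Assumed entry layout: form, lemma, tag.
        if len(parts) == 3 and parts[0] == form:
            hits.append((parts[1], parts[2]))
    return hits


# A few verb entries taken from the grep output above.
verbs = [
    "blazing blaze VBG",
    "glazing glaze VBG",
    "glaze glaze VB",
    "glaze glaze VBP",
]

print(lookup(verbs, "glazing"))  # [('glaze', 'VBG')]
print(lookup(verbs, "lazing"))   # []
```

An empty result for "lazing" is what the missing dictionary entry would look like under this layout.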

arademaker commented 7 years ago

Just made a PR to Padro: https://github.com/TALP-UPC/FreeLing/pull/44

vcvpaiva commented 7 years ago

Thanks! I wonder whether they tried to cover everything in PWN and a few words escaped, or whether they never tried to stay in sync. Do you know?

vcvpaiva commented 7 years ago

Here are some others that I posted in #52, from @arademaker's original list.

Should we bother adding to FreeLing the ones that don't exist in PWN (shirtless, biker, corndogs, etc.)?

Listing the ones I know are present in PWN:

13 (occs) scissors NOUN (lemmatization issue) 
11 snowboarding VERB (both noun and verb in PWN, not the sport though)
9 biking VERB
8 backbends NOUN
7 tambourines NOUN
5 inflatable ADJ
4 rollerbladers NOUN
4 pajamas NOUN
4 gymnastic ADJ (same as scissors? PWN only has the plural?)
3 deboning VERB (lemma should be debone)

7 wetsuit NOUN (only wet_suit in PWN; maybe they need to add it there). Same as 01940248-v water_ski (ride water skis).

bellbottoms are in PWN as the adjective (of trousers) but not the noun.
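
For spelling variants such as wetsuit versus wet_suit, one possible check is to compare lemmas after removing underscores, hyphens and spaces. The sketch below is only an illustration: `squash` and `build_index` are hypothetical helpers, and `pwn_lemmas` is a tiny hand-made stand-in for the real PWN lemma list.

```python
def squash(lemma):
    """Normalize a lemma by lowercasing and dropping separators."""
    return lemma.lower().replace("_", "").replace("-", "").replace(" ", "")


def build_index(lemmas):
    """Map each squashed spelling to the original lemmas that produce it."""
    index = {}
    for lemma in lemmas:
        index.setdefault(squash(lemma), []).append(lemma)
    return index


# Stand-in for the PWN lemma list; only lemmas mentioned in this thread.
pwn_lemmas = ["wet_suit", "water_ski", "snowsuit"]
index = build_index(pwn_lemmas)

print(index.get(squash("wetsuit"), []))      # ['wet_suit']
print(index.get(squash("bellbottoms"), []))  # []
```

A hit under a different spelling suggests a variant to add on one side or the other, while an empty list means the word is missing from the stand-in list altogether.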