own-pt / rte-sick

RTE Experiment

Stuff to add to FreeLing #66

Open vcvpaiva opened 7 years ago

vcvpaiva commented 7 years ago

@arademaker added `snowsuit` to FreeLing, and Padro said it should be handled as a compound. There are many other words like it; I am listing them here: backbend, jetski.

arademaker commented 7 years ago

After adding `snowsuit` to the English dictionary, we get:

# echo 'Two people in snowsuits laying in the snow making snow angels.' | analyze -f en.cfg
Two 2 Z 0.99991 -
people people NNS 0.499927 07942152-n:0.00801849/08160276-n:0.00742026/08180190-n:0.00705196/07971141-n:0.00635396
in in IN 0.987107 -
snowsuits snowsuit NNS 1 04252560-n:0.0259492
laying lay VBG 1 01494310-v:0.00663939/01544692-v:0.00614074/01545079-v:0.00574547/01651972-v:0.00563151/02307261-v:0.00541222
in in IN 0.987107 -
the the DT 1 -
snow snow NN 0.962264 15043763-n:0.00747162/11508382-n:0.00731242/03066743-n:0.00656597/11307082-n:0.00649843
making make VBG 0.932973 01617192-v:0.00147887/00120316-v:0.00129815/02621395-v:0.00129683/02745332-v:0.00114225/02665124-v:0.00113237/01755816-v:0.000997158/01621555-v:0.000973614/02020590-v:0.000920043/01654628-v:0.000825958/01664172-v:0.000801833/02598483-v:0.000782597/00698256-v:0.00077354/00770437-v:0.000769178/00730758-v:0.000765123/00698104-v:0.000765122/01653873-v:0.000748076/01755504-v:0.000724352/01640207-v:0.000716487/02289295-v:0.000694905/00276068-v:0.000671645/02355596-v:0.000669196/00072012-v:0.000651992/00074038-v:0.000640173/00556855-v:0.000619977/01645601-v:0.000619338/00838524-v:0.000616114/01733477-v:0.000612129/02051031-v:0.0005932/00665476-v:0.000587209/01428578-v:0.000584381/02748627-v:0.000581578/02748759-v:0.0005814/02582921-v:0.000580658/01646075-v:0.000578495/02396716-v:0.000571256/00012267-v:0.000562772/02560585-v:0.000560939/02021532-v:0.000560252/01619014-v:0.000556686/02075857-v:0.000556253/02022162-v:0.000553187/00562067-v:0.000552732/00545953-v:0.000551557/00107369-v:0.000550027/02674708-v:0.000542244/00891038-v:0.000542051/02621133-v:0.000527112/02134050-v:0.000515284/00562182-v:0.000511604
snow snow NN 0.962264 15043763-n:0.00747162/11508382-n:0.00731242/03066743-n:0.00656597/11307082-n:0.00649843
angels angel NNS 1 09538915-n:0.0073699/10546850-n:0.00713584/09197660-n:0.00662789/09793717-n:0.00659242
. . Fp 1 -

I haven't added `backbend(s)` to the dictionary, but I turned on the compound analysis:

# echo 'Four children are doing backbends in the gym.' | analyze -f en.cfg
Four 4 Z 0.999648 -
children child NNS 1 09917593-n:0.00955591/09918248-n:0.0082921/09918554-n:0.00816267/09918762-n:0.00767976
are be VBP 1 02604760-v:0.00343593/02655135-v:0.00277136/02702508-v:0.00272231/02603699-v:0.00271386/02445925-v:0.0026967/02620587-v:0.0026879/02744820-v:0.00248675/02664769-v:0.00247956/02268246-v:0.00246787/02614181-v:0.00243165/02697725-v:0.00242333/02749904-v:0.00232072/02616386-v:0.00230945
doing do VBG 0.999182 01712704-v:0.00286462/02561995-v:0.0027777/00010435-v:0.0027659/02669789-v:0.00276031/02568672-v:0.0027397/01645601-v:0.00254672/02523221-v:0.00253454/02617567-v:0.00253097/01619014-v:0.00251956/02560585-v:0.00250854/02709107-v:0.00249544/00038849-v:0.00247717/01841772-v:0.00234771
backbends back_bend NNS 0.72824 -
in in IN 0.987107 -
the the DT 1 -
gym gym NN 1 03472112-n:0.0314372
. . Fp 1 -
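
For downstream use, the output above can be read back into structured records. This is only a minimal sketch: it assumes each token line has exactly the five whitespace-separated fields seen in these two runs (form, lemma, tag, probability, and either `-` or a `/`-separated list of `synset:score` pairs). The names `Token`, `parse_line` and `parse_output` are made up for illustration and are not part of FreeLing.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    form: str
    lemma: str
    tag: str
    prob: float
    senses: List[Tuple[str, float]]  # (synset id, score) pairs, empty if "-"


def parse_line(line: str) -> Token:
    # Assumed layout: form lemma tag prob senses
    form, lemma, tag, prob, senses = line.split(maxsplit=4)
    pairs = []
    if senses.strip() != "-":
        for item in senses.strip().split("/"):
            synset, score = item.rsplit(":", 1)
            pairs.append((synset, float(score)))
    return Token(form, lemma, tag, float(prob), pairs)


def parse_output(text: str) -> List[Token]:
    # Expects token lines only (no shell prompt line); blank lines are skipped.
    return [parse_line(l) for l in text.splitlines() if l.strip()]


sample = """backbends back_bend NNS 0.72824 -
gym gym NN 1 03472112-n:0.0314372
. . Fp 1 -"""

tokens = parse_output(sample)
for t in tokens:
    print(t.form, t.lemma, t.tag, len(t.senses))
```

With records like these it is easy to list the tokens that came back without any sense, such as `backbends` here.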

The only problem is that the `analyze` utility does not report which kind of analysis was used for each word. See https://github.com/TALP-UPC/FreeLing/issues/33

@fcbr may add this in the Python code.
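
Until the analysis kind is exposed, a rough guess can be made from the output itself. The sketch below is a heuristic, not a FreeLing feature: it assumes that a lemma containing `_` for a form that has no `_` (as in `backbends back_bend`) came from the compound module. The function name `looks_like_compound` is hypothetical, and the rule may misfire on other analyses that also produce underscores.

```python
def looks_like_compound(line: str) -> bool:
    # Assumed layout of an `analyze` token line: form lemma tag prob senses
    fields = line.split(maxsplit=4)
    if len(fields) < 2:
        return False
    form, lemma = fields[0], fields[1]
    # Heuristic (assumption): the lemma was split into parts joined by "_"
    # while the surface form is a single unsplit word.
    return "_" in lemma and "_" not in form


lines = [
    "backbends back_bend NNS 0.72824 -",
    "snowsuits snowsuit NNS 1 04252560-n:0.0259492",
    "gym gym NN 1 03472112-n:0.0314372",
]

flagged = [l.split()[0] for l in lines if looks_like_compound(l)]
print(flagged)
```

On the lines from the two runs above, only `backbends` is flagged; `snowsuits` is not, because it now comes straight from the dictionary.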

arademaker commented 7 years ago

Waiting for more info from Padro about the compound module:

https://github.com/TALP-UPC/FreeLing/pull/44