moreymat / omw-graph

The Open Multilingual Wordnet in a graph database
MIT License

Batch english #12

Closed rhin0cer0s closed 10 years ago

rhin0cer0s commented 10 years ago

Last improvement for milestone 0.1. Today we have:

The nodes produced do not contain synsets. Should we add them?

We just saw that http://compling.hss.ntu.edu.sg/omw/ now provides XML files. Should we head toward LMF support now, or first test the database (consistency, translation gaps, ...)?

moreymat commented 10 years ago

We are indeed approaching milestone 0.1.

Whether we shift to XML files now or later depends on @fcbond's answer to issue #2.

moreymat commented 10 years ago

The discussion on #13 seems to indicate that the safest solution for milestone 0.1 is to keep all non-lexicalized synsets and their relations. Future work in 0.2 will aim at removing unneeded nodes and relations.

What do you think, @zorgulle @rhin0cer0s ?

After we decide on this and make changes to the code accordingly, I will:

  1. merge this PR into master
  2. follow the installation and run procedure
  3. if successful, declare milestone 0.1 reached.

rhin0cer0s commented 10 years ago

Okay, so I updated INSTALL and some of the README files. Synsets without a lexical connection are now inserted into syn-xxx.csv so that relations can be built (a few relations are still skipped by the importer and I don't know why; it is only 5 out of 248704, so I hope it won't be a problem).
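One way to start tracking down the skipped relations is to look for relation rows whose endpoints are missing from the synset file. The sketch below is only an illustration under assumptions: the column names (`id`, `start`, `end`, `type`), the sample ids, and the helper `find_dangling_relations` are hypothetical and not taken from the project's actual CSV layout, and missing endpoints are just one possible cause of skipped rows.

```python
import csv
import io


def find_dangling_relations(synset_rows, relation_rows):
    """Return relation rows whose start or end id is not a known synset id.

    Column names ("id", "start", "end") are assumptions for this sketch.
    """
    known_ids = {row["id"] for row in synset_rows}
    return [
        rel
        for rel in relation_rows
        if rel["start"] not in known_ids or rel["end"] not in known_ids
    ]


# Tiny in-memory stand-ins for the real CSV files (made-up sample data).
SYNSETS_CSV = "id\n00001740-n\n00001930-n\n"
RELATIONS_CSV = (
    "start,end,type\n"
    "00001740-n,00001930-n,hypo\n"
    "00001740-n,99999999-n,hypo\n"
)

synsets = list(csv.DictReader(io.StringIO(SYNSETS_CSV)))
relations = list(csv.DictReader(io.StringIO(RELATIONS_CSV)))

dangling = find_dangling_relations(synsets, relations)
for rel in dangling:
    print(rel["start"], rel["type"], rel["end"])
```

With real files, the two `io.StringIO` objects would be replaced by opened CSV files. If this check reports nothing, the skipped rows probably have another cause, such as duplicates or malformed lines.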

I think we are good to go for milestone 0.1.

moreymat commented 10 years ago

OK, I will merge the PR into master now.

@rhin0cer0s could you open an issue for 0.2, tagged "bug", about the skipped synsets?