nltk / wordnet

Stand-alone WordNet API

Using a database #8

Closed frankier closed 4 years ago

frankier commented 5 years ago

Firstly, this refactoring and the improvements that come with it are surely a good idea! Having used this code as it was in NLTK, I think an overall cleanup is very welcome. As you mentioned in a comment on the NLTK tracker, perhaps something that loads the data into an actual database would be even better. It should be both prettier and faster than seeking through a bunch of files. It should also be possible to offer the same interface on top of either the raw files or the database, if desired. So I thought I'd open an issue to track it. I'm quite a fan of SQLAlchemy for this kind of thing, since it lets you easily switch between SQLite and PostgreSQL.
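To make the "same interface on top of the raw file or the database" idea concrete, here is a rough sketch. It is only an illustration under stated assumptions: the `LemmaLookup` interface, the `FileLookup` and `SqliteLookup` classes, and the tab-separated toy data are all hypothetical and are not the real WNDB format or any existing API. It also uses the standard-library `sqlite3` module instead of SQLAlchemy to stay dependency-free; with SQLAlchemy the database backend could presumably target PostgreSQL as well.

```python
from __future__ import annotations

import io
import sqlite3
from typing import Iterable, Protocol, TextIO


class LemmaLookup(Protocol):
    """Hypothetical shared interface; not part of NLTK or this repo."""

    def definitions(self, lemma: str) -> list[str]: ...


# Toy data in a made-up "lemma<TAB>definition" layout, NOT the real WNDB format.
TOY_DATA = (
    "dog\ta domesticated canid\n"
    "dog\tto follow persistently\n"
    "cat\ta small feline\n"
)


def parse_rows(handle: TextIO) -> Iterable[tuple[str, str]]:
    """Yield (lemma, definition) pairs from the toy flat-file layout."""
    for line in handle:
        line = line.rstrip("\n")
        if line:
            lemma, definition = line.split("\t", 1)
            yield lemma, definition


class FileLookup:
    """Backend that rescans the flat file on every query."""

    def __init__(self, text: str) -> None:
        self._text = text

    def definitions(self, lemma: str) -> list[str]:
        rows = parse_rows(io.StringIO(self._text))
        return [definition for found, definition in rows if found == lemma]


class SqliteLookup:
    """Backend that loads the same rows once into an indexed SQLite table."""

    def __init__(self, text: str) -> None:
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE sense ("
            "id INTEGER PRIMARY KEY, lemma TEXT, definition TEXT)"
        )
        self._conn.execute("CREATE INDEX sense_lemma ON sense (lemma)")
        self._conn.executemany(
            "INSERT INTO sense (lemma, definition) VALUES (?, ?)",
            parse_rows(io.StringIO(text)),
        )
        self._conn.commit()

    def definitions(self, lemma: str) -> list[str]:
        cursor = self._conn.execute(
            "SELECT definition FROM sense WHERE lemma = ? ORDER BY id",
            (lemma,),
        )
        return [row[0] for row in cursor]


def show(backend: LemmaLookup, lemma: str) -> None:
    """Calling code depends only on the shared interface."""
    print(type(backend).__name__, backend.definitions(lemma))


if __name__ == "__main__":
    for backend in (FileLookup(TOY_DATA), SqliteLookup(TOY_DATA)):
        show(backend, "dog")
```

Callers such as `show` never need to know which backend they were handed, so the file-based parser could stay as the reference implementation while a database backend is developed alongside it.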

BTW apart from Francis Bond's group's database for Japanese WordNet I found some other stuff:

https://github.com/aistairc/trf/blob/master/trf/wordnet.py
https://sourceforge.net/projects/sqlunet/

alvations commented 5 years ago

Yes, there has been some consideration of moving the NLTK API onto an SQL database. It has been discussed with @fcbond, but further work is necessary before it can officially replace NLTK's wordnet API.

Regardless, the WNDB format is still supported in some form by https://github.com/globalwordnet/english-wordnet, so a proper parser needs to be written, hence this library.

The most painful criticism of the NLTK parser is that it reads the definitions and example sentences wrongly, and that's not acceptable =)

goodmami commented 4 years ago

@frankier for an SQL-backed wordnet library, see here: https://github.com/goodmami/wn

It is a completely new library, so some things are fundamentally different from NLTK or this repo, but some of the API is deliberately the same so as to ease the transition. Work is ongoing, but it's pretty functional now.

goodmami commented 4 years ago

I'm going to close this since I don't expect this module to also develop a SQL back-end. For that, see the other repo.