clips / wordkit

Featurize words into orthographic and phonological vectors.
GNU General Public License v3.0

Add more corpora #3

stephantul closed 6 years ago

stephantul commented 6 years ago

We currently offer a nice selection of corpora, but don't offer:

Adding Lexique would be especially nice because it would give us a CELEX-like database for French (including, e.g., syllables and frequency counts).

stephantul commented 6 years ago

Most of the SUBTLEX corpora have been added.

stephantul commented 6 years ago

The CELEX word corpora have been added.

stephantul commented 6 years ago

Added Lexique.

stephantul commented 6 years ago

Added Buscar Palabras (bpal), giving us a Spanish corpus.