This is a Spanish version of Semantle.
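Download the Word2vec data set (see the citation below) to the data directory. Unzip it.
Download the frequent words data set (CREA_total.ZIP) from Corpus de Referencia del Español Actual (CREA) - Listado de frecuencias to the data directory. Do not unzip it.
python3 -m venv .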
source bin/activate
python3 -m pip install -r requirements.txt
python3 dump-vecs.py (takes ~5 min on a 2.4 GHz Intel Core i5 MacBook Pro)
python3 dump-hints.py (takes ~30 min on a 2.4 GHz Intel Core i5)
python3 store-hints.py (fast)
british.py
python3 semantle.py
TBD
./start_server_prod.sh
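The setup notes above say to leave CREA_total.ZIP zipped. Python's standard zipfile module can read members of an archive in place, so a quick check before running the dump scripts might look like the sketch below. The helper names, the latin-1 encoding, and the idea that the scripts read the archive this way are all assumptions for illustration, not something taken from this repository.

```python
import zipfile
from itertools import islice
from pathlib import Path


def list_archive_members(archive_path):
    """Return the member names of a ZIP archive without extracting it."""
    path = Path(archive_path)
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; download it first")
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def read_first_lines(archive_path, member, count=5, encoding="latin-1"):
    """Read the first `count` lines of one member straight from the archive.

    The default encoding is an assumption; adjust it if the text looks garbled.
    """
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(member) as raw:
            # archive.open() yields bytes lines, so decode each one.
            return [
                line.decode(encoding).rstrip("\r\n")
                for line in islice(raw, count)
            ]
```

For example, list_archive_members("data/CREA_total.ZIP") shows what the archive contains, and passing one of those names to read_first_lines previews the file without unpacking anything into the data directory.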
Original Semantle code by David Turner. Changes:
dump-hints.py performance
Word2vec data set by Cristian Cardellino. Citation:
Cristian Cardellino: Spanish Billion Words Corpus and Embeddings (March 2016), https://crscardellino.github.io/SBWCE/
Frequent words data set from Corpus de referencia del español actual. Citation:
REAL ACADEMIA ESPAÑOLA: Banco de datos (CREA) [en línea]. Corpus de referencia del español actual. http://www.rae.es [2022-02-25]