digling / intelligibility

multilingual vectors #1

Closed. justalingwist closed this issue 9 months ago

justalingwist commented 1 year ago

The vectors I thought we could use for the project are here: https://github.com/commonsense/conceptnet-numberbatch

LinguList commented 1 year ago

Where exactly are they? Which files? I could not find any concrete download links there...

LinguList commented 1 year ago

I see, thanks. They use a shell script to download them, so we can do that as well.

justalingwist commented 1 year ago

Sorry, I realised I copied the wrong link (the one to the main page). Here is the download link: https://conceptnet.s3.amazonaws.com/downloads/2019/numberbatch/numberbatch-19.08.txt.gz

LinguList commented 1 year ago

Did you see that, @justalingwist? They provide a shell script that downloads the data, so downloading is the first step in our workflow.

LinguList commented 1 year ago

Or you create a file called Makefile and type the following in it:

download:
        wget https://conceptnet.s3.amazonaws.com/downloads/2019/numberbatch/numberbatch-19.08.txt.gz

LinguList commented 1 year ago

This should also work on a Mac. Typing

make download

in the terminal would then download all the data.
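
Whether `make download` works on a given machine depends on the tool the recipe calls; wget, for instance, is not part of a default macOS installation. A minimal sketch of a check one could run first. The function name `pick_downloader` is made up for illustration and is not part of the project:

```shell
#!/bin/sh
# Hypothetical helper: print the first download tool found on PATH.
pick_downloader() {
    for tool in wget curl; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# Report the result without failing the script when neither tool exists.
if tool=$(pick_downloader); then
    echo "download tool available: $tool"
else
    echo "neither wget nor curl found; install one before running make download"
fi
```

If the check reports curl only, the Makefile recipe would need to call curl instead of wget.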

LinguList commented 1 year ago

Better is:

download:
        curl -o "raw/numberbatch-19.08.txt.gz" https://conceptnet.s3.amazonaws.com/downloads/2019/numberbatch/numberbatch-19.08.txt.gz

curl is usually already installed on a Mac. You must first create a folder named raw in the project folder, because curl -o does not create missing directories on its own.
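
The same step can be sketched as a small shell script that creates the folder before downloading. This is only an illustration under assumptions: the names `RAW_DIR`, `target_path`, and `download` are invented here, and the local file keeps the `.gz` name from the URL because the file behind that link is a gzip archive.

```shell
#!/bin/sh
URL="https://conceptnet.s3.amazonaws.com/downloads/2019/numberbatch/numberbatch-19.08.txt.gz"
RAW_DIR="raw"   # assumed folder name, taken from the comment above

# Build the local path from the folder and the last component of the URL.
target_path() {
    printf '%s/%s\n' "$RAW_DIR" "$(basename "$1")"
}

# Create the folder if needed, then fetch the file with curl.
# -f makes curl exit with an error on an HTTP failure, -L follows redirects.
download() {
    mkdir -p "$RAW_DIR"
    curl -fL -o "$(target_path "$URL")" "$URL"
}

# Nothing is downloaded unless the script is called with "download".
if [ "${1:-}" = "download" ]; then
    download
fi
```

Saved as, say, `download.sh`, it would be run with `sh download.sh download`. Because of `mkdir -p`, the raw folder does not have to be created by hand.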