LinguList closed this issue 3 years ago
Thank you Mattis.
Downloading glottolog as I write. Created a different env and directory just for this… and similar work.
Already had some of these downloaded and installed, but better to start from scratch here for now I think.
I noticed in the South American WOLD files that several languages had IDS IDs that did not map to a concepticon_ID. Spanish ‘calma’, for example, does not map to a concepticon_ID, and some languages have many more such cases.
It is wonderful to have all these tools.
It is a steep learning curve to have all this on hand and understand how to use it effectively. But I see from what you did on the Saphon data how effective one can be with access to and mastery of these tools.
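The observation above could be checked with a quick per-language tally. The following is only a sketch: the row layout, the language name "LanguageA", the IDS IDs, and the Concepticon placeholder "C1" are all invented for illustration and are not taken from the WOLD or IDS files.

```python
from collections import Counter

# Hypothetical rows: (language, IDS ID, form, Concepticon ID or None).
# All values are placeholders, not real WOLD/IDS/Concepticon data.
ROWS = [
    ("Spanish", "ids-001", "calma", None),
    ("Spanish", "ids-002", "form-a", "C1"),
    ("LanguageA", "ids-001", "form-b", None),
    ("LanguageA", "ids-003", "form-c", None),
    ("LanguageA", "ids-002", "form-d", "C1"),
]


def unmapped_by_language(rows):
    """Count, per language, the rows whose Concepticon ID is missing."""
    counts = Counter()
    for language, _ids_id, _form, concepticon_id in rows:
        if not concepticon_id:
            counts[language] += 1
    return counts


if __name__ == "__main__":
    # Languages with the most unmapped entries are listed first.
    for language, n in unmapped_by_language(ROWS).most_common():
        print(f"{language}: {n} unmapped")
```

With real data, the rows would come from whatever table holds the IDS ID to Concepticon mapping; the column order here is an assumption.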
OK here are install results
Received this warning during installation of keypano, both for raw/ids/ and for ./:
WARNING: Value for scheme.headers does not match. Please report this to https://github.com/pypa/pip/issues/9617
distutils: /Users/johnmiller/opt/miniforge3/envs/ling/include/python3.9/UNKNOWN
sysconfig: /Users/johnmiller/opt/miniforge3/envs/ling/include/python3.9
WARNING: Additional context:
user = False
home = None
root = None
prefix = None
OK
Ooooops. Error in execution of the final command, cldfbench lexibank.makecldf. I changed the reference from ../concepticon to ../concepticon-data and the command began processing, but then it errored out with this trace:
(ling) johnmiller@Johns-M1-Fractal-Dragon keypano % cldfbench lexibank.makecldf --concepticon=../concepticon-data --clts=../clts --glottolog=../glottolog --concepticon-version=v2.4.0 --glottolog-version=v4.3 --clts-version=v2.1.0 lexibank_keypano.py
INFO running _cmd_makecldf on keypano ...
INFO added sources
INFO added languages
Traceback (most recent call last):
File "/Users/johnmiller/opt/miniforge3/envs/ling/bin/cldfbench", line 8, in
OK, maybe it doesn’t like that the env defaulted to Python 3.9 when I created it (which runs natively on my M1).
Maybe tomorrow (Sunday) I’ll create an env with an earlier Python 3 to see what transpires.
I’ll keep you posted. OK, now a Zoom meeting with Roberto on our FST morphology paper! And then a movie on Netflix.
Have a good Sunday, Mattis!
John Miller @.***
On Apr 24, 2021, at 2:09 PM, Johann-Mattis List @.***> wrote:
I made a preliminary orthography profile for the 20-odd languages. @fractaldragonflies, to run this code, please do:
$ git clone https://github.com/intercontinental-dictionary-series/keypano.git
$ git clone https://github.com/concepticon/concepticon-data.git
$ git clone https://github.com/glottolog/glottolog.git
$ git clone https://github.com/cldf-clts/clts.git
$ cd keypano
$ git submodule init
$ git submodule update
$ pip install -e raw/ids/
$ pip install -e ./
$ cldfbench download lexibank_keypano.py
$ cldfbench lexibank.makecldf --concepticon=../concepticon --clts=../clts --glottolog=../glottolog --concepticon-version=v2.4.0 --glottolog-version=v4.3 --clts-version=v2.1.0 lexibank_keypano.py

In this way, you can check progress on the orthography profile in etc/orthography.tsv, and the special-language-profiles in etc/orthography/Spanish.tsv.
The latter is currently being downloaded, using the script in raw/getphonetics.py. This can also be tweaked to account for Portuguese. Download is item-by-item and slow. But we only need the phonetics once.
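The clone steps create a directory named concepticon-data, while the makecldf call refers to ../concepticon, so it can help to confirm which sibling directories actually exist before running the last command. A minimal sketch, assuming the clones sit next to the keypano checkout as in the commands above; the helper name missing_clones is hypothetical and not part of cldfbench.

```python
from pathlib import Path

# Directory names produced by the git clone commands above (default clone names).
EXPECTED_CLONES = ("concepticon-data", "glottolog", "clts")


def missing_clones(parent, names=EXPECTED_CLONES):
    """Return the expected clone directory names that do not exist under parent."""
    parent = Path(parent)
    return [name for name in names if not (parent / name).is_dir()]


if __name__ == "__main__":
    # Assumption: the script is run from inside the keypano checkout,
    # so the other clones live one level up.
    missing = missing_clones(Path.cwd().parent)
    if missing:
        print("Missing sibling clones:", ", ".join(missing))
    else:
        print("All expected sibling clones are present.")
```

The paths passed to --concepticon, --clts and --glottolog can then be matched against the directories that are reported as present.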
The problem is not Python 3.9; I also use 3.9.2. The problem was that I added the specific Spanish profile later, and it is not finished yet. Now that I have fixed this temporarily, you can start ;)
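For readers new to orthography profiles, the general idea can be sketched as a grapheme-to-sound table applied by greedy longest match, where a grapheme missing from the table cannot be converted. This is an illustration of the concept only, not the code that cldfbench or lexibank_keypano.py runs; the toy mappings and the MISSING marker are assumptions made for the example and are not taken from etc/orthography.tsv or etc/orthography/Spanish.tsv.

```python
# Toy profile: grapheme -> sound. Invented for illustration only.
TOY_PROFILE = {"ch": "tʃ", "c": "k", "a": "a", "l": "l", "m": "m"}

# Hypothetical marker for graphemes that the profile does not cover.
MISSING = "\ufffd"


def segment(form, profile):
    """Convert form to a list of sounds by greedy longest-match lookup."""
    longest = max(len(grapheme) for grapheme in profile)
    out, i = [], 0
    while i < len(form):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(longest, len(form) - i), 0, -1):
            chunk = form[i:i + size]
            if chunk in profile:
                out.append(profile[chunk])
                i += size
                break
        else:
            # No grapheme in the profile matches at this position.
            out.append(MISSING)
            i += 1
    return out


if __name__ == "__main__":
    for word in ("calma", "chama", "calmo"):
        print(word, "->", " ".join(segment(word, TOY_PROFILE)))
```

In this toy setup "calmo" ends in a grapheme that the profile lacks, which is the kind of gap an unfinished profile leaves behind.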
Success…
I suppose the reported segment errors (Segments: 167 (72 BIPA errors, 72 CLTS sound class errors, 95 CLTS modified)) are expected.
(ling) johnmiller@Johns-M1-Fractal-Dragon keypano % cldfbench lexibank.makecldf --concepticon=../concepticon-data --clts=../clts --glottolog=../glottolog --concepticon-version=v2.4.0 --glottolog-version=v4.3 --clts-version=v2.1.0 lexibank_keypano.py
INFO running _cmd_makecldf on keypano ...
INFO added sources
INFO added languages
INFO file written: cldf/.transcription-report.json
INFO Summary for dataset cldf/cldf-metadata.json
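The segment summary quoted in this reply can be turned into numbers to track as the orthography profile improves. A small sketch, assuming the summary keeps the wording quoted above; the pattern is written for that one string and is not a documented output format of cldfbench.

```python
import re

# The summary line as quoted in the message above.
SUMMARY = "Segments: 167 (72 BIPA errors, 72 CLTS sound class errors, 95 CLTS modified)"

PATTERN = re.compile(
    r"Segments:\s*(?P<total>\d+)\s*\("
    r"(?P<bipa>\d+) BIPA errors,\s*"
    r"(?P<sound_class>\d+) \w+ sound class errors,\s*"
    r"(?P<modified>\d+) CLTS modified\)"
)


def parse_summary(text):
    """Extract the four counts from a segment summary as integers."""
    match = PATTERN.search(text)
    if match is None:
        raise ValueError("no segment summary found")
    return {key: int(value) for key, value in match.groupdict().items()}


if __name__ == "__main__":
    counts = parse_summary(SUMMARY)
    share = counts["bipa"] / counts["total"]
    print(counts)
    print(f"Share of segments with BIPA errors: {share:.1%}")
```

Re-running this after each profile change gives a rough progress measure; a falling BIPA error share would suggest that more graphemes are being covered.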
John Miller @.***
On Apr 25, 2021, at 8:25 AM, Johann-Mattis List @.***> wrote:
The problem is not python 3.9, I also use 3.9.2. The problem was I added the specific Spanish profile later, which is not finished yet. Now that I fixed this temporarily, you can start ;)
Yes, for sure, we probably need to ignore Spanish and Portuguese at first and introduce them later. My plan is then to put the dataset on EDICTOR, where we can annotate borrowings manually, and cognates as well, to have a better test set.