lexibank / robbeetstriangulation

CLDF dataset derived from Robbeets et al.'s "Triangulation of the Transeurasian Languages" from 2021
Creative Commons Attribution 4.0 International

Key error lexibank.makecldf (concepticon) #18

Open DavidSnee opened 2 days ago

DavidSnee commented 2 days ago

Hi Mattis,

As you mentioned might happen, the code raises a KeyError, which indicates that the list by Robbeets et al. diverges from the Oskolskaya version. Below is the traceback I get when I run the lexibank.makecldf command. Checking the "Oskolskaya 2021 254" Concepticon list manually, I can see that the concept for "breast" is written differently in Concepticon, so the code seems to run fine up to that point. The paper mentions specifically that they wrote this concept differently because they wanted to represent "chest" and "breast" as the same concept, so there might not be many cases of divergence.
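
A minimal sketch of how the diverging glosses could be listed in one pass, instead of stopping at the first KeyError raised by the `concepts[concept]` lookup. The function name `find_unmapped`, the sample glosses and the IDs are hypothetical stand-ins, not taken from the dataset script or from Concepticon.

```python
def find_unmapped(glosses, concepts):
    """Return the glosses that have no key in `concepts`, in first-seen order."""
    missing = []
    for gloss in glosses:
        if gloss not in concepts and gloss not in missing:
            missing.append(gloss)
    return missing


# Hypothetical data: in the real script the mapping is built in cmd_makecldf
# from the Concepticon list; these glosses and IDs are invented for illustration.
concepts = {"fire (n.)": "1_firen", "chest (n.)": "2_chestn"}
glosses = ["fire (n.)", "breast (n.)", "breast (n.)"]

for gloss in find_unmapped(glosses, concepts):
    print("no entry in the concept mapping for:", gloss)
```

Running a check like this before the lookup would show every gloss that needs attention at once, which should make it easy to confirm whether "breast" really is the only case.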

(.venv) PS C:\Users\david\PycharmProjects\robbeetstriangulation\robbeetstriangulation> cldfbench lexibank.makecldf C:\Users\david\PycharmProjects\robbeetstriangulation\robbeetstriangulation\lexibank_robbeetstriangulation.py --glottolog C:\Users\david\PycharmProjects\CLDF_DATASET_EXAMPLE_3\ExtractingWordlistsFromDictionaries\vonprincedaakaka\glottolog
INFO running _cmd_makecldf on robbeetstriangulation ...
C:\Users\david\PycharmProjects\robbeetstriangulation.venv\Lib\site-packages\csvw\dsv.py:284: UserWarning: Duplicate column names!
  warnings.warn('Duplicate column names!')
Traceback (most recent call last):
  File "", line 198, in _run_module_as_main
  File "", line 88, in _run_code
  File "C:\Users\david\PycharmProjects\robbeetstriangulation.venv\Scripts\cldfbench.exe__main.py", line 7, in
  File "C:\Users\david\PycharmProjects\robbeetstriangulation.venv\Lib\site-packages\cldfbench\main__.py", line 89, in main
    return args.main(args) or 0
           ^^^^^^^^^^^^^^^
  File "C:\Users\david\PycharmProjects\robbeetstriangulation.venv\Lib\site-packages\pylexibank\commands\makecldf.py", line 24, in run
    with_dataset(args, 'makecldf', dataset=dataset)
  File "C:\Users\david\PycharmProjects\robbeetstriangulation.venv\Lib\site-packages\cldfbench\cli_util.py", line 161, in with_dataset
    res = func(*arg, args)
          ^^^^^^^^^^^^^^^^
  File "C:\Users\david\PycharmProjects\robbeetstriangulation.venv\Lib\site-packages\pylexibank\dataset.py", line 221, in _cmd_makecldf
    super()._cmd_makecldf(args)
  File "C:\Users\david\PycharmProjects\robbeetstriangulation.venv\Lib\site-packages\cldfbench\dataset.py", line 210, in _cmd_makecldf
    self.cmd_makecldf(args)
  File "C:\Users\david\PycharmProjects\robbeetstriangulation\robbeetstriangulation\lexibank_robbeetstriangulation.py", line 93, in cmd_makecldf
    Parameter_ID=concepts[concept],
KeyError: 'breast (n.)'

LinguList commented 1 day ago

There you see how one can see into the future :-)

Let us check Oskolskaya 2021 and see what they write for fire.

LinguList commented 1 day ago

Ah, they write fire (n.), the problem is somewhere else. The code I gave you targets the field Number, I think.
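
To make the difference concrete, here is a rough sketch of a lookup keyed on the gloss versus one keyed on the Number field. It assumes the source data carries the same numbers as the concept list; the rows, the field names `Number` and `Gloss`, the helper `build_lookup` and the ID format are all invented for illustration and are not the code from the dataset script.

```python
# Hypothetical rows standing in for a concept list; all values are invented.
concept_rows = [
    {"Number": "1", "Gloss": "fire (n.)"},
    {"Number": "2", "Gloss": "chest (n.)"},
]


def build_lookup(rows, key):
    """Map the chosen field of each row to an ID of the form '<Number>_<slug>'."""
    lookup = {}
    for row in rows:
        # Crude slug: keep only letters and digits of the lower-cased gloss.
        slug = "".join(ch for ch in row["Gloss"].lower() if ch.isalnum())
        lookup[row[key]] = f'{row["Number"]}_{slug}'
    return lookup


by_gloss = build_lookup(concept_rows, "Gloss")
by_number = build_lookup(concept_rows, "Number")

# A gloss spelled differently in the source has no key in the gloss lookup ...
print(by_gloss.get("breast (n.)"))
# ... while a lookup by number does not depend on the spelling of the gloss.
print(by_number.get("2"))
```

A lookup keyed on the gloss only works if both sides spell every gloss identically, so which field the mapping targets decides whether a rewritten concept label breaks the conversion.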

LinguList commented 1 day ago

No, it does NOT. So can I ask you to point me to the branch where your code is?

DavidSnee commented 1 day ago

It seems to have run without any issues after using the code from your previous PR. I will add the branch as a PR. From what I can tell, all concepts now have a Concepticon ID. Thank you.

(.venv) PS C:\Users\david\PycharmProjects\robbeetstriangulation\robbeetstriangulation> cldfbench lexibank.makecldf C:\Users\david\PycharmProjects\robbeetstriangulation\robbeetstriangulation\lexibank_robbeetstriangulation.py --glottolog C:\Users\david\PycharmProjects\CLDF_DATASET_EXAMPLE_3\ExtractingWordlistsFromDictionaries\vonprincedaakaka\glottolog
INFO running _cmd_makecldf on robbeetstriangulation ...
C:\Users\david\PycharmProjects\robbeetstriangulation.venv\Lib\site-packages\csvw\dsv.py:284: UserWarning: Duplicate column names!
  warnings.warn('Duplicate column names!')
INFO file written: C:/Users/david/PycharmProjects/robbeetstriangulation/robbeetstriangulation/cldf/.transcription-report.json
INFO Summary for dataset C:\Users\david\PycharmProjects\robbeetstriangulation\robbeetstriangulation\cldf\cldf-metadata.json