LoanDB / ronataswestoldturkic

CLDF dataset derived from 'West Old Turkic' by András Róna-Tas and Árpád Berta from 2011
https://www.harrassowitz-verlag.de/title_4002.ahtml
Creative Commons Attribution 4.0 International
Add Cuman to languages #13

Closed martino-vic closed 2 years ago

martino-vic commented 2 years ago

This throws the following error:

Traceback (most recent call last):
  File "/home/viktor/Documents/cldfvenv3.9/bin/cldfbench", line 8, in <module>
    sys.exit(main())
  File "/home/viktor/Documents/cldfvenv3.9/lib/python3.9/site-packages/cldfbench/main.py", line 84, in main
    return args.main(args) or 0
  File "/home/viktor/Documents/cldfvenv3.9/lib/python3.9/site-packages/pylexibank/commands/makecldf.py", line 24, in run
    with_dataset(args, 'makecldf', dataset=dataset)
  File "/home/viktor/Documents/cldfvenv3.9/lib/python3.9/site-packages/cldfbench/cli_util.py", line 153, in with_dataset
    res = func(*arg, args)
  File "/home/viktor/Documents/cldfvenv3.9/lib/python3.9/site-packages/pylexibank/dataset.py", line 218, in _cmd_makecldf
    super()._cmd_makecldf(args)
  File "/home/viktor/Documents/cldfvenv3.9/lib/python3.9/site-packages/cldfbench/dataset.py", line 206, in _cmd_makecldf
    self.cmd_makecldf(args)
  File "/home/viktor/Documents/GitHub/rtbwestoldturkic/lexibank_rtbwestoldturkic.py", line 154, in cmd_makecldf
    lex["ProsodicStructure"] = prosodic_string(lex["Segments"], _output='cv')
  File "/home/viktor/Documents/cldfvenv3.9/lib/python3.9/site-packages/lingpy/sequence/sound_classes.py", line 881, in prosodic_string
    [int(t) for t in tokens2class(string, rcParams['art'],
  File "/home/viktor/Documents/cldfvenv3.9/lib/python3.9/site-packages/lingpy/sequence/sound_classes.py", line 792, in tokens2class
    raise ValueError("[!] your sequence contains only unknown characters")
ValueError: [!] your sequence contains only unknown characters
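One possible way to find out which forms trigger this error is to try each row in isolation and collect the ones that raise ValueError. The sketch below is only an illustration: `find_unparsable` and `fake_prosodic` are hypothetical helpers that are not part of the lexibank script, and `fake_prosodic` is a stand-in for lingpy's `prosodic_string` so that the example runs without lingpy installed. The only thing taken from the traceback is that the real function raises ValueError and that the segments live under the "Segments" key.

```python
def find_unparsable(rows, prosodic_fn):
    """Return the rows whose segments make prosodic_fn raise ValueError."""
    failing = []
    for row in rows:
        try:
            prosodic_fn(row["Segments"])
        except ValueError:
            # Keep the whole row so the offending form can be inspected later.
            failing.append(row)
    return failing


def fake_prosodic(segments):
    """Hypothetical stand-in for lingpy's prosodic_string.

    It rejects empty sequences and sequences made up only of "?",
    which is an assumption made purely for this demonstration.
    """
    if not segments or all(s == "?" for s in segments):
        raise ValueError("[!] your sequence contains only unknown characters")
    return "CV"


rows = [
    {"ID": "1", "Segments": ["a", "t"]},
    {"ID": "2", "Segments": ["?"]},
]
bad_rows = find_unparsable(rows, fake_prosodic)
print([row["ID"] for row in bad_rows])
```

In the real script one would presumably pass the actual `prosodic_string` (with `_output='cv'` bound, for instance via `functools.partial`) instead of the stand-in.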

martino-vic commented 2 years ago

Actually, don't add Cuman to the languages. It's enough that it is covered in the raw file and ignored by the lexibank script: my own research needs only WOT, not Cuman, and there are only a handful of Cuman words, too few for any quantitative task anyway. Besides, the original dictionary contains many other languages that are currently ignored here as well, but I have already opened https://github.com/martino-vic/rtbwestoldturkic/issues/10 for that.
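Keeping a language in the raw file while ignoring it in the script could look roughly like the sketch below. The column name "Language", the labels "WOT" and "Cuman", the sample forms, and the helper `filter_rows` are all assumptions made for illustration; the actual raw file and lexibank script may be organised differently.

```python
# Hypothetical whitelist of language labels the script should process.
INCLUDED_LANGUAGES = {"WOT"}


def filter_rows(rows, included=INCLUDED_LANGUAGES, language_key="Language"):
    """Keep only the rows whose language label is in the whitelist.

    Rows for other languages stay in the raw data; they are simply
    skipped before any further processing.
    """
    return [row for row in rows if row.get(language_key) in included]


# Made-up rows, for demonstration only.
raw_rows = [
    {"Language": "WOT", "Form": "form1"},
    {"Language": "Cuman", "Form": "form2"},
    {"Language": "WOT", "Form": "form3"},
]
kept = filter_rows(raw_rows)
print(len(kept), "of", len(raw_rows), "rows kept")
```

A whitelist rather than a blacklist means that the other languages in the dictionary are skipped as well, without having to be listed one by one.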