lexibank / pylexibank

The python curation library for lexibank
Apache License 2.0

lexibank script not running after a year #270

Closed by martino-vic 1 year ago

martino-vic commented 1 year ago

In this repo the CLDF conversion worked fine a year ago. I made no changes to it and now the conversion fails. I assume it must have something to do with dependencies that have changed.

When I create a fresh venv, clone the repo, run $ pip install -e gerstnerhungarian and then $ bash hun.sh (which contains the line cldfbench lexibank.makecldf lexibank_gerstnerhungarian.py), I get this error message:

bash hun.sh
INFO    running _cmd_makecldf on gerstnerhungarian ...
Traceback (most recent call last):
  File "/home/viktor/Documents/GitHub/venv/bin/cldfbench", line 8, in <module>
    sys.exit(main())
  File "/home/viktor/Documents/GitHub/venv/lib/python3.8/site-packages/cldfbench/__main__.py", line 89, in main
    return args.main(args) or 0
  File "/home/viktor/Documents/GitHub/venv/lib/python3.8/site-packages/pylexibank/commands/makecldf.py", line 24, in run
    with_dataset(args, 'makecldf', dataset=dataset)
  File "/home/viktor/Documents/GitHub/venv/lib/python3.8/site-packages/cldfbench/cli_util.py", line 161, in with_dataset
    res = func(*arg, args)
  File "/home/viktor/Documents/GitHub/venv/lib/python3.8/site-packages/pylexibank/dataset.py", line 218, in _cmd_makecldf
    super()._cmd_makecldf(args)
  File "/home/viktor/Documents/GitHub/venv/lib/python3.8/site-packages/cldfbench/dataset.py", line 208, in _cmd_makecldf
    self.cmd_makecldf(args)
  File "/home/viktor/Documents/GitHub/gerstnerhungarian/./lexibank_gerstnerhungarian.py", line 60, in cmd_makecldf
    idx = "{0}-{1}".format(concept.number, slug(concept.gloss))
  File "/home/viktor/Documents/GitHub/venv/lib/python3.8/site-packages/clldutils/misc.py", line 178, in slug
    res = ''.join(c for c in unicodedata.normalize('NFD', s)
TypeError: normalize() argument 2 must be str, not None

Earlier I had problems with line 59 in lexibank_gerstnerhungarian.py, where self.conceptlists[0].concepts triggered an error.

Surprisingly, the tests are still passing.

LinguList commented 1 year ago

The error is easy to find, right?

LinguList commented 1 year ago

You know how to read a traceback? It shows the error is on your side, not on the lexibank side.

xrotwang commented 1 year ago

Surprisingly, the tests are still passing

The tests only check the validity of the generated CLDF - they don't run the whole workflow.

martino-vic commented 1 year ago

Surprisingly, the tests are still passing

The tests only check the validity of the generated CLDF - they don't run the whole workflow.

Oh, this explains a lot, thank you!

martino-vic commented 1 year ago

The error is easy to find, right?

Hmm, I haven't found it yet

martino-vic commented 1 year ago

You know how to read a traceback? It shows the error is on your side, not on the lexibank side.

Hmm, I'm just wondering why the error is on my side if nothing changed on my side.

xrotwang commented 1 year ago

You are relying on concept glosses in the concept list being non-empty: https://github.com/martino-vic/gerstnerhungarian/blob/3c982ec49b6cdca45d592f69e2a32966491687d3/lexibank_gerstnerhungarian.py#L60

If you run the CLDF creation against a different version of Concepticon, this may explain the problem.
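To illustrate the point, here is a minimal sketch of how line 60 could guard against an empty gloss instead of passing None on to slug. The names Concept, simple_slug and concept_id are hypothetical and only for illustration; simple_slug is a simplified stand-in for clldutils.misc.slug, not its actual implementation.

```python
import unicodedata
from collections import namedtuple

# Hypothetical stand-in for a concept list entry; only the two
# attributes used on line 60 of the dataset script are modelled.
Concept = namedtuple("Concept", ["number", "gloss"])


def simple_slug(s):
    """Simplified stand-in for clldutils.misc.slug (assumption):
    decompose accents, keep ASCII letters and digits, lowercase."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if c.isascii() and c.isalnum()).lower()


def concept_id(concept):
    """Build "<number>-<slug of gloss>", failing early with a clear
    message when the gloss is missing or empty."""
    if not concept.gloss:
        raise ValueError(
            "concept {0} has no gloss; check which Concepticon version "
            "is being used".format(concept.number)
        )
    return "{0}-{1}".format(concept.number, simple_slug(concept.gloss))
```

With a check like this, a missing gloss surfaces as a message naming the concept rather than as a TypeError raised inside unicodedata.normalize.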

martino-vic commented 1 year ago

You are relying on concept glosses in the concept list being non-empty: https://github.com/martino-vic/gerstnerhungarian/blob/3c982ec49b6cdca45d592f69e2a32966491687d3/lexibank_gerstnerhungarian.py#L60

If you run the CLDF creation against a different version of Concepticon, this may explain the problem.

Wonderful. This fixed my issue immediately, thank you so much! Only had to add --concepticon-version=v2.5.0 --glottolog-version=v4.5 --clts-version=v2.2.0 to my shell script. Will add these flags by default from now on.
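For reference, a small sketch of how the pinned command line could be assembled in one place. The three flags and version numbers are the ones quoted above; the helper name build_makecldf_command is hypothetical, and appending the flags after the dataset script is an assumption about where they go in hun.sh.

```python
# Catalog versions quoted in the comment above.
PINNED_VERSIONS = {
    "concepticon": "v2.5.0",
    "glottolog": "v4.5",
    "clts": "v2.2.0",
}


def build_makecldf_command(dataset_script, versions=PINNED_VERSIONS):
    """Return the argument list for the makecldf run with every
    catalog pinned to an explicit version (hypothetical helper)."""
    cmd = ["cldfbench", "lexibank.makecldf", dataset_script]
    for catalog, version in versions.items():
        # Produces e.g. --concepticon-version=v2.5.0
        cmd.append("--{0}-version={1}".format(catalog, version))
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it, so it can be inspected first.
    print(" ".join(build_makecldf_command("lexibank_gerstnerhungarian.py")))
```

Keeping the versions in a single mapping makes it harder for a later catalog release to change the conversion result unnoticed.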

martino-vic commented 1 year ago

Surprisingly, the tests are still passing

The tests only check the validity of the generated CLDF - they don't run the whole workflow.

Just wrote a config file that runs the whole workflow on CircleCI, to avoid this kind of error in the future.