grambank / pygrambank

Apache License 2.0

check that glottocode is valid #12

Closed SimonGreenhill closed 4 years ago

SimonGreenhill commented 4 years ago

Just trying to build the CLDF dataset and I get this traceback:

Traceback (most recent call last):
  File "/Users/simon/projects/grambank/env/bin/grambank", line 11, in <module>
    load_entry_point('pygrambank', 'console_scripts', 'grambank')()
  File "/Users/simon/projects/grambank/pygrambank/src/pygrambank/__main__.py", line 139, in main
    return args.main(args) or 0
  File "/Users/simon/projects/grambank/pygrambank/src/pygrambank/commands/cldf.py", line 35, in run
    create(args.repos, glottolog.dir, args.wiki_repos, args.cldf_repos)
  File "/Users/simon/projects/grambank/pygrambank/src/pygrambank/cldf.py", line 182, in create
    lang = glottolog.languoids_by_glottocode[sheet.glottocode]
KeyError: 'seco1242'

Looks like seco1242 isn't a real glottocode. It would be nice if this were caught earlier.
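A minimal sketch of what such an earlier check might look like, so that a missing glottocode is reported by name instead of surfacing as a bare KeyError. It assumes the known glottocodes are already available as a plain dict or set; `check_glottocodes` and the sample codes other than seco1242 are hypothetical and not part of pygrambank or pyglottolog. The pattern reflects the usual glottocode shape (four lowercase alphanumerics followed by four digits).

```python
import re

# Glottocodes consist of four lowercase alphanumerics followed by four digits.
GLOTTOCODE_PATTERN = re.compile(r"[a-z0-9]{4}[0-9]{4}")


def check_glottocodes(sheet_glottocodes, known_glottocodes):
    """Return (glottocode, problem) pairs; the list is empty if all codes are fine.

    `known_glottocodes` is any container supporting `in`, e.g. a dict keyed
    by glottocode (an assumption for this sketch).
    """
    problems = []
    for code in sheet_glottocodes:
        if not GLOTTOCODE_PATTERN.fullmatch(code):
            problems.append((code, "malformed"))
        elif code not in known_glottocodes:
            problems.append((code, "not in this Glottolog version"))
    return problems


# Hypothetical data: two made-up known codes, plus the code from the traceback.
known = {"abcd1234": "Language A", "efgh5678": "Language B"}
problems = check_glottocodes(["abcd1234", "seco1242", "bad"], known)
for code, problem in problems:
    print(f"{code}: {problem}")
```

Collecting all problems before reporting, rather than failing on the first one, would let a single run list every sheet that needs attention.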

SimonGreenhill commented 4 years ago

Note: seco1242 is a newly minted glottocode that's not been released yet

xrotwang commented 4 years ago

If you don't pass --glottolog-version and check out HEAD of glottolog/glottolog, things should work.

xrotwang commented 4 years ago

@SimonGreenhill what do you mean by "earlier"? Upon adding the sheet to the repository - or just earlier in the grambank cldf command?

SimonGreenhill commented 4 years ago

Sorry, I meant during 'grambank check', since this is where the lists get incorporated?

xrotwang commented 4 years ago

Could do this - but it would slow grambank check down considerably.

SimonGreenhill commented 4 years ago

Yeah, I guess we run grambank check multiple times when adding a language (or at least I do - run, fix error, repeat until the list has no errors), so the cost here would be very annoying. I'll close this, but have opened an issue in pyglottolog: https://github.com/glottolog/pyglottolog/issues/28
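One way to keep repeated runs cheap could be to cache the set of known glottocodes on disk, so that only the first run pays for reading Glottolog. The sketch below assumes the cost is dominated by that read; `cached_glottocodes`, the cache file name, and `slow_build` are hypothetical stand-ins, not existing pygrambank or pyglottolog functions.

```python
import json
import tempfile
from pathlib import Path


def cached_glottocodes(cache_path, build):
    """Return the set of known glottocodes, calling `build` only on a cache miss.

    `build` is any zero-argument callable returning an iterable of glottocodes;
    it stands in for the slow Glottolog read (an assumption for this sketch).
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        return set(json.loads(cache_path.read_text(encoding="utf8")))
    codes = set(build())
    cache_path.write_text(json.dumps(sorted(codes)), encoding="utf8")
    return codes


calls = []


def slow_build():
    # Placeholder for the expensive lookup; records how often it runs.
    calls.append(1)
    return ["abcd1234", "efgh5678"]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "glottocodes.json"
    first = cached_glottocodes(path, slow_build)   # builds and writes the cache
    second = cached_glottocodes(path, slow_build)  # served from the cache file
```

A cache like this would go stale whenever the Glottolog checkout changes, so it would need to be keyed by version or deleted after an update. Otherwise newly minted codes such as seco1242 would still be reported as unknown.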

xrotwang commented 4 years ago

Better leave it open, since even when glottolog/pyglottolog#28 is fixed, this requires some action here.

SimonGreenhill commented 4 years ago

Can this be closed?

xrotwang commented 4 years ago

Think so.

HedvigS commented 4 years ago

Indirectly, the sourcelookup already checks this in a way: if the glottocode is invalid, it should either break in some way or report that all references are wrong, I think.