Closed by kristian-clausal 1 year ago
Looking at our kaikki regen script, wiktwords is run with --categories-file so the process will crash and kaikki will not regenerate.
https://github.com/xxyzz/wiktextract/commit/e7602502a5bab97412bef7f2dfffef5ee2e21d94 fixes this error. I'm also fixing the --modules-file and --templates-file options. I'll create a pull request later.
Thanks, I was worried that it would be a bigger job, with checking for db connections in add_page or similar.
wiktwords --all-languages --all --db-path wikt-db --pages-dir pages --categories-file categories-test.json dumps/enwiktionary-20230420-pages-articles.xml.bz2
Testing out creating a database file and pages directory, resulting in:
Looking at what wiktwords is actually doing there, the culprit is the --categories-file parameter, which was left over when I ctrl-R'ed for this command in my history. ctx.add_page() still needs to be checked to see whether it is being called on a closed database, as it is here. The full run on the kaikki regen machine seems to be running fine, though, so hopefully kaikki will regenerate well.
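For illustration, here is a rough sketch of the kind of guard that could catch an add_page() call on a closed database. This is not the wikitextprocessor code: `PageStore`, its `closed` flag, and the `pages` table are hypothetical names made up for the example, and only the standard-library `sqlite3` module is assumed. Without a guard like this, `sqlite3` itself raises `sqlite3.ProgrammingError` when a closed connection is used.

```python
import sqlite3


class PageStore:
    """Hypothetical stand-in for a context object that owns a SQLite connection."""

    def __init__(self, db_path=":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS pages (title TEXT PRIMARY KEY, body TEXT)"
        )
        # sqlite3 connections have no public "is closed" attribute,
        # so this sketch tracks the state itself.
        self.closed = False

    def close(self):
        """Commit pending writes and close the connection exactly once."""
        if not self.closed:
            self.conn.commit()
            self.conn.close()
            self.closed = True

    def add_page(self, title, body):
        """Insert or overwrite a page, failing early if the database is closed."""
        if self.closed:
            raise RuntimeError(
                f"add_page({title!r}) called after the database was closed"
            )
        self.conn.execute(
            "INSERT OR REPLACE INTO pages (title, body) VALUES (?, ?)",
            (title, body),
        )
```

The explicit check mostly buys a clearer error message that names the page being added, which may make it easier to see which code path is still writing after the connection has been closed.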