tatuylonen / wiktextract

Wiktionary dump file parser and multilingual data extractor

Add `info_templates` field, handle `Template:+obj` #687

Closed kristian-clausal closed 1 week ago

kristian-clausal commented 1 week ago

`info_templates` is similar to `etym_templates` and `head_templates`, but can also have an `extra_data` field with more specific data relating to the template in question.

`+obj` is a vague template that generates text like `[+accusative]`, which often means 'used with accusative objects', or `[+dative = means this thing]`, which means 'with a dative object, it means this thing', though not always. For this template, the extra data consists of a `meaning` field taken from the `means=` parameter, plus `tags` and `words` fields parsed from the rest of the output, i.e. everything that isn't the `means=` text.
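To make the shape of that data concrete, here is a rough sketch of how such an entry could be assembled. It is not the code from this PR: the function name `build_info_template`, the `KNOWN_TAGS` set, the argument layout, and the `name`/`args`/`expansion` keys are all assumptions for illustration. Only `extra_data`, `meaning`, `tags` and `words` come from the description above.

```python
import re

# Hypothetical tag vocabulary; the real extractor's list is not given in this thread.
KNOWN_TAGS = {"accusative", "dative", "genitive"}


def build_info_template(name, args, expansion):
    """Return an info_templates-style dict for one template call (illustrative only)."""
    # The name/args/expansion keys are assumed, not taken from the PR.
    entry = {"name": name, "args": args, "expansion": expansion}
    extra = {}

    # The meaning comes straight from the means= parameter, if present.
    meaning = args.get("means")
    if meaning:
        extra["meaning"] = meaning

    # Strip the brackets and leading '+', and drop everything after '='
    # so that the means= text is not parsed a second time.
    body = expansion.strip("[]").lstrip("+").split("=", 1)[0]

    # Tokens in the tag vocabulary become tags; anything else is kept as a word.
    tags, words = [], []
    for token in re.findall(r"\w+", body):
        (tags if token in KNOWN_TAGS else words).append(token)
    if tags:
        extra["tags"] = tags
    if words:
        extra["words"] = words

    if extra:
        entry["extra_data"] = extra
    return entry


example = build_info_template(
    "+obj",
    {"1": "de", "2": "dat", "means": "this thing"},  # hypothetical arguments
    "[+dative = this thing]",
)
```

Under these assumptions, a bare `[+accusative]` expansion would yield only a `tags` list, while an expansion containing a preposition or other non-tag token would also get a `words` list.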

xxyzz commented 1 week ago

I squashed the commits and applied a few minor changes suggested by ruff.

kristian-clausal commented 1 week ago

I tried to figure out some way to improve this, but I wasn't getting anywhere, so yeah, this is a good point to merge. I thought I had already removed the I001 ignore from pyproject.toml, but it seems I hadn't; I may have forgotten to git add it when merging stuff. Ugh... Thanks for catching it.