tatuylonen / wiktextract

Wiktionary dump file parser and multilingual data extractor

Canonical form for Czech 'pro' contains grammar note #667

Closed StefanVukovic99 closed 6 days ago

StefanVukovic99 commented 3 weeks ago

https://en.wiktionary.org/wiki/pro#Czech (three screenshots)

(downstream: https://github.com/themoeway/kaikki-to-yomitan/issues/60)

kristian-clausal commented 3 weeks ago

Oof, this is from the template Template:+obj, which is so vague that it's used in all sorts of places. Usually, I'd just leave the output alone in places like glosses, but in forms, yeah... It's going to be mildly annoying, because there are already so many other kludges in that part of the code.

kristian-clausal commented 3 weeks ago

I've a PR draft up handling this specific thing. Tatu green-lighted adding a new field to the en output, which I'm tentatively calling "coincidence" (because it's about coincident things and terms, because the template is so vague).

The name was a coincidence. I looked for synonyms, saw "coincident", thought that was perfect, and didn't realize until later that "coincidence" will be read as "coincidence" (a chance event), not as "coincidence" (things that coincide), most of the time.

kristian-clausal commented 3 weeks ago

We're not going to do a kludge for just this template; instead we'll make something a bit more generalizable, similar to etym_templates.
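
For illustration only, here is a rough sketch of what separating the note from the form could look like. It is not the wiktextract implementation. It assumes the note produced by Template:+obj shows up as a trailing bracketed string on the form (the sample string "pro [with accusative]" and the template arguments are assumptions), and it stores the note in a record with name/args/expansion keys, on the assumption that etym_templates entries have roughly that shape. The function names `split_form_note` and `template_record` are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical pattern: one trailing bracketed note, e.g. "pro [with accusative]".
_TRAILING_NOTE = re.compile(r"\s*\[([^\[\]]+)\]\s*$")


def split_form_note(form: str) -> Tuple[str, Optional[str]]:
    """Split a trailing bracketed note off a form string.

    Returns (clean_form, note); note is None when no trailing note is found.
    """
    match = _TRAILING_NOTE.search(form)
    if match is None:
        return form.strip(), None
    return form[: match.start()].strip(), match.group(1).strip()


def template_record(name: str, args: Dict[str, str], expansion: str) -> dict:
    """Build a record with name/args/expansion keys (assumed shape)."""
    return {"name": name, "args": args, "expansion": expansion}


if __name__ == "__main__":
    # Sample input and template arguments are illustrative assumptions.
    form, note = split_form_note("pro [with accusative]")
    record = template_record("+obj", {"1": "cs", "2": "acc"}, f"[{note}]")
    print(form)
    print(record)
```

Keeping the raw template record next to the cleaned form lets downstream consumers decide how to present the note, instead of having to parse it back out of the canonical form.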

xxyzz commented 1 week ago

Fixed in #687