JMdictProject / JMdictIssues

JMdict Japanese dictionary - lexicographic, etc. issues management

Proposed new Field/Misc tags - August 2023 #98

Closed JMdictProject closed 7 months ago

JMdictProject commented 10 months ago

I'm setting up a consolidated list of proposed additional Field/Misc tags, drawing together suggestions from other issues (which I'll close).

For the names dictionary, a "test/examination" name type has been suggested in #96. Perhaps [exam] would do.

For JMdict itself, the following field tags have been suggested:

Feedback welcome.

robinjmdict commented 10 months ago

When I proposed an Internet tag, I was thinking mainly about terms concerning the World Wide Web (including web-related services and activities), e.g. ホームページ, スレッド (on a forum), レス, リンク切れ, アクセスカウンタ, フォロー, リツイート. I don't think "networking" would be appropriate for these.

I wouldn't have any objection to using [internet] for both WWW terms and network terms (like DNS, FTP). Most people nowadays understand the word "Internet" to include the world wide web. The vast majority of Internet-tagged terms on Wiktionary have nothing to do with the underlying network.

A few more field tags I'd like to see in JMdict:

JMdictProject commented 10 months ago

OK to all of those suggestions. I'll drop [netw] and put in [internet] instead, which will cover both the underlying Internet protocols (TCP, FTP, POP3, etc.) and Internet-based applications such as the WWW. Many existing [comp] entries should be migrated to this new field.

JMdictProject commented 10 months ago

Also proposing to add a [dial] (dialectal) misc tag to cover cases which can't be identified with a particular regional dialect.

JMdictProject commented 9 months ago

I think a [depr] (deprecated) misc tag would be handy to have along with [dated/sens/derog]. See the recent discussion on the 看護婦 and 浮浪者 entries.

stephenmk commented 9 months ago

My initial thought is that something like that would be better suited as a cross reference type than a miscellaneous tag. 看護婦 deprecated in favor of 看護師, 浮浪者 deprecated in favor of ホームレス, etc.

robinjmdict commented 9 months ago

I'm not sure about a tag. One of my ideas for JMdict:NG is notes with in-line cross references. It would give us a lot of flexibility, and this is the sort of thing they could be used for. e.g. [note="deprecated in favour of 看護師(#1928100)"]

By the way, is "deprecated" the right word here? I'm only familiar with it in software contexts.

yamagoya commented 9 months ago

Much of the power that the database provides is that the data in it is structured: a particular kind of data can always be found in a particular place in a particular format and that structure is enforced by the database, not by the volition of the person entering it.

Ad hoc notes provide too much flexibility. One person enters [note="deprecated in favour of 看護師(#1928100)"], another enters [note="deprecated, better is 看護師 #1923100"], yet a third enters [note="not recommended in favor of 看護師"]. Yes, editors may reformat those but editors make mistakes too... note the typo in the second example.

For that reason I think an xref as suggested by stephenmk would be more in keeping with the design philosophy of the database. An xref automatically checks that the referred-to entry exists, that the kanji or reading match, etc. and automatically maintains those even should the target entry be changed.

The possible future need for such an xref was actually anticipated quite a while ago. See the "pref" item in the xref types list at https://www.edrdg.org/jmwsgi/edhelp.py#kw_xref. (Of course the name/description can be adjusted as needed.)

A constraint on using an xref for this is that there would always need to be some preferred entry.
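The difference between a free-text note and a typed xref can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Entry` and `Xref` classes, the `validate_xref` helper and the in-memory `entries` dict are hypothetical and are not the actual JMdictDB schema or API. The number 1928100 for 看護師 is taken from the note examples above; 9000001 is a made-up placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Xref:
    kind: str          # hypothetical: "pref" as in the xref types list mentioned above
    target_seq: int    # entry number of the preferred entry
    target_kanji: str  # kanji form the editor expects at that entry


@dataclass
class Entry:
    seq: int
    kanji: str
    xrefs: list = field(default_factory=list)
    notes: list = field(default_factory=list)


def validate_xref(xref, entries):
    """Return a list of problems; an empty list means the xref resolves."""
    target = entries.get(xref.target_seq)
    if target is None:
        return [f"no entry with seq {xref.target_seq}"]
    if target.kanji != xref.target_kanji:
        return [f"seq {xref.target_seq} is {target.kanji}, not {xref.target_kanji}"]
    return []


entries = {
    1928100: Entry(1928100, "看護師"),
    9000001: Entry(9000001, "看護婦"),  # placeholder seq, for illustration only
}

good = Xref("pref", 1928100, "看護師")
typo = Xref("pref", 1923100, "看護師")  # same digit slip as the second note example

print(validate_xref(good, entries))  # []
print(validate_xref(typo, entries))  # reports that the target does not exist

# A free-text note with the same slip is accepted without any check.
entries[9000001].notes.append("deprecated, better is 看護師 #1923100")
```

In this sketch the mistyped target is caught at validation time, while the equivalent note is stored as-is. The sketch also shows the constraint mentioned above: an `Xref` cannot be built without some preferred target entry.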

stephenmk commented 9 months ago

Might not hurt to have a [rakugo] tag. Sankoku has one, although it's only used in a handful of entries.

JMdictProject commented 9 months ago

OK, I'll hold off on a [depr] tag. I'm not sure there's enough rakugo terminology to justify a field tag.

briankrznarich commented 8 months ago

[anth], [kigo]

How about anthropology [anth]? I can't say how commonly such a tag would be used, but it seems overlooked on our list of sciences. I'm sure there was one time I looked for it.

There are a lot of very specific [rare] terms that survive largely due to their use as 季語. Two examples that come to mind are: 厩出(し) releasing livestock to graze in spring [rare] and 肌寒 chill (esp. in autumn); chilliness [rare]

If we could check a box and search all the seasonal haiku terms, that would surely be a neat trick too.

[haiku] (easy to remember, if inaccurate), or perhaps [kigo] (perhaps too narrow): 季語 きご seasonal word (in haiku) (common term in Japanese)
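The "check a box and search" idea amounts to filtering entries on a misc tag. A rough sketch, assuming a hypothetical in-memory list of entries that each carry a set of misc tags; the `kigo` tag is only a proposal at this point and the data layout is illustrative, not the JMdictDB schema. The first two entries are the examples given above; 三角形 is an untagged common word included for contrast.

```python
# Hypothetical entries; "kigo" is the proposed tag, not an existing one.
entries = [
    {"kanji": "厩出し", "gloss": "releasing livestock to graze in spring", "misc": {"rare", "kigo"}},
    {"kanji": "肌寒", "gloss": "chill (esp. in autumn); chilliness", "misc": {"rare", "kigo"}},
    {"kanji": "三角形", "gloss": "triangle", "misc": set()},
]


def with_tags(entries, *tags):
    """Return the kanji of every entry that carries all of the given misc tags."""
    wanted = set(tags)
    return [e["kanji"] for e in entries if wanted <= e["misc"]]


print(with_tags(entries, "kigo"))
print(with_tags(entries, "rare", "kigo"))
```

A search like this only works if the tag is applied consistently, which is something a free-text gloss remark such as "(used in haiku)" cannot guarantee.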

JMdictProject commented 8 months ago

I don't think there are enough terms specific to anthropology to warrant a field tag. I see GG5 doesn't have one for it. The word "anthropology" appears in a number of glosses but it tends to be in entries such as "文化人類学/cultural anthropology" and "形質人類学/physical anthropology" which we wouldn't tag anyway.

It may be appropriate to have a misc. tag for the 季語 terms. I'd prefer something like [seas] or [season] to [kigo], but maybe the latter is OK. I'll try and push on with these new tags in the next few days.

briankrznarich commented 8 months ago

I certainly don't care what the tag is. I've just been thinking for a while it might be nice to have one. [season] would be fine with me, but on the other hand, we don't want to tag all seasonal terms as [season]; [kigo] would at least force the user to ask "what is this?!" (poetic seasonal terms)

Not trying to say we need an [anth] tag, but something to consider...

Looking back, I think it was when I was looking at the "consanguinity" terms that I may have searched for [anth].

For me, having [anth] or maybe [sociology], or perhaps some parent term, is a tool to segregate these as "scientific jargon" (in a way that [note] cannot do), hopefully pointing in the direction of a field-of-use. There might be a chicken & egg problem here. We can't find [anth] terms because there is no tag, we won't add a tag because there are no terms.

I just googled 人類学 + 語彙, and this came up immediately [文化人類学用語300: Glossary for cultural anthropologists ...] https://navymule9.sakura.ne.jp/0-diculanth.html

Probably most of these terms agree with your evaluation, but here are a few super-jargony ones (we don't have these):

medical pluralism | 医療的多元論
pluralistic medical behavior | 多元的医療行動
pluralistic medical system | 多元的医療体系

https://www.anthroencyclopedia.com/entry/medical-pluralism

So, how to tag/note/gloss if we add them?

Also listed, maybe worth an [anth] tag: animism → アニミズム, totemism → トーテミズム, manaism → マナイズム

This is actually common(ish), from sociology: symbolic interactionism 象徴的相互作用論

There are more in the link, these are just some examples.

Maybe some tag for "human sciences" encompassing anthropology and sociology, limited to terms that are either:

  1. not in common use
  2. have a sense that is more narrowly defined within the field than in colloquial use

I noticed we have archeology [archeol] (a subfield of [anth]?). It matches 19 terms, and if you look at the selection, it seems rather arbitrary, and doesn't seem to be accomplishing much of anything. I imagine this is what you are looking to avoid.

We could have a tag broadly defined as "human-sciences-jargon" encompassing anth, archeol, sociology, and anything related that might pop up.

Marcusjmdict commented 8 months ago

Kigo tags have been discussed in the past. The only thing I can find in my e-mail is a proposal from Scott in 2016, but I'm pretty sure it's been discussed somewhere else more recently as well, though I can't remember the arguments against including it too well. A general [kigo] tag would be meaningless either way - if it were to serve any meaningful purpose (to hobbyist haiku writers?) I think we would have to have one tag per season, as some of the kokugos do. Essentially all common flowers, plants and animals are kigo...


briankrznarich commented 8 months ago

I certainly wasn't imagining scattering [kigo] across 10,000 jmdict entries when I made the suggestion, so that is a fair point. (Though this is a giant compendium of Japanese vocab knowledge, so that might be interesting to take on, if out-of-scope at this exact moment)

Maybe this is shikataganai, but my rationale for [kigo] was for identifying otherwise-rare terms as being relevant for a reason, when browsing by kanji. Existence in 3 haikus from famous authors does not make the term "not rare". But it might make the term interesting vs. 50 other obsolete terms using the same kanji. If you have an interest in haiku, you might even go look for one that contains it. For this, a single [kigo] tag alone is enough to highlight the word. Ideally the gloss would clarify the seasonal reference (vs. nikk's 季+春・夏・秋・冬, which is still imprecise, as it does not specify where in the season).

My motivation goes back to a very old (for me) edit to 厩出, which is fairly [rare] in its own right, even in haiku. But I did eventually find it in some haiku references with clearer explanations than nikk was giving.

We ended up with: 厩出[rare] releasing livestock to graze in spring

My issue is that this particular word is arguably more relevant for its [kigo] use than its literal use. And I couldn't get "(used in haiku)" or "(seasonal word for the beginning of spring)" into the gloss. Despite some back and forth, the term still looks ambiguous to me (it's the release of livestock after being cooped up for the winter). Too verbose.

肌寒(はださむ) vs. 肌寒い is another such term. 肌寒 is not "just" an old variant of 肌寒い. It has a particular use, which is poetic (still otherwise [rare]). (I did get "esp. in autumn" added here)

For some examples of 厩出 and 肌寒 in haiku, see: http://www.haisi.com/saijiki/umayadasi.htm http://www.haisi.com/saijiki/hadasamu.htm

If someone looks up these terms while reading haiku, it would surely be helpful if we explained "you're reading a haiku? oh, then this is actually a sign of early spring/early autumn", and not just a literal dictionary definition. With the current state of things, we're deferring to needing a secondary, poetry-specific reference book.

Browsing just haisi.com, there might be a thousand terms there or more. So you are correct, this would probably be a larger undertaking than I was thinking. http://www.haisi.com/saijiki/index.htm

For the moment I would probably be happy if there were some more flexibility on glossing [rare] kigo, but then people might want to start applying the same ad-hoc glosses to flowers and trees and birds, so I can see the issue.

briankrznarich commented 8 months ago

Looked at slightly differently, we have [geom], but we don't tag 三角形 as [geom], because it's common. If we had a [poetry] tag that was only applied to otherwise [rare] terms, that seems somewhat analogous. Think of "厩出し" as "poetry jargon" of minimal practical use outside of that field.

Or should we tag triangles as [geom]? Don't know...

Just a thought.

briankrznarich commented 8 months ago

Maybe too late. I ran into "wet paint" again. We could standardize our [note="on a sign"] entries with something like [sign]/signage.

If you google "ペンキ塗り立て", you actually get plastic warning tape for surrounding wherever you painted, not strictly a sign.

JMdictProject commented 7 months ago

I've added 11 of the possible tags discussed here. I'll close this issue now and open a new one.