JMdictProject / JMdictIssues

JMdict Japanese dictionary - lexicographic, etc. issues management

Allowing inclusion of common road signage / placards as idiomatic #92

Open briankrznarich opened 1 year ago

briankrznarich commented 1 year ago

A month ago I proposed an entry I encountered in the wild (well, in the wild on the internet): 熊出没注意 = Watch out for bears, Beware of bears (30,000 n-grams) http://www.edrdg.org/jmdictdb/cgi-bin/entr.py?svc=jmdict&sid=&e=2224480

It was rejected as "not the kind of thing we'd usually include", and "probably a useful example sentence", which I accepted at the time. But on further thought, this doesn't seem quite right to me.

jmdictdb includes a fair number of idiomatic expressions (currently 1138 [id] entries). Here's a nice example (from page #1): 五十歩百歩 ごじっぽひゃっぽ six of one, half a dozen of the other; scant difference

Idiomatic entries are of value not just because their meanings may not be immediately apparent, but also because they are formulaic. You don't translate an idiomatic expression piecemeal or ad hoc; you convert it to the appropriate idiomatic expression in the target language (if one is available). The fact that an idiom was used in the first place is something you try to preserve in translation. (That makes "six of one, half a dozen of the other" better than "scant difference", I think, so long as your audience will know the idiom.)

Signage is notoriously hard to translate correctly (we spent two weeks just on signs in a French->English translation course I took back in the day). Creativity and talent do not apply; you basically have to have encountered the sign in both languages to know the correct mapping. And for the Japanese/English pair, these mappings seem quite a challenge to come by anywhere online.

Why wouldn't jmdictdb, a tool for translating/glossing Japanese into English, be a good place to offer these pairings? ("Construction Ahead", "Slippery when wet", "Beware of Dog", ...).

One of the reasons "Engrish" shows up on Japanese signage is that there are limited resources available to help translate common expressions (which don't appear in Japanese dictionaries, and thus don't get the "free pass" into jmdict). So we get Japanese signs like this (the link contains a great photo):

THIS AREA IS INFESTED BY BEAR (北海道開発局) https://engrish.com/2006/08/big-infestation/

Here's a less-bad-but-still-not-great Japanese reference attempt: https://www.waeijisho.net/word.html?id=37015 表記:熊出没注意 bear warning

I mean sure, I guess...

It has become clear to me that jmdictdb is used as a source for a lot of J->E tools used by Japanese speakers, and it often provides answers that are not available in other references. If "熊出没注意" were in jmdictdb, for example, any search for "熊出没注意 英語" would return the entry as a top result.

Yes, looking at our [id] entries I can see that they are mostly of the "meaning-would-be-difficult-to-deduce" variety. Even so, allowing idiomatic signage phrases seems like "all upside" to me. I don't imagine a "gold rush" of new entries. I wouldn't be embarking on a personal campaign to identify every road sign I see. But if someone is willing to put in the effort to determine a mapping, why reject it?

That's all, thanks for your thoughts and consideration.

JMdictProject commented 1 year ago

We actually have quite a few of these as entries: 立入禁止, 関係者以外立入禁止, 入場お断り, 通行禁止, 帯出禁止, 面会謝絶, etc. most of which are warnings (as was 熊出没注意).

I agree with Brian that it might be useful to include more of these when they occur often enough to notice. Perhaps we could add a [sign] field tag to make their context clear.

Marcusjmdict commented 1 year ago

I've added a lot of entries myself and I agree that common ones are useful to have. I don't think this is really a policy issue but a case-by-case thing, and it would be better to just raise this as an edit on the deleted entry. (We used to "reject" entries in the past but changed to deleting them instead so that they can be re-opened if there are any objections.)

I'll respond there, in the database, instead. I propose we close this issue.

Marcus

briankrznarich commented 1 year ago

[EDIT] It's not so important anymore to read what I had written here; it was too long-winded. Essentially, I'd like some guidance on rules for inclusion of road signage, especially with regard to our usual "obviousness" criteria (which I think should be lowered in the case of idiomatic signage). I ended up with some examples for discussion in my next post, so if anyone thinks it's worth discussing, that would be somewhere to start.

briankrznarich commented 1 year ago

I just took a look at tatoeba for some very basic signage phrases… It seems to have, at present, pretty sparse coverage, and the coverage that is there is often incomplete or misleading. (This means that any entry added there must be viewed with the same suspicion as entries already present). Not that this is jmdict’s problem, it’s just something to keep in mind.

To give a few other concrete examples for discussion…

足元注意 “Watch your step” (You also hear this every time you get off the train in Tokyo, 足元にご注意ください). We have 足元, and we know why it’s used to the point that our glosses include: 足元: one's step (as in "watch your step")

Why not just explicitly add: 足元注意 (3700 ngrams, NOT IN tatoeba) and/or 足元にご注意ください (2600 ngrams, in tatoeba)?

I’ve been aware of this next missing entry for more than 5 years (saw it drawn on some pillars in an anime somewhere). 頭上注意: “Watch your head” (4071 ngrams, NOT IN tatoeba) (Here it's the English that's particularly idiomatic. The glosses are not quite as helpful here: overhead; above one's head; high in the sky)

As well-known as 頭上注意 should be by now, I still see funny translations of this in the subway. (I’m not under the illusion that adding this to jmdict will solve the problem, but you can only do what you can do…)

On the other hand, to maybe see Marcus’s angle, below are some uncharacteristically well-translated rules I found at the entrance of a large city park this morning (after the vet, walking the dog). These are maybe more “complete sentence” in nature, and there are probably many natural variations of these that might make them less practical in jmdict (and maybe a better fit for tatoeba). But for the sake of discussion:

スケートボード等はご遠慮ください = No Skateboarding
許可なく物品販売、広告宣伝はできません = No selling or advertising without permission
テントやタープを立てないでください = No Tents or Tarps
リードをつけましょう = Please keep your dog on a leash
セグウェイ等の電動の乗り物は使用できません = No Segways or Motorized Scooters
宴会はご遠慮ください = Parties Prohibited
ドローン・ラジコン等の使用はできません = No Drones or Remote Controlled Aircraft
火気の使用はできません = No Flames or Sparks
野鳥のエサを与えないでください = Do not feed wild birds
バイクの乗り入れ禁止 = No Motorcycle Riding

Ngrams:
バイクの乗り入れ 931
野鳥にエサ 518
許可なく物品販売 67
許可なく広告宣伝 72
物品販売 59024
広告宣伝 207135 (we have 広告宣伝車, but not 広告宣伝 “Advertising”?)
広告宣伝車 51
宴会はご遠慮 60
リードをつけましょう 156
火気の使用 9404
火気を使用しない 8268
火気厳禁 19719 ("no open flame" maybe a good candidate. No matches in tatoeba. No matches for "火気" in tatoeba.)