doctrine / inflector

Doctrine Inflector is a small library that can perform string manipulations with regard to uppercase/lowercase and singular/plural forms of words.
https://www.doctrine-project.org/projects/inflector.html
MIT License

Add support for German language #99

Open jwage opened 6 years ago

jwage commented 6 years ago

cc @dmecke

sinogermany commented 6 years ago

It would be great if German were supported. It doesn't seem to be straightforward, though. Example:

How can we tell they are Internationalisierung and Strategie (Note there's a connecting s)?

Say we can - so would it become something like:

Never programmed in German, quite interested to see how it works. Let me know if I can be of any help.

sinogermany commented 6 years ago

A few compound word examples here:

Normally when we program in German how do we deal with umlauts? Use e instead?

alcaeus commented 6 years ago

> How can we tell they are Internationalisierung and Strategie (Note there's a connecting s)?

I may be missing something, but I don't think we need to match whole words. For regular words, we only care about the ending. For irregular ones, we match from the end and keep extending the list of irregular words as issues appear.
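The approach described above (check irregular endings first, then fall back to regular suffix rules) can be sketched roughly as follows. This is a minimal Python illustration, not Doctrine Inflector's API; the rule lists and the `pluralize` function are hypothetical, and the handful of German rules shown is far from complete.

```python
import re

# Hypothetical, deliberately tiny rule set for illustration only.
# Irregular endings are matched from the end of the word, so compounds
# such as "Krankenhaus" are covered by the entry for "haus".
IRREGULAR_SUFFIXES = [
    ("haus", "häuser"),
    ("buch", "bücher"),
]

# Regular rules only look at the ending of the word.
REGULAR_RULES = [
    (re.compile(r"(ung|heit|keit|schaft)$"), r"\1en"),
    (re.compile(r"e$"), "en"),
]


def pluralize(word: str) -> str:
    """Return a plural guess for a noun; the word is returned unchanged if no rule matches."""
    lower = word.lower()
    for singular_end, plural_end in IRREGULAR_SUFFIXES:
        if lower.endswith(singular_end):
            stem = word[: len(word) - len(singular_end)]
            # Keep the capital letter when the whole word is the irregular form.
            if not stem and word[0].isupper():
                plural_end = plural_end.capitalize()
            return stem + plural_end
    for pattern, replacement in REGULAR_RULES:
        if pattern.search(word):
            return pattern.sub(replacement, word)
    return word


print(pluralize("Internationalisierungsstrategie"))
print(pluralize("Krankenhaus"))
```

Because only the ending is inspected, the compound never has to be split into Internationalisierung and Strategie. Words that need the article to disambiguate are not handled by this sketch.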

> Normally when we program in German how do we deal with umlauts? Use e instead?

When people can't type them (e.g. because of restrictions in what characters are allowed), they normally use these replacements:

Some tools replace them with just the first letter of these replacements, which I believe is incorrect.
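For reference, the common ASCII convention for German is ä → ae, ö → oe, ü → ue and ß → ss. The sketch below assumes that convention; the `transliterate` helper and its mapping table are hypothetical names used only for illustration.

```python
# Assumed mapping: the widely used two-letter replacements for umlauts and ß.
UMLAUT_REPLACEMENTS = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def transliterate(text: str) -> str:
    """Replace each umlaut or ß with its two-letter ASCII form."""
    return "".join(UMLAUT_REPLACEMENTS.get(ch, ch) for ch in text)


print(transliterate("Häuser"))  # Haeuser
print(transliterate("Straße"))  # Strasse
```

Dropping the trailing "e" (for example "Hauser" instead of "Haeuser") loses information, since the result can collide with a different word.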

jwage commented 6 years ago

So who is going to add support for German? :) You can see the rules for another language here: https://github.com/doctrine/inflector/tree/master/lib/Doctrine/Inflector/Rules/English

dereuromark commented 6 years ago

I would close this as impossible. I once tried as well, and after 500 lines of exceptions I was still nowhere close to having a reliable package; it is just not worth doing. What's the purpose if every second word still doesn't work properly? Inflection is not relevant here IMO; at least we have it working well for English.

jwage commented 6 years ago

Do you think it is really impossible, or just really hard, with many exceptions to the rules? I would still be interested in getting a PR started; maybe we can slowly work on it over time and get contributions from people.

dereuromark commented 6 years ago

Knowing the German language well and having looked into the issue, I think it is literally impossible. It starts with nouns that are spelled identically and need the article ("der/die/das") in front of them to determine the grammatical gender (masculine/feminine/neuter) and thus the meaning; such nouns also have different plural forms. And that covers only this one aspect: many other plurals also require the article before any rule can be built, which is impossible if you only have the noun. The ending alone is not enough here (unlike in English). So yeah, it is a mess.

jwage commented 6 years ago

And if we had a public API that let you provide the word before, would it still be impossible in other cases?

dereuromark commented 6 years ago

You could start a library of all nouns. Combined with the article, that would work.

nschoellhorn commented 5 years ago

Is this still relevant/wanted? I think the majority of words could be covered with a few rules. Of course, German is not an easy language, so there will be quite a few exceptions, but English has them as well, so I wouldn't say it's impossible. I can try to implement as much of it as I can, and then we can see whether it looks good or whether there are bigger hurdles to overcome.

Maybe I am missing something, but it doesn't look impossible to me, and I am also a native speaker.

jwage commented 5 years ago

I would still be interested in seeing support for other languages.

nschoellhorn commented 5 years ago

Ha, I indeed missed something, yes. Without the context of at least one sentence, it is not really doable without building the mentioned list of words. If we have a complete sentence, I guess it would look better, but with the word alone, we will not get very far. Sorry for digging this up needlessly :-/

dereuromark commented 5 years ago

Yeah, without the context (at least the "der/die/das" article) there is no chance of inflecting properly. Consider "Leiter":

die Leiter ("ladder") => die Leitern ("ladders")
der Leiter ("leader") => die Leiter ("leaders")

And that is only one of a few issues to mention.
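A small sketch of why the noun alone is not enough: grouping a word list by spelling shows which nouns have more than one plural. The entries and helper names below are hypothetical illustration data (the "Band" forms are added as a further well-known example), not part of any library.

```python
from collections import defaultdict

# (article, singular, plural): hand-picked examples, not an exhaustive list.
NOUNS = [
    ("die", "Leiter", "Leitern"),  # ladder
    ("der", "Leiter", "Leiter"),   # leader
    ("das", "Band", "Bänder"),     # ribbon
    ("der", "Band", "Bände"),      # volume (of a book)
    ("die", "Band", "Bands"),      # music group
    ("die", "Zeitung", "Zeitungen"),
]


def plurals_by_noun(entries):
    """Map each singular spelling to the sorted list of its possible plurals."""
    grouped = defaultdict(set)
    for _article, singular, plural in entries:
        grouped[singular].add(plural)
    return {noun: sorted(forms) for noun, forms in grouped.items()}


def ambiguous_nouns(entries):
    """Return the nouns whose plural cannot be decided from the spelling alone."""
    return sorted(n for n, forms in plurals_by_noun(entries).items() if len(forms) > 1)


print(ambiguous_nouns(NOUNS))  # ['Band', 'Leiter']
```

Any function that receives only the string "Leiter" has to pick one of the two plurals, so it will be wrong for one of the meanings.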

jwage commented 5 years ago

The API could be enhanced so that the context around the word can be passed through, no?

dereuromark commented 5 years ago

> Never programmed in German, quite interested to see how it works. Let me know if I can be of any help.

You also should NEVER program in German.^^

I only see the value here in translation tooling, and maybe in custom routing configs and other things that process context data. This should never find its way into actual PHP source code class files IMO, and especially not as class names, method names, or other tokens.

nschoellhorn commented 5 years ago

> The API could be enhanced so that the context around the word can be passed through, no?

Yes, sure. But the problem is: how much of the context do you want to provide? There are cases where even the full sentence around the word might not be enough. I think it is doable, at least in most cases, if we are given the article of the word. Would people want to provide that?
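One possible shape for such an API, sketched in Python under the assumption that the article is an optional argument and that a noun lexicon exists. Everything here (`LEXICON`, `AmbiguousNounError`, `pluralize`) is hypothetical and not an existing or proposed Doctrine Inflector interface.

```python
from typing import Optional


class AmbiguousNounError(ValueError):
    """Raised when a noun has several plurals and no article was given."""


# Hypothetical lexicon: noun -> {article: plural}.
LEXICON = {
    "Leiter": {"die": "Leitern", "der": "Leiter"},
    "Lehrer": {"der": "Lehrer"},
}


def pluralize(noun: str, article: Optional[str] = None) -> str:
    """Pluralize a known noun; the article is only required when the noun is ambiguous."""
    forms = LEXICON[noun]  # raises KeyError for nouns missing from the lexicon
    if article is not None:
        return forms[article.lower()]
    candidates = set(forms.values())
    if len(candidates) == 1:
        return candidates.pop()
    raise AmbiguousNounError(f"{noun!r} needs an article, one of {sorted(forms)}")


print(pluralize("Leiter", "die"))  # Leitern
print(pluralize("Lehrer"))         # Lehrer
```

Raising on ambiguity instead of guessing keeps the caller aware of when the extra context is actually needed, so the article only has to be supplied for the nouns that require it.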

grafst commented 1 year ago

It is funny that this is almost trivial in English, but impossible in German.