Open jwage opened 6 years ago
That would be great if German is supported. It doesn't seem to be as straightforward though. Example:
Internationalisierungsstrategie
Internationalisierungsstrategien
How can we tell they are Internationalisierung and Strategie (note there's a connecting s)?
Say we can - so would it become something like:
internationalisierungsStrategie ???
internationalisierungs_strategie ???
Never programmed in German, quite interested to see how it works. Let me know if I can be of any help.
A few compound word examples here:
Internationalisierungsstrategie -> /(.+)ung[s]?(.+)/
Zwischenzeit
Höchsttemperatur
Haustürschlüsselloch
Normally when we program in German, how do we deal with umlauts? Use e instead?
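To make the regex idea above concrete, here is a minimal Python sketch. The function name `split_ung_compound` is hypothetical, and the pattern is a slight adaptation of the one in the comment (the head keeps its "ung" ending, and the connecting s is dropped). It only handles compounds whose first part ends in "ung", so it says nothing about the other examples in the list.

```python
import re

# Adapted from the pattern in the thread: a head ending in "ung",
# an optional connecting "s", then a non-empty tail.
COMPOUND_UNG = re.compile(r"(.+ung)s?(.+)")


def split_ung_compound(word):
    """Return (head, tail) for an -ung(s) compound, or None if no match."""
    match = COMPOUND_UNG.fullmatch(word)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_ung_compound("Internationalisierungsstrategie"))
print(split_ung_compound("Zwischenzeit"))
```

The first call yields the head "Internationalisierung" and the tail "strategie"; the second returns None because "Zwischenzeit" contains no "ung". The pattern is purely textual, so a word like "Hunger" is also split (into "Hung" and "er"), which shows how quickly a regex-only approach needs a word list behind it.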
> How can we tell they are Internationalisierung and Strategie (note there's a connecting s)?
I may be missing something, but I don't think we need to match all words. For regular words, we only care about the ending. For irregular ones, we match from the end and keep building an ever longer list of irregular words as issues appear.
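A rough Python sketch of that approach, assuming a small table of ending rules plus a list of irregular words that are matched from the end of the input. The names `IRREGULAR`, `SUFFIX_RULES`, and `pluralize` are hypothetical, and the entries are a handful of illustrative cases, not a vetted rule set and not the Doctrine Inflector API.

```python
# Irregular plurals, matched against the END of the word so that
# compounds such as "Krankenhaus" are covered by the "haus" entry.
IRREGULAR = {
    "haus": "häuser",
    "loch": "löcher",
}

# Regular ending rules (illustrative only): singular ending -> plural ending.
SUFFIX_RULES = [
    ("ung", "ungen"),
    ("heit", "heiten"),
    ("keit", "keiten"),
    ("ie", "ien"),
]


def pluralize(word):
    """Pluralize by looking only at the end of the word."""
    lower = word.lower()
    # Irregular words first, then the regular ending rules.
    for singular, plural in list(IRREGULAR.items()) + SUFFIX_RULES:
        if lower.endswith(singular):
            result = word[: len(word) - len(singular)] + plural
            # Restore the capital letter if the whole word was replaced.
            if word[0].isupper():
                result = result[0].upper() + result[1:]
            return result
    return word  # no rule matched; leave the word unchanged


print(pluralize("Internationalisierungsstrategie"))
print(pluralize("Haustürschlüsselloch"))
```

Because only the tail is inspected, the compound never has to be split: "Haustürschlüsselloch" is handled by the "loch" entry alone. The irregular list is the part that would keep growing as issues are reported.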
> Normally when we program in German how do we deal with umlauts? Use e instead?
When people can't type them (e.g. because of restrictions in what characters are allowed), they normally use these replacements:
ä => ae
ö => oe
ü => ue
ß => ss
Some tools replace them with just the first letter of these replacements, which I believe is incorrect.
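The replacement table above can be sketched in a few lines of Python. The function name `transliterate` is hypothetical, and the uppercase entries (Ä, Ö, Ü) are an assumption added for illustration; the comment itself only lists the lowercase letters and ß.

```python
# Replacements from the comment above; the uppercase variants are an
# assumption added for this sketch.
UMLAUT_REPLACEMENTS = {
    "ä": "ae",
    "ö": "oe",
    "ü": "ue",
    "ß": "ss",
    "Ä": "Ae",
    "Ö": "Oe",
    "Ü": "Ue",
}


def transliterate(text):
    """Replace umlauts and ß with their two-letter ASCII spellings."""
    return "".join(UMLAUT_REPLACEMENTS.get(char, char) for char in text)


print(transliterate("Höchsttemperatur"))      # Hoechsttemperatur
print(transliterate("Haustürschlüsselloch"))  # Haustuerschluesselloch
```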
So who is going to add support for German? :) You can see the rules for another language here https://github.com/doctrine/inflector/tree/master/lib/Doctrine/Inflector/Rules/English
I would close this as impossible. I once tried as well, and after 500 lines of exceptions I was still nowhere close to having a reliable package; this is just not worth doing. What's the purpose if every second word still doesn't work properly? Inflection is not relevant here IMO; at least we have it working well for English.
Do you think it is really impossible, or just really hard with many exceptions to the rules? I would still be interested in getting a PR started, and maybe we can slowly work on it over time and get contributions from people.
Knowing the German language well and having looked into the issue, I think it really is literally impossible. It starts with words that are spelled the same but need the article in front ("der/die/das") to establish the masculine/feminine/neuter form and thus the meaning, and that take different plural forms accordingly. And that covers only this one aspect: many plurals also require that article before any rule can be built, which is impossible if you only have the noun. The ending alone is not really usable here (as it is for English). So yeah, it is a mess.
And if we had a public API that let you provide the word before, would it still be impossible in other cases?
You could start a library for all nouns. Together with this that would work.
Is this still relevant/wanted? I think the majority of words could be covered with a few rules. Of course, German is not an easy language so there will be quite some exceptions but they are there in English as well so I wouldn't say it's impossible. I can try to implement some of that, as far as I can get and then we see if it looks good or if there are any bigger hurdles to overcome.
Maybe I am missing something, but it doesn't look impossible to me, and I am also a native speaker.
I would still be interested in seeing support for other languages.
Ha, I indeed missed something, yes. Without the context of at least one sentence, it is not really doable without building the mentioned list of words. If we have a complete sentence, I guess it would look better, but with the word alone, we will not get very far. Sorry for digging this up needlessly :-/
Yeah, without the context (at least the "der/die/das" article) there will be no chance to translate properly. Imagine "Leiter".
die Leiter ("ladder") => die Leitern ("ladders")
der Leiter ("leader") => die Leiter ("leaders")
And that is only one of a few issues to mention.
The API could be enhanced so that the context around the word can be passed through, no?
> Never programmed in German, quite interested to see how it works. Let me know if I can be of any help.
You also should NEVER program in German.^^
I only see the value here in translation tooling and maybe custom routing configs and other things that process context data. This should never find its way into actual PHP source code class files IMO, and especially not as class names, method names, or other tokens.
> The API could be enhanced so that the context around the word can be passed through, no?
Yes, sure. But the problem is: how much of the context do you want to provide? There are cases where even the full sentence around the word might not be enough. I think it is doable if we are given the article of the word, at least in most cases. Would people want to provide that?
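A minimal Python sketch of what an article-aware lookup could look like, using only the "Leiter" example from earlier in the thread. The names `PLURALS_BY_ARTICLE` and `pluralize_with_article` are hypothetical; nothing like this exists in the library as far as this thread shows, and a real implementation would need a far larger noun list or rules on top.

```python
# Plural forms keyed by (singular article, noun). The two entries are the
# "Leiter" example from this thread; everything else is left unknown.
PLURALS_BY_ARTICLE = {
    ("die", "Leiter"): "Leitern",  # ladder -> ladders
    ("der", "Leiter"): "Leiter",   # leader -> leaders
}


def pluralize_with_article(article, noun):
    """Return the plural noun for an article/noun pair, or None if unknown."""
    return PLURALS_BY_ARTICLE.get((article.lower(), noun))


print(pluralize_with_article("die", "Leiter"))  # Leitern
print(pluralize_with_article("der", "Leiter"))  # Leiter
print(pluralize_with_article("das", "Haus"))    # None
```

Returning None for unknown pairs keeps the caller in control of the fallback, whether that is an ending-based rule or leaving the word untouched.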
It is funny that this is almost trivial in English, but impossible in German.
cc @dmecke