Closed szepeviktor closed 8 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Comparison is base (358b2bd) 100.00% compared to head (cf6dbf6) 100.00%.
@PrinsFrank This seems impossible to solve!
case eu = 'eu';
+ case eu = 'ею';
+ case eu = 'ευ';
What to do?? eu2, eu3 ... ?
🇮🇳
@szepeviktor What about detecting which script the initial text was in and appending that to the value? Something like eu_greek, org_cyrillic, and if the script was Latin, don't append anything? I don't know how to detect scripts with PHP yet, though.
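The naming idea above can be sketched as follows. This is only an illustration: `caseNameWithScript` is a hypothetical helper, not part of the package, and it assumes the script name is already known and spelled the way PCRE spells it (`Greek`, `Cyrillic`, `Latin`).

```php
<?php

/**
 * Hypothetical helper: builds an enum case name by appending the
 * lower-cased script name, unless the original text was Latin.
 */
function caseNameWithScript(string $transliterated, string $script): string
{
    // Latin names stay unchanged, as suggested in the comment above.
    if ($script === 'Latin') {
        return $transliterated;
    }

    return $transliterated . '_' . strtolower($script);
}
```

With this, `caseNameWithScript('eu', 'Greek')` gives `eu_greek`, while `caseNameWithScript('eu', 'Latin')` stays `eu`.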
detecting what script the initial text was
The only thing I can think of is looping through all ScriptCode-s and trying to match ^\p{sc=Hira}+$
https://www.unicode.org/reports/tr18/#Script_Property
@szepeviktor That's quite smart actually! If that is reliable, it would be cool to add that to the transliteration package!
All non-Latin CountryCodeTLD cases:
alardn seems to be a Arabic
albhryn seems to be a Arabic
aljzayr seems to be a Arabic
almghrb seems to be a Arabic
alswdyt seems to be a Arabic
amarat seems to be a Arabic
ao_men seems to be a Han
art seems to be a Arabic
ayran seems to be a Arabic
banla seems to be a Bengali
bart seems to be a Arabic
bel seems to be a Cyrillic
bharat seems to be a Telugu
bharata seems to be a Kannada
bharatam seems to be a Devanagari
bharota seems to be a Devanagari
cinkappur seems to be a Tamil
el seems to be a Greek
flstyn seems to be a Arabic
hangug seems to be a Hangul
hay seems to be a Armenian
ilankai seems to be a Tamil
intiya seems to be a Tamil
kaz seems to be a Cyrillic
laav seems to be a Lao
lanka seems to be a Sinhala
man seems to be a Arabic
mkd seems to be a Cyrillic
mlysya seems to be a Arabic
mon seems to be a Cyrillic
msr seems to be a Arabic
mwrytanya seems to be a Arabic
pakstan seems to be a Arabic
qtr seems to be a Arabic
raq seems to be a Arabic
rf seems to be a Cyrillic
srb seems to be a Cyrillic
swdan seems to be a Arabic
swryt seems to be a Arabic
tai_wan seems to be a Han
thiy seems to be a Thai
twns seems to be a Arabic
ukr seems to be a Cyrillic
xiang_gang seems to be a Han
xin_jia_po seems to be a Han
ysr_l seems to be a Hebrew
zhong_guo seems to be a Han
PHP supports these scripts (https://www.php.net/manual/en/regexp.reference.unicode.php), but not all of our ScriptCode-s.
foreach ($scripts as $script) {
    if (preg_match('/^\p{' . $script . '}+$/u', $string) === 1) {
        return $script;
    }
}
return '?';
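For reference, the loop above could be wrapped into a self-contained function roughly like this. It is a sketch under assumptions: `detectScript` and the `$candidates` list are hypothetical names, the list is a small hand-picked subset of the script names PCRE accepts, and it returns `null` instead of `'?'` when nothing matches.

```php
<?php

/**
 * Hypothetical helper: returns the first candidate script whose
 * characters make up the whole string, or null when none match.
 *
 * @param list<string> $scripts PCRE script names such as 'Greek'
 */
function detectScript(string $string, array $scripts): ?string
{
    foreach ($scripts as $script) {
        // The /u modifier makes \p{...} work on UTF-8 code points, not bytes.
        if (preg_match('/^\p{' . $script . '}+$/u', $string) === 1) {
            return $script;
        }
    }

    return null;
}

// Assumed candidate list, for illustration only.
$candidates = ['Latin', 'Greek', 'Cyrillic', 'Arabic', 'Han'];
```

Because the pattern is anchored on both ends, a string that mixes scripts (or contains digits or punctuation) matches no candidate and falls through to `null`.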
Please write the actual code. I don't know where or how.
Now we have
for PHP-supported scripts!
I'm searching for the where but EnumCase does not know about other cases and EnumFile cannot change the name in EnumCase (because it is readonly).
I think we should never drop a duplicate Enum case but throw an exception. I hope there is no data source that contains duplicate rows!
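A minimal sketch of that "throw instead of drop" behaviour, assuming the case names are available as a plain list of strings. `assertUniqueCaseNames` is a hypothetical function, not the package's actual EnumFile API.

```php
<?php

/**
 * Hypothetical check: throws as soon as a case name appears twice,
 * instead of silently dropping the later one.
 *
 * @param list<string> $names
 * @throws InvalidArgumentException
 */
function assertUniqueCaseNames(array $names): void
{
    $seen = [];
    foreach ($names as $name) {
        if (isset($seen[$name])) {
            throw new InvalidArgumentException(
                sprintf('Duplicate enum case name "%s"', $name)
            );
        }
        // Remember the name so a later repeat is detected.
        $seen[$name] = true;
    }
}
```

Calling it with `['eu', 'eu_greek', 'eu_cyrillic']` passes silently, while `['eu', 'eu']` raises the exception.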
Let's see whether there are TWO eu_greek-s.
Fixed in #200. Thanks!
From https://github.com/PrinsFrank/standards/pull/183#issuecomment-1918910818