navruzm closed this pull request 11 years ago
Hi, thanks for this pull request. I would replace your line of code with this one:
if (false !== strpos($s, 'ı')) $s = str_replace('ı', 'i', $s);
The small dotless i is listed in http://unicode.org/repos/cldr/trunk/common/transforms/Latin-ASCII.xml, but that is not the data source iconv uses.
We should compare the mappings in this XML file with those produced by iconv and see whether other characters differ...
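One way to run that comparison, as a rough sketch only: feed each character of a reference table to a transliterator and collect the entries where the result differs. The helper name `findMappingDifferences` and the three-entry sample table are made up for illustration; a real run would load the pairs from Latin-ASCII.xml. The iconv call assumes an implementation that supports `//TRANSLIT`, and what it returns for an unmapped character varies by platform.

```php
<?php
// Hypothetical helper (not part of any library): returns the entries of
// $expected that the given transliterator does not reproduce.
function findMappingDifferences(array $expected, callable $translit): array
{
    $diff = [];
    foreach ($expected as $char => $ascii) {
        $actual = $translit($char);
        if ($actual !== $ascii) {
            $diff[$char] = ['expected' => $ascii, 'actual' => $actual];
        }
    }
    return $diff;
}

// Illustrative sample only; the full table would come from Latin-ASCII.xml.
$sample = ['ı' => 'i', 'Ø' => 'O', 'þ' => 'th'];

// Transliterate through iconv. The @ hides the notice that some iconv
// implementations raise for characters they cannot convert.
$viaIconv = function (string $c) {
    return @iconv('UTF-8', 'ASCII//TRANSLIT', $c);
};

foreach (findMappingDifferences($sample, $viaIconv) as $char => $info) {
    printf(
        "%s: expected %s, got %s\n",
        $char,
        $info['expected'],
        var_export($info['actual'], true)
    );
}
```

Passing the transliterator in as a callable keeps the comparison independent of the local iconv build, so the same helper can be pointed at other implementations.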
Here are all the characters that Latin-ASCII.xml maps to something but that iconv on Ubuntu does not map (the small dotless i is in the list).
Not sure what to do with this list now...
Ð ? D
Ø ? O
Þ ? TH
ð ? d
ø ? o
þ ? th
Đ ? D
đ ? d
Ħ ? H
ħ ? h
ı ? i
ĸ ? q
Ŋ ? N
ŋ ? n
Ŧ ? T
ŧ ? t
ƀ ? b
Ɓ ? B
Ƃ ? B
ƃ ? b
Ƈ ? C
ƈ ? c
Ɖ ? D
Ɗ ? D
Ƌ ? D
ƌ ? d
Ɛ ? E
Ƒ ? F
ƒ ? f
Ɠ ? G
ƕ ? hv
Ɩ ? I
Ɨ ? I
Ƙ ? K
ƙ ? k
ƚ ? l
Ɲ ? N
ƞ ? n
Ƣ ? OI
ƣ ? oi
Ƥ ? P
ƥ ? p
ƫ ? t
Ƭ ? T
ƭ ? t
Ʈ ? T
Ʋ ? V
Ƴ ? Y
ƴ ? y
Ƶ ? Z
ƶ ? z
Ǥ ? G
ǥ ? g
ȡ ? d
Ȥ ? Z
ȥ ? z
ȴ ? l
ȵ ? n
ȶ ? t
ȷ ? j
ȸ ? db
ȹ ? qp
Ⱥ ? A
Ȼ ? C
ȼ ? c
Ƚ ? L
Ⱦ ? T
ȿ ? s
ɀ ? z
Ƀ ? B
Ʉ ? U
Ɇ ? E
ɇ ? e
Ɉ ? J
ɉ ? j
Ɍ ? R
ɍ ? r
Ɏ ? Y
ɏ ? y
ɓ ? b
ɕ ? c
ɖ ? d
ɗ ? d
ɛ ? e
ɟ ? j
ɠ ? g
ɡ ? g
ɢ ? G
ɦ ? h
ɧ ? h
ɨ ? i
ɪ ? I
ɫ ? l
ɬ ? l
ɭ ? l
ɱ ? m
ɲ ? n
ɳ ? n
ɴ ? N
ɶ ? OE
ɼ ? r
ɽ ? r
ɾ ? r
ʀ ? R
ʂ ? s
ʈ ? t
ʉ ? u
ʋ ? v
ʏ ? Y
ʐ ? z
ʑ ? z
ʙ ? B
ʛ ? G
ʜ ? H
ʝ ? j
ʟ ? L
ʠ ? q
ʣ ? dz
ʥ ? dz
ʦ ? ts
ʪ ? ls
ʫ ? lz
ᴀ ? A
ᴁ ? AE
ᴃ ? B
ᴄ ? C
ᴅ ? D
ᴆ ? D
ᴇ ? E
ᴊ ? J
ᴋ ? K
ᴌ ? L
ᴍ ? M
ᴏ ? O
ᴘ ? P
ᴛ ? T
ᴜ ? U
ᴠ ? V
ᴡ ? W
ᴢ ? Z
ᵫ ? ue
ᵬ ? b
ᵭ ? d
ᵮ ? f
ᵯ ? m
ᵰ ? n
ᵱ ? p
ᵲ ? r
ᵳ ? r
ᵴ ? s
ᵵ ? t
ᵶ ? z
ᵺ ? th
ᵻ ? I
ᵽ ? p
ᵾ ? U
ᶀ ? b
ᶁ ? d
ᶂ ? f
ᶃ ? g
ᶄ ? k
ᶅ ? l
ᶆ ? m
ᶇ ? n
ᶈ ? p
ᶉ ? r
ᶊ ? s
ᶌ ? v
ᶍ ? x
ᶎ ? z
ᶏ ? a
ᶑ ? d
ᶒ ? e
ᶓ ? e
ᶖ ? i
ᶙ ? u
ẜ ? s
ẝ ? s
ẞ ? SS
Ỻ ? LL
ỻ ? ll
Ỽ ? V
ỽ ? v
Ỿ ? Y
ỿ ? y
₠ ? CE
₢ ? Cr
₣ ? Fr.
₤ ? L.
₧ ? Pts
₹ ? Rs
℞ ? Rx
〇 ? 0
′ ? '
〝 ? "
〞 ? "
‖ ? ||
⁅ ? [
⁆ ? ]
⁎ ? *
、 ? ,
。 ? .
〈 ? <
〉 ? >
《 ? <<
》 ? >>
〔 ? [
〕 ? ]
〘 ? [
〙 ? ]
〚 ? [
〛 ? ]
︑ ? ,
︒ ? .
︹ ? [
︺ ? ]
︽ ? <<
︾ ? >>
︿ ? <
﹀ ? >
÷ ? /
∥ ? ||
⦅ ? ((
⦆ ? ))
Thanks
Currently, when a string contains "ı" (small dotless i), Utf8::toAscii doesn't convert this character properly: "ı" is converted to "?" instead of "i".
This is a known Unicode CLDR bug that was opened 3 years ago and does not seem to be fixed. They said "it should be fixed next release, in bug #3335", which was opened 2 years ago. I don't know when they will fix it, but the bug causes problems like this one: laravel/framework#552
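Until that is fixed upstream, one possible workaround is to replace the known problem characters before the string reaches iconv. The following is only a sketch of that idea, not the actual Utf8::toAscii code: `FALLBACK_MAP`, `applyFallbackMap` and `toAsciiSketch` are hypothetical names, and the table holds just a handful of pairs copied from the list in this thread.

```php
<?php
// Hypothetical fallback table: a few pairs copied from the list in this thread.
const FALLBACK_MAP = [
    'ı' => 'i',
    'Ø' => 'O',
    'ø' => 'o',
    'Þ' => 'TH',
    'þ' => 'th',
    'Đ' => 'D',
    'đ' => 'd',
];

// Replace every character that has an entry in the fallback table.
// strtr() works on bytes, which is safe here because each key is a
// complete UTF-8 sequence.
function applyFallbackMap(string $s): string
{
    return strtr($s, FALLBACK_MAP);
}

// Sketch of the full conversion: fallback table first, then iconv for
// everything else. The result of the iconv step depends on the platform.
function toAsciiSketch(string $s)
{
    return @iconv('UTF-8', 'ASCII//TRANSLIT', applyFallbackMap($s));
}
```

Because the table is applied first, "Kıbrıs" reaches iconv as "Kibris" whichever data source iconv uses.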