Closed albertlotw closed 1 year ago
While compiling and debugging on my own, I found a problem in the function `_UnicodeURLDecode()`. On my system, the function turns some `%xx` sequences into question marks (`?`) instead of the corresponding byte when `xx` is greater than 0x80. `_UnicodeURLDecode()` uses `Chr()` + `StringToBinary()` to generate the bytes. A detailed investigation showed that `Chr()` cannot generate all byte values under certain Windows language settings (e.g. Traditional Chinese, code page 950). I checked the AutoIt v3 statement `$asc = Asc(Chr($c))`. The resulting `$asc` is always 63 (0x3F), the ASCII code of the question mark, when `$c` ranges from 0x81 to 0xFE. I ran the same code on an English Windows 10, and the issue does not occur there: `$asc` is always the same as `$c`.
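To see which byte values are affected on a given machine, the round trip described above can be run over the upper half of the byte range. This is only a sketch: the variable names are illustrative, and the output depends entirely on the system's ANSI code page.

```autoit
; Report every byte value in 0x80..0xFF that does not survive Asc(Chr($c)).
; On the code page 950 system described above, 0x81..0xFE are reported as
; coming back as 0x3F; other code pages may print nothing at all.
For $c = 0x80 To 0xFF
    Local $iAsc = Asc(Chr($c))
    If $iAsc <> $c Then
        ConsoleWrite("0x" & Hex($c, 2) & " -> 0x" & Hex($iAsc, 2) & @CRLF)
    EndIf
Next
```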
I have tried to fix the issue, and the following code works for me. Instead of `Chr()` + `StringToBinary()`, the modified code uses `Binary()` to build the bytes. Calling `StringReplace()` after the UTF-8 decoding should be fine.
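Func _UnicodeURLDecode($toDecode)
    Local $strChar = "", $iOne, $iTwo
    ; Split the input into single characters; $aryHex[0] holds the count.
    Local $aryHex = StringSplit($toDecode, "")
    For $i = 1 To $aryHex[0]
        If $aryHex[$i] = "%" Then
            ; Escaped byte: copy the two hex digits that follow the "%".
            $i += 1
            $iOne = $aryHex[$i]
            $i += 1
            $iTwo = $aryHex[$i]
            $strChar = $strChar & $iOne & $iTwo
        Else
            ; Plain character: append its code as two hex digits.
            $strChar = $strChar & StringRight(Hex(Asc($aryHex[$i])),2)
        EndIf
    Next
    ; Turn the hex string into bytes, then decode the bytes as UTF-8 (flag 4).
    Local $Process = Binary("0x" & $strChar)
    Local $DecodedString = BinaryToString($Process, 4)
    Return StringReplace($DecodedString, "+", " ")
EndFunc ;==>_UnicodeURLDecode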
Running with the modified code, I get the correct result.
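One side effect of replacing `+` after decoding is that an escaped plus sign (`%2B`) also ends up as a space. A possible variant, sketched below, maps a literal `+` to a space byte while collecting the hex string, and leaves a malformed or truncated escape untouched instead of reading past the end of the input. `_UrlDecodeSketch` is a hypothetical name, not code from the project, and the sketch assumes any unescaped characters are within the Basic Multilingual Plane.

```autoit
; Hypothetical variant: collect every byte as hex, then decode once as UTF-8.
Func _UrlDecodeSketch($sInput)
    Local $sHex = "", $sChar, $sPair
    Local $i = 1, $iLen = StringLen($sInput)
    While $i <= $iLen
        $sChar = StringMid($sInput, $i, 1)
        $sPair = StringMid($sInput, $i + 1, 2)
        If $sChar = "%" And StringLen($sPair) = 2 And StringIsXDigit($sPair) Then
            $sHex &= $sPair ; escaped byte: keep its two hex digits
            $i += 3
        ElseIf $sChar = "+" Then
            $sHex &= "20" ; literal "+" becomes a space; "%2B" stays a plus
            $i += 1
        Else
            ; Encode the character itself as UTF-8 (flag 4) and drop the "0x" prefix.
            $sHex &= StringTrimLeft(StringToBinary($sChar, 4), 2)
            $i += 1
        EndIf
    WEnd
    If $sHex = "" Then Return ""
    Return BinaryToString(Binary("0x" & $sHex), 4)
EndFunc

ConsoleWrite(_UrlDecodeSketch("a%2Bb+c") & @CRLF) ; prints "a+b c"
```

Neither `Chr()` nor the ANSI mode of `StringToBinary()` is involved, so the result should not depend on the system code page.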
I'll double-check this code and merge it in if everything looks good.
Please try the following code on your end. It should be a cleaner solution and is working on my end.
Func _UnicodeURLDecode($sData)
    Local $aData = StringSplit(StringReplace($sData,"+"," ",0,1),"%")
    $sData = ""
    For $i = 2 To $aData[0]
        $aData[1] &= Chr(Dec(StringLeft($aData[$i],2))) & StringTrimLeft($aData[$i],2)
    Next
    Return BinaryToString(StringToBinary($aData[1],1),4)
EndFunc
Since the above code still produces the byte string via `Chr()` + `StringToBinary()`, it still does not work, because `Chr($c)` returns a question mark whenever `$c` is in the range 0x81 to 0xFE on my side. I think we have to implement the function without using `Chr()`.
The testing code:
$url = "https://www.bing.com/search?q=%E6%9B%B4%E6%94%B9%E6%96%87%E5%AD%97%E5%A4%A7%E5%B0%8F%20windows%2010&form=B00032&ocid=SettingsHAQ-BingIA&mkt=zh-TW"
ConsoleWrite(_UnicodeURLDecode($url))
The result:
https://www.bing.com/search?q=?????????????????? windows 10&form=B00032&ocid=SettingsHAQ-BingIA&mkt=zh-TW
Whoops. Please replace `Chr` with `ChrW`.
Also note that `ConsoleWrite` may have encoding issues of its own.
I think you mean
$aData[1] &= ChrW(Dec(StringLeft($aData[$i],2))) & StringTrimLeft($aData[$i],2)
The result is a little weird. I have not yet checked how `ChrW()` behaves on my side:
https://www.bing.com/search?q=a???a?1a??a-?a???a?X? windows 10&form=B00032&ocid=SettingsHAQ-BingIA&mkt=zh-TW
Yes, I also checked `ConsoleWrite` on the output of the `_UnicodeURLDecode` version that uses `Binary()`, and the console output is correct.
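Because console rendering can hide or introduce encoding problems, one way to check a decoded string independently of the console is to print its UTF-16 code units as hex. The sketch below does that for the first two characters of the test query. `_DumpCodeUnits` is a hypothetical helper name, not something from the project.

```autoit
; Return the UTF-16 code units of a string as space-separated hex values.
Func _DumpCodeUnits($sText)
    Local $sOut = ""
    For $i = 1 To StringLen($sText)
        $sOut &= Hex(AscW(StringMid($sText, $i, 1)), 4) & " "
    Next
    Return StringStripWS($sOut, 2) ; flag 2 strips trailing whitespace
EndFunc

; %E6%9B%B4%E6%94%B9 from the test URL, decoded as UTF-8 (flag 4).
ConsoleWrite(_DumpCodeUnits(BinaryToString(Binary("0xE69BB4E694B9"), 4)) & @CRLF)
; A correct decode prints: 66F4 6539
```

The output is plain ASCII, so it reads the same whatever the console's code page is.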
The following seems okay on my system. But I don't think it is more readable than the original one.
Func _UnicodeURLDecode($sData)
    Local $aData = StringSplit(StringReplace($sData,"+"," ",0,1),"%")
    $aData[1] = Binary($aData[1])
    For $i = 2 To $aData[0]
        $aData[1] &= StringLeft($aData[$i],2) & StringTrimLeft(Binary(StringTrimLeft($aData[$i],2)),2)
    Next
    Return BinaryToString($aData[1],4)
EndFunc
I've checked the behavior of `ChrW()` in my environment. The following code
$c = Chr(0x81)
$cw = ChrW(0x81)
ConsoleWrite("Character Code $c: 0x" & Hex(Asc($c)) & @CRLF)
ConsoleWrite("Character Code $cw: 0x" & Hex(AscW($cw)) & @CRLF)
ConsoleWrite("$cw After StringToBinary : " & StringToBinary($cw) & @CRLF)
produces the output
Character Code $c: 0x0000003F
Character Code $cw: 0x00000081
$cw After StringToBinary : 0x3F
In contrast to `Chr()`, where I get a question mark directly in its return value, `ChrW()` can generate a character code greater than 0x80 in my environment, but it still becomes a question mark after `StringToBinary()`.
`ChrW()` returns a wide-character string, and `StringToBinary($cw)`, with its default flag, converts the wide-character string into an ANSI-encoded string. The ANSI encoding is code page dependent. My system's code page (950) does not define a character corresponding to U+0081, so I get a question mark after `StringToBinary()`.
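The code page dependence comes only from the ANSI mode of `StringToBinary()`. A small sketch comparing its documented flag values for the same character (the variable name is illustrative):

```autoit
; Convert the same wide character with different StringToBinary() flags.
Local $sWide = ChrW(0x81)
; Flag 1 (ANSI, the default): result depends on the system code page;
; reported above as 0x3F on code page 950.
ConsoleWrite("ANSI:     " & StringToBinary($sWide, 1) & @CRLF)
; Flag 2 (UTF-16 little endian): prints 0x8100 on any system.
ConsoleWrite("UTF-16LE: " & StringToBinary($sWide, 2) & @CRLF)
; Flag 4 (UTF-8): prints 0xC281 on any system.
ConsoleWrite("UTF-8:    " & StringToBinary($sWide, 4) & @CRLF)
```

Note that even the UTF-8 flag would not rescue the `ChrW()` approach: it encodes U+0081 as the two bytes C2 81 rather than the single raw byte 0x81 that the URL escape stands for, which is why building the bytes with `Binary()` sidesteps the problem.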
I've swapped to using WinAPI functions for this. Please try the latest test build:
https://github.com/rcmaehl/MSEdgeRedirect/suites/12006614059/artifacts/630536670
Yes, this one is working for me. Thanks.
Great!
Preflight Checklist
Install Type
New Deployment (Chocolatey, Winget, Etc)
Install Mode
Active Mode
Steps to reproduce
Clicking a topic link in Windows 10 Start -> Settings that opens a UTF-8 encoded URL containing Traditional Chinese characters. Here is an excerpt from the `AppGeneral.log`
✔️ Expected Behavior
The opened URL should be "https://www.bing.com/search?q=%E6%9B%B4%E6%94%B9%E6%96%87%E5%AD%97%E5%A4%A7%E5%B0%8F%20windows%2010&form=B00032&ocid=SettingsHAQ-BingIA&mkt=zh-TW"
❌ Actual Behavior
The opened page is "https://www.bing.com/search?q=??????????????????%20windows%2010&form=B00032&ocid=SettingsHAQ-BingIA&mkt=zh-TW"
![Image](https://user-images.githubusercontent.com/40633785/227613898-1f991d5b-79b0-437d-a767-5436c3c4428c.png)
Microsoft Windows version
"22H2 build 19045.2728 Traditional Chinese"
Other Software
No response