risoflora / brookframework

Microframework which helps to develop web Pascal applications.
https://risoflora.github.io/brookframework
GNU Lesser General Public License v2.1
233 stars 54 forks source link

M.ToCString() has a bug #23

Closed Beanv587 closed 3 years ago

Beanv587 commented 3 years ago

If the parameter is a double-byte string(E.g. Chinese), the function will return a garbled string.

Steps to reproduce:

TestCode:

procedure THTTPServer.DoRequest(ASender: TObject; ARequest: TBrookHTTPRequest; AResponse: TBrookHTTPResponse); begin //Path contains Chinese AResponse.Download('G:\编程\TestFiles\1.txt'); end;

Open the browser and it will prompt File not found, click the breakpoint on the function sg_httpres_download, you will find that the variable filename is garbled

1

2

Environment:

In fact, PAnsiChar(AnsiString(AFileName)) can be used to solve this problem.

procedure TBrookHTTPResponse.Download(const AFileName: TFileName; AStatus: Word); var M: TMarshaller; R: cint; begin SgLib.Check; if FCompressed then begin R := sg_httpres_zdownload(FHandle, PAnsiChar(AnsiString(AFileName)), AStatus); CheckZLib(R); end else R := sg_httpres_download(FHandle, PAnsiChar(AnsiString(AFileName)), AStatus); CheckAlreadySent(R); if R = ENOENT then raise EFileNotFoundException.Create(SFileNotFound); SgLib.CheckLastError(R); end;

KaitoNakamura commented 3 years ago

AnsiChar and AnsiString and PAnsiChar and PAnsiString are not available in Delphi on Linux, so silvio uses his own TMarshaller for Delphi and FPC. You can only use PUTF8Char, it is the same as PANSIChar (in Linux it is mapped : UTF8Char = _AnsiChar; PUTF8Char = _PAnsiChar;)

@silvioprog : you use "AsAnsi" from the SysUtils and the return is a pointer to an ANSIString and not to an ANSIChar !!!!

function {$IFDEF FPC}TMarshaller{$ELSE}TMarshallerHelper{$ENDIF}.ToCString( const S: string): MarshaledAString; begin Result := {$IFDEF FPC} MarshaledAString(S) {$ELSE} AsAnsi(S, CP_UTF8).ToPointer {$ENDIF}; end;

Beanv587 commented 3 years ago

AnsiChar and AnsiString and PAnsiChar and PAnsiString are not available in Delphi on Linux, so silvio uses his own TMarshaller for Delphi and FPC. You can only use PUTF8Char, it is the same as PANSIChar (in Linux it is mapped : UTF8Char = _AnsiChar; PUTF8Char = _PAnsiChar;)

@silvioprog : you use "AsAnsi" from the SysUtils and the return is a pointer to an ANSIString and not to an ANSIChar !!!!

function {$IFDEF FPC}TMarshaller{$ELSE}TMarshallerHelper{$ENDIF}.ToCString( const S: string): MarshaledAString; begin Result := {$IFDEF FPC} MarshaledAString(S) {$ELSE} AsAnsi(S, CP_UTF8).ToPointer {$ENDIF}; end;

OK,thank you very much

silvioprog commented 3 years ago

@Beanv587, thanks for reporting! @KaitoNakamura, thanks for pointing.

So, this week I'm in a high demand at work. Could you provide a PR to fix the problem?

silvioprog commented 3 years ago

I forgot to mention good news. Now there is a Telegram group for helping Brook users. The link was referenced at https://github.com/risoflora/brookframework/issues/24 and is going to be published in the README file.

KaitoNakamura commented 3 years ago

hi @silvioprog I would personally do it this way (To remain compliant) : MarshaledAString(UTF8String(S)) or this way : PUTF8Char(UTF8String(S))

Both ways return a PANSIChar. In Delphi you only have to convert the non ANSI string.... And you don't get any problem on the Linux side or with ARC.

@Beanv587 : Can you change in the file "Marshalling.pas" and there in the function in the line "111" this "AsAnsi(S, CP_UTF8).ToPointer" to "MarshaledAString(UTF8String(S))" and test it?

Beanv587 commented 3 years ago

hi @silvioprog I would personally do it this way (To remain compliant) : MarshaledAString(UTF8String(S)) or this way : PUTF8Char(UTF8String(S))

Both ways return a PANSIChar. In Delphi you only have to convert the non ANSI string.... And you don't get any problem on the Linux side or with ARC.

@Beanv587 : Can you change in the file "Marshalling.pas" and there in the function in the line "111" this "AsAnsi(S, CP_UTF8).ToPointer" to "MarshaledAString(UTF8String(S))" and test it?

1

I think it would be better to distinguish between Linux and Windows by compiling options. Windows uses PAnsiChar and AnsiString to convert String

KaitoNakamura commented 3 years ago

This has nothing to do with the OS itself. The problem will be the code pages.

@silvioprog : Ignore the above, your version is so far correct, the CodePage at the end is the same. maybe you could try to omit the CP_UTF8. So just like this : AsAnsi(S).ToPointer. I can't test it here.

silvioprog commented 3 years ago

@KaitoNakamura:

Ignore the above, your version is so far correct, the CodePage at the end is the same. maybe you could try to omit the CP_UTF8. So just like this : AsAnsi(S).ToPointer. I can't test it here.

No problem dude. 🙂

@Beanv587:

Tonight I'll try to debug the code as you suggested. But, just to check if I understood properly, the string passed to the library API is:

G:\编程\TestFiles\1.txt

but it is passing:

G:\编程▼\TestFiles\1.txt

however, it should pass:

G:\编程\TestFiles\1.txt

Do mean this?

Beanv587 commented 3 years ago

@KaitoNakamura:

Ignore the above, your version is so far correct, the CodePage at the end is the same. maybe you could try to omit the CP_UTF8. So just like this : AsAnsi(S).ToPointer. I can't test it here.

No problem dude. 🙂

@Beanv587:

Tonight I'll try to debug the code as you suggested. But, just to check if I understood properly, the string passed to the library API is:

G:\编程\TestFiles\1.txt

but it is passing:

G:\编程▼\TestFiles\1.txt

however, it should pass:

G:\编程\TestFiles\1.txt

Do mean this?

OK,Thank you

you understand it right.

I tested AResponse.Send with ContentType text/plain; charset=utf-8,The output is correct. So I was wondering if it can be solved by adding utf8 string support in the function sg_httpres_zdownload?

silvioprog commented 3 years ago

you understand it right.

🎉

I tested AResponse.Send with ContentType text/plain; charset=utf-8,The output is correct. So I was wondering if it can be solved by adding utf8 string support in the function sg_httpres_zdownload?

The strings passed to the Sagui library API are in C-style strings, i.e they are null-terminated, so I suspect it is related to the string length... probably the null-terminator (\0) is missing or it is in a wrong position. Another info, all the strings in Sagui and Brook are (or should be) in UTF-8.

(I haven't debugged it yet, so I'm not sure about this above)

silvioprog commented 3 years ago

Hi guys, some progress in this bug! 🙌

It seems the problem is related to Sagui library. I have been trying to debug it on Windows, and noticed the example doesn't work with double-byte strings. However, on Linux, it worked properly.

I'm going to try to solve it and back with some news as soon as possible ...

Beanv587 commented 3 years ago

ry. I have been tryin

Yes, I tested sg_httpres_sendbinary does not have this bug. But sg_httpres_download does.

silvioprog commented 3 years ago

Related to https://github.com/risoflora/libsagui/issues/50

Beanv587 commented 3 years ago

Related to risoflora/libsagui#50

OK,Thank you

Beanv587 commented 3 years ago

@silvioprog Another question, how can I get the domain name or IP in the request?I didn't find it in the doc

silvioprog commented 3 years ago

Hi @Beanv587, about bug, could test it again using the latest Sagui version (its download function was fixed to allow Unicode strings)? If so, please close if it works fine now. 🙂

Regarding the domain, you should provide it by your DNS, since Brook just points its host to address 0.0.0.0.

Beanv587 commented 3 years ago

Hi @Beanv587, about bug, could test it again using the latest Sagui version (its download function was fixed to allow Unicode strings)? If so, please close if it works fine now. 🙂

Regarding the domain, you should provide it by your DNS, since Brook just points its host to address 0.0.0.0.

Okay, I will test it now.

Thank you

Beanv587 commented 3 years ago

Hi @Beanv587, about bug, could test it again using the latest Sagui version (its download function was fixed to allow Unicode strings)? If so, please close if it works fine now. 🙂

Regarding the domain, you should provide it by your DNS, since Brook just points its host to address 0.0.0.0.

It's works fine now, thank you