Open Molochnik opened 10 months ago
This is not really a bug, IMHO.
In non-Unicode builds, IOHandler.Write()
takes in an AnsiString
, where the optional ASrcEncoding
parameter specifies the encoding of that AnsiString
. By default, Indy expects that AnsiString
to use the OS's system encoding, which is how Delphi and FreePascal normally handle AnsiString
.
The AnsiString
data will be converted to Unicode using the ASrcEncoding
, and then that Unicode will be converted to the AByteEncoding
for transmission.
The TIdHTTPResponseInfo.Charset
property is meant to specify the output encoding on the wire, not the input encoding of the AnsiString
. This allows users to send, for example, a Latin-1 input string as UTF-8 output without having to encode it themselves, keeping the AnsiString
is the compiler's native format.
If you want to send a ContentText
string using an AnsiString
with a different encoding than the OS, you should set the IOHandler.DefAnsiEncoding
property, which the IOHandler
will use when no ASrcEncoding
is specified. For instance, if you have FreePascal/Lazarus setup to use UTF-8 encoded AnsiString
s, you can set the IOHandler.DefAnsiEncoding
to IndyTextEncoding_UTF8
, such as in the server's OnConnect
event.
Thank you very much! That helped. Some notes: 1) FConnection is protected in TIdHTTPResponseInfo and I had to write a helper:
procedure TIdHTTPResponseInfoHelper.WriteContentUni;
begin
{$IFDEF STRING_IS_ANSI}
FConnection.IOHandler.DefAnsiEncoding := IndyTextEncoding_UTF8;
{$ENDIF}
WriteContent;
end;
2) In Lazarus under Linux I have both STRING_IS_UNICODE and STRING_IS_ANSI undefined. How is that possible?
- FConnection is protected in TIdHTTPResponseInfo and I had to write a helper:
Why not simply set the DefAnsiEncoding
in the OnConnect
event, like I suggested?
procedure TMyForm.IdHTTPServer1Connect(AContext: TIdContext);
begin
{$IFDEF STRING_IS_ANSI}
AContext.Connection.IOHandler.DefAnsiEncoding := IndyTextEncoding_UTF8;
{$ENDIF}
end;
If you are going to write a custom wrapper, I would suggest allowing the caller configure the encoding as needed, eg:
procedure TIdHTTPResponseInfoHelper.WriteContentUni(
{$IFDEF STRING_IS_ANSI}ASrcEncoding: IIdTextEncoding = nil{$ENDIF}
);
begin
{$IFDEF STRING_IS_ANSI}
EnsureEncoding(ASrcEncoding, encUTF8);
FConnection.IOHandler.DefAnsiEncoding := ASrcEncoding;
{$ENDIF}
WriteContent;
end;
- In Lazarus under Linux I have both STRING_IS_UNICODE and STRING_IS_ANSI undefined. How is that possible?
Indy defines the STRING_IS_...
conditionals in its IdCompilerDefines.inc
file, did you {$I}
that file into your code?
Otherwise, Indy uses FreePascal's own FPC_UNICODESTRINGS
conditional to determine when to define STRING_IS_UNICODE
, so you could use that same conditional if you don't need compatibility with Delphi, eg:
procedure TMyForm.IdHTTPServer1Connect(AContext: TIdContext);
begin
{$IFNDEF FPC_UNICODESTRINGS}
AContext.Connection.IOHandler.DefAnsiEncoding := IndyTextEncoding_UTF8;
{$ENDIF}
end;
Or:
procedure TIdHTTPResponseInfoHelper.WriteContentUni(
{$IFNDEF FPC_UNICODESTRING}ASrcEncoding: IIdTextEncoding = nil{$ENDIF}
);
begin
{$IFNDEF FPC_UNICODESTRING}
EnsureEncoding(ASrcEncoding, encUTF8);
FConnection.IOHandler.DefAnsiEncoding := ASrcEncoding;
{$ENDIF}
WriteContent;
end;
Everything goes as usual - all bugs are yours, if you think you found a bug in an outer library go back to the start of the phrase. Thanks!
TIdHTTPResponseInfo.WriteContent
usesIOHandler.Write
in two places and in both of them incorrectly. It causes wrong message encoding in Lazarus under Linux (unitIdCustomHTTPServer
) Now it is (line 2188) :FConnection.IOHandler.Write(ContentText, CharsetToEncoding(CharSet));
Should be:
FConnection.IOHandler.Write(ContentText, CharsetToEncoding(CharSet){$IFDEF STRING_IS_ANSI}, CharsetToEncoding(CharSet){$ENDIF});
Now it is (line 2213) :
Should be: