minio / mc

Unix like utilities for object store
https://min.io/download
GNU Affero General Public License v3.0
2.86k stars 548 forks source link

xml: encoding "iso-8859-1" declared but Decoder.CharsetReader is nil #4969

Closed CGsama closed 4 months ago

CGsama commented 4 months ago

Expected behavior

Able to do s3 compatible operations on Less3, which successfully by awscli and s3 browser

Actual behavior

All operations will throw the same error

seems similar to this case https://stackoverflow.com/questions/6002619/unmarshal-an-iso-8859-1-xml-input-in-go

Steps to reproduce the behavior

install Less3 add to alias do any operation

mc --version

mc version RELEASE.2024-06-24T19-40-33Z (commit-id=3548007d5bd1ba8e2ed98f35d95c4776dd9a7ff9)
Runtime: go1.22.4 windows/amd64
Copyright (c) 2015-2024 MinIO, Inc.
License GNU AGPLv3 <https://www.gnu.org/licenses/agpl-3.0.html>

System information

mc ls --debug less3
mc.exe: <DEBUG> GET / HTTP/1.1
Host: localhost:9000
User-Agent: MinIO (windows; amd64) minio-go/v7.0.72 mc.exe/RELEASE.2024-06-24T19-40-33Z
Accept-Encoding: zstd,gzip
Authorization: AWS **REDACTED**:**REDACTED**
Date: Fri, 28 Jun 2024 15:19:54 GMT

mc.exe: <DEBUG> HTTP/1.1 200 OK
Content-Length: 482
Accept: */*
Accept-Charset: ISO-8859-1, utf-8
Accept-Language: en-US, en
Access-Control-Allow-Headers: *
Access-Control-Allow-Methods: OPTIONS, HEAD, GET, PUT, POST, DELETE
Access-Control-Allow-Origin: *
Connection: keep-alive
Content-Type: application/xml
Date: Fri, 28 Jun 2024 15:19:54 GMT
Server: nginx/1.18.0 (Ubuntu)
X-Amz-Id-2: acd28d0d-d9ed-4ad5-a9a5-3990c38031ba
X-Amz-Request-Id: acd28d0d-d9ed-4ad5-a9a5-3990c38031ba

mc.exe: <DEBUG> Response Time: 24.8508ms

mc.exe: <ERROR> Unable to list folder. xml: encoding "iso-8859-1" declared but Decoder.CharsetReader is nil
 (1) github.com/minio/mc/cmd/ls.go:239 cmd.doList(..) Tags: [http://localhost:9000/]
 (0) github.com/minio/mc/cmd/client-s3.go:2357 cmd.(*S3Client).listInRoutine(..)
 Release-Tag:RELEASE.2024-06-24T19-40-33Z | Commit:3548007d5bd1 | Host:MYPC | OS:windows | Arch:amd64 | Lang:go1.22.4 | Mem:4.0 MiB/18 MiB | Heap:4.0 MiB/12 MiB
marktheunissen commented 4 months ago

Hi @CGsama it's unlikely we will start supporting this, as you've correctly spotted the issue is that the server (Less3) is sending this header: Accept-Charset: ISO-8859-1, utf-8.

Go's stdlib by default only accepts utf8.

// CharsetReader, if non-nil, defines a function to generate
// charset-conversion readers, converting from the provided
// non-UTF-8 charset into UTF-8. If CharsetReader is nil or
// returns an error, parsing stops with an error. One of the
// CharsetReader's result values must be non-nil.
CharsetReader func(charset string, input io.Reader) (io.Reader, error)

You've already opened an issue with Less3: https://github.com/jchristn/Less3/issues/9

They just need to remove this header from the response, Amazon S3 does not send this header. We are compatible with S3, if they also want to be compatible then they must use the same responses as S3 has.

marktheunissen commented 4 months ago

Or maybe the header is being set by your nginx server configuration? Server: nginx/1.18.0 (Ubuntu)

marktheunissen commented 4 months ago

Another thing to get them to look at is the XML response, it should be <?xml version="1.0" encoding="UTF-8"?>

CGsama commented 4 months ago

Hi @CGsama it's unlikely we will start supporting this, as you've correctly spotted the issue is that the server (Less3) is sending this header: Accept-Charset: ISO-8859-1, utf-8.

Go's stdlib by default only accepts utf8.

// CharsetReader, if non-nil, defines a function to generate
// charset-conversion readers, converting from the provided
// non-UTF-8 charset into UTF-8. If CharsetReader is nil or
// returns an error, parsing stops with an error. One of the
// CharsetReader's result values must be non-nil.
CharsetReader func(charset string, input io.Reader) (io.Reader, error)

You've already opened an issue with Less3: jchristn/Less3#9

They just need to remove this header from the response, Amazon S3 does not send this header. We are compatible with S3, if they also want to be compatible then they must use the same responses as S3 has.

Hi Mark, thank you very much for the advice! I tries nginx to remove the header but not work, after wireshark I have success locate the issue is the REST response at https://github.com/jchristn/S3Server/blob/253c325de052eaa3ef6d5c97783e0a604860cb80/src/S3Server/SerializationHelper.cs#L174 So that it force ISO-8859-1 in XML encoding field. Change it to UTF8 solved it!

klauspost commented 4 months ago

Pretty much all xml decoding would have to be rewritten in minio-go to something like:

    d := xml.NewDecoder(resp.Body)
    d.CharsetReader = func(cs string, input io.Reader) (io.Reader, error) {
        enc, _ := charset.Lookup(cs)
        if enc == nil {
            return nil, fmt.Errorf("unsupported charset: %q", cs)
        }
        return enc.NewDecoder().Reader(input), nil
    }

Given this isn't a more prevalent issue, I don't see a big need for this currently.