kolo / xmlrpc

Implementation of XMLRPC protocol in Go language.
MIT License
159 stars 94 forks source link

replace illegal character code #86

Closed Just-Like closed 4 months ago

Just-Like commented 4 months ago

If the data contains illegal character codes like U+001B, the unmarshalling process will fail image so they should be replaced.

icholy commented 4 months ago

Sorry, the existing behaviour is correct. If the library is fed malformed/invalid data, it should error.

Just-Like commented 4 months ago

Sorry, the existing behaviour is correct. If the library is fed malformed/invalid data, it should error.

However, the data comes from the server, and the client cannot control the data sent by the server. Additionally, the response received in this packet is a private variable and cannot be externally replaced with illegal strings.

icholy commented 4 months ago

You can set the xmlrpc.CharsetReader variable https://github.com/kolo/xmlrpc/blob/a4b6fa1dd06bbefa509944742c219846044ed934/decoder.go#L25

Just-Like commented 4 months ago

You can set the xmlrpc.CharsetReader variable

https://github.com/kolo/xmlrpc/blob/a4b6fa1dd06bbefa509944742c219846044ed934/decoder.go#L25

https://cs.opensource.google/go/go/+/master:src/encoding/xml/xml.go;l=637;drc=32014d549609422748dcb698fef1d43a5a33b0b4 I checked the XML source code, and the CharsetReader method is only useful for parsing XML that is not UTF-8. However, the XML I need to parse is UTF-8; it just contains illegal characters. So even if I define a CharsetReader variable, it won't be effective. I still need to replace the illegal characters.

icholy commented 4 months ago

I tried to take a look but your tests pass without the changes you're proposing.

icholy commented 4 months ago

You're right about the CharsetReader though. You'll have to create a custom http.RoundTripper and pass that to xmlrpc.NewClient.

type ReplaceBodyTransport struct {
    Transport http.RoundTripper
    Replace   func(body []byte) []byte
}

func (t *ReplaceBodyTransport) RoundTrip(req *http.Request) (*http.Response, error) {
    tr := t.Transport
    if tr == nil {
        tr = http.DefaultTransport
    }
    res, err := tr.RoundTrip(req)
    if err != nil || res.StatusCode != http.StatusOK {
        return res, err
    }
    body, err := io.ReadAll(res.Body)
    if err != nil {
        return nil, err
    }
    res.Body = io.NopCloser(bytes.NewReader(t.Replace(body)))
    return res, err
}
Just-Like commented 4 months ago

You're right about the CharsetReader though. You'll have to create a custom http.RoundTripper and pass that to xmlrpc.NewClient.

type ReplaceBodyTransport struct {
  Transport http.RoundTripper
  Replace   func(body []byte) []byte
}

func (t *ReplaceBodyTransport) RoundTrip(req *http.Request) (*http.Response, error) {
  tr := t.Transport
  if tr == nil {
      tr = http.DefaultTransport
  }
  res, err := tr.RoundTrip(req)
  if err != nil || res.StatusCode != http.StatusOK {
      return nil, err
  }
  body, err := io.ReadAll(res.Body)
  if err != nil {
      return nil, err
  }
  res.Body = io.NopCloser(bytes.NewReader(t.Replace(body)))
  return res, err
}

This approach seems feasible, but it relies on the rpc.Client object, requiring the initiation of a client service. Sometimes, you might only want to parse an XML-RPC response without establishing a server connection. Could you provide a hook function for manipulating the server's response data externally or offer an unmarshal interface to allow clients to define their own unmarshaling logic?

icholy commented 4 months ago

I don't think I understand your usecase. Can you provide some example code of using this library without a client instance? I'm not opposed to the idea of adding a hook.

Just-Like commented 4 months ago

I tried to take a look but your tests pass without the changes you're proposing.

I have pushed the latest code to my repository. Can you take a look? https://github.com/Just-Like/xmlrpc/tree/fix/replaceIllegalCharacterCode

Just-Like commented 4 months ago

I don't think I understand your usecase. Can you provide some example code of using this library without a client instance? I'm not opposed to the idea of adding a hook.

Currently, there is no such scenario. I just feel that allowing unmarshal and marshal as public methods for external invocation, and permitting users to add their own processing logic, can decouple them, making usage more convenient and enhancing extensibility.

icholy commented 4 months ago

There's there Response.Unmarshal method that's publicly available.

However, since this is a pretty niche use-case, I think that the http.RoundTripper approach is good enough. If more people chime in, I'll reconsider adding a hook.

Just-Like commented 4 months ago

ok thanks