canonical / cloud-init

Official upstream for the cloud-init: cloud instance initialization
https://cloud-init.io/

[DataSourceLXD] Wrongly decodes response containing special char #5300

Closed phvalguima closed 5 months ago

phvalguima commented 5 months ago

Bug report

Running: Ubuntu 22.04 LXD LXC created with 22.04 image

Juju passes snap-assertions as part of the cloud-init user-data whenever snap-store-proxy is in use. Some of these assertions contain a display-name field, which holds the name a given user registered in the upstream Charmhub / Snap Store.

In my case, I registered "Guimarães", which contains an "ã". The DataSourceLXD class decodes it wrongly because it leaves the choice of encoding to requests: the response declares no encoding, so requests falls back to apparent_encoding, which calls chardet and settles on ISO-8859-1 instead of UTF-8.

Using UTF-8 as encoding resolves the problem.
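
The effect can be shown without LXD or requests at all. The following is a minimal sketch using only Python's built-in codecs; the variable names are illustrative and not taken from cloud-init. It decodes the same UTF-8 bytes once as UTF-8 and once as ISO-8859-1.

```python
# UTF-8 bytes for the display name used in this report.
raw = "Guimarães".encode("utf-8")

# Decoding as UTF-8 restores the original string.
as_utf8 = raw.decode("utf-8")

# Decoding as ISO-8859-1 maps every byte to one character, so the
# two-byte UTF-8 sequence for "ã" (0xC3 0xA3) becomes two characters.
as_latin1 = raw.decode("iso-8859-1")

print(as_utf8)    # Guimarães
print(as_latin1)  # GuimarÃ£es
```

ISO-8859-1 assigns a character to every byte value, so this decode never raises an error, and the corruption goes unnoticed until someone reads the output.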

(Pdb) l
377                 cfg_key = config_route.rpartition("/")[-1]
378                 # Leave raw data values/format unchanged to represent it in
379                 # instance-data.json for cloud-init query or jinja template
380                 # use.
381  ->             config["config"][cfg_key] = config_route_response.text
382                 # Promote common CONFIG_KEY_ALIASES to top-level keys.
383                 if cfg_key in CONFIG_KEY_ALIASES:
384                     # Due to sort of config_routes, promote cloud-init.*
385                     # aliases before user.*. This allows user.* keys to act as
386                     # fallback config on old LXD, with new cloud-init images.

(Pdb) p config_route_response.encoding
None
(Pdb) p config_route_response.apparent_encoding
'ISO-8859-1'
(Pdb) bt
  /usr/lib/python3/dist-packages/cloudinit/sources/DataSourceLXD.py(480)<module>()
-> atomic_helper.json_dumps(read_metadata(metadata_keys=MetaDataKeys.ALL))
  /usr/lib/python3/dist-packages/cloudinit/sources/DataSourceLXD.py(454)read_metadata()
-> return _MetaDataReader(api_version=api_version)(
  /usr/lib/python3/dist-packages/cloudinit/sources/DataSourceLXD.py(410)__call__()
-> md.update(self._process_config(session))
> /usr/lib/python3/dist-packages/cloudinit/sources/DataSourceLXD.py(383)_process_config()
-> if cfg_key in CONFIG_KEY_ALIASES:

Setting:

(Pdb) config_route_response.encoding="utf-8"

Resolves the problem and the string is correctly decoded.
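
An alternative to setting the encoding on every response is to decode the raw bytes explicitly. The sketch below assumes the LXD config values are always UTF-8, as the curl output suggests. `FakeResponse` and `decode_config_value` are hypothetical names for illustration, and this is not necessarily the change that was made upstream. In requests, `Response.content` holds the undecoded bytes, while `Response.text` decodes them using `encoding` and falls back to `apparent_encoding` when that is None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeResponse:
    """Hypothetical stand-in for requests.Response (content and encoding only)."""

    content: bytes
    encoding: Optional[str] = None


def decode_config_value(response) -> str:
    """Decode a config route body as UTF-8 instead of guessing the charset."""
    # Assumption: LXD serves config values as UTF-8.
    return response.content.decode("utf-8")


resp = FakeResponse(content="display-name: Guimarães\n".encode("utf-8"))
value = decode_config_value(resp)
print(value)
```

Reading `content` avoids charset detection altogether, so the result does not depend on which detection library is installed.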

Expected Result

I expect the same result as curl:

curl --unix-socket /dev/lxd/sock  http://lxd/1.0/config/user.user-data

... Guimarães
...

Actual Result

... GuimarÃ£es
...
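
To check whether a rendered string such as "GuimarÃ£es" really is UTF-8 that was read as ISO-8859-1, the decoding can be reversed. This is a diagnostic sketch only. `repair_utf8_mojibake` is a hypothetical helper, not part of cloud-init.

```python
def repair_utf8_mojibake(text: str) -> str:
    """Undo a UTF-8-read-as-ISO-8859-1 decode; return text unchanged otherwise."""
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode("iso-8859-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside ISO-8859-1, or the
        # recovered bytes are not valid UTF-8: assume it was fine already.
        return text


print(repair_utf8_mojibake("GuimarÃ£es"))
```

A correctly decoded "Guimarães" passes through unchanged, because its ISO-8859-1 bytes do not form valid UTF-8.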

Steps to reproduce the problem

Create user-data that contains a special character (see my comments above). For example, have it write the string to a file (in my case, /etc/snap.assertions).

Once cloud-init finishes executing, the file it rendered contains the wrongly decoded string.

Environment details

# python3 -m requests.help
{
  "chardet": {
    "version": "4.0.0"
  },
  "cryptography": {
    "version": "3.4.8"
  },
  "idna": {
    "version": "3.3"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.10.12"
  },
  "platform": {
    "release": "6.5.0-1020-aws",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "30000020",
    "version": "21.0.0"
  },
  "requests": {
    "version": "2.25.1"
  },
  "system_ssl": {
    "version": "30000020"
  },
  "urllib3": {
    "version": "1.26.5"
  },
  "using_pyopenssl": true
}
root@juju-4996a2-7:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.4 LTS
Release:    22.04
Codename:   jammy
root@juju-4996a2-7:~# uname -r
6.5.0-1020-aws
# lxd --version
5.0.3

TheRealFalcon commented 5 months ago

Thanks for the detailed bug report! I'll have a PR up shortly.