saltstack / salt

Software to automate the management and configuration of any infrastructure or application at scale. Get access to the Salt software package repository here:
https://repo.saltproject.io/
Apache License 2.0
14.11k stars 5.47k forks source link

[BUG] cp.get_file returns files with zero length #63837

Open onlineque opened 1 year ago

onlineque commented 1 year ago

Description I have a file, which is in fact a shell script with a binary blob attached to it. That blob begins at about 20KB, right after the script itself (which is a shell script, which means it is a text). When trying to run: salt '' cp.get_file salt://myfile /tmp/myfile

I am getting a file with zero length as the result in /tmp/myfile. I am using pygit2 as the fileserver backend.

Looks like the issue comes from salt/utils/gitfs.py here, line3054:

  3029      def serve_file(self, load, fnd):
  3030          """
  3031          Return a chunk from a file based on the data received
  3032          """
  3033          if "env" in load:
  3034              # "env" is not supported; Use "saltenv".
  3035              load.pop("env")
  3036  
  3037          ret = {"data": "", "dest": ""}
  3038          required_load_keys = {"path", "loc", "saltenv"}
  3039          if not all(x in load for x in required_load_keys):
  3040              log.debug(
  3041                  "Not all of the required keys present in payload. Missing: %s",
  3042                  ", ".join(required_load_keys.difference(load)),
  3043              )
  3044              return ret
  3045          if not fnd["path"]:
  3046              return ret
  3047          ret["dest"] = fnd["rel"]
  3048          gzip = load.get("gzip", None)
  3049          fpath = os.path.normpath(fnd["path"])
  3050          with salt.utils.files.fopen(fpath, "rb") as fp_:
  3051              fp_.seek(load["loc"])
  3052              data = fp_.read(self.opts["file_buffer_size"])
  3053              if data and not salt.utils.files.is_binary(fpath):
  3054                  data = data.decode(__salt_system_encoding__)
  3055              if gzip and data:
  3056                  data = salt.utils.gzip_util.compress(data, gzip)
  3057                  ret["gzip"] = gzip
  3058              ret["data"] = data
  3059          return ret

Seems in my case the file is wrongly guessed to be text, but in fact, it's not, causing this decode on line 3054 to fail on an exception., resulting in the destination file to be created with zero length and no data.

What I propose is to write that part in slightly different way:

  3050          with salt.utils.files.fopen(fpath, "rb") as fp_:
  3051              fp_.seek(load["loc"])
  3052              data = fp_.read(self.opts["file_buffer_size"])
  3053              if data and not salt.utils.files.is_binary(fpath):
  3054                  try:
  3055                      data = data.decode(__salt_system_encoding__)
  3056                  except UnicodeDecodeError:
  3057                      pass
  3058              if gzip and data:
  3059                  data = salt.utils.gzip_util.compress(data, gzip)
  3060                  ret["gzip"] = gzip
  3061              ret["data"] = data
  3062          return ret

so even wrong guess will not crash it, resulting in file being correctly transferred to destination.

Setup

Steps to Reproduce the behavior Create a text file which is of size more than 2048 bytes then append a binary one right after it. Such a way vendors are distributing installers for their products in a single file, which looks like a shell script, but contains the actual software bundled in the very same file. Then put it on the Salt fileserver which has its backend set to pygit2 and run: salt '' cp.get_file salt://myfile /tmp/myfile

You will receive file of a zero length with no content. Salt master log will be spammed with the following error:

2023-03-07 23:00:24,755 [salt.master      :1916][ERROR   ][1382964] Error in function _serve_file:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/salt/master.py", line 1910, in run_func
    ret = getattr(self, func)(load)
  File "/usr/lib/python3.6/site-packages/salt/fileserver/__init__.py", line 629, in serve_file
    return self.servers[fstr](load, fnd)
  File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
    return self.loader.run(run_func, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1228, in run
    return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
    return callable(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/salt/loader/lazy.py", line 1243, in _run_as
    return _func_or_method(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/salt/fileserver/gitfs.py", line 176, in serve_file
    return _gitfs().serve_file(load, fnd)
  File "/usr/lib/python3.6/site-packages/salt/utils/gitfs.py", line 3054, in serve_file
    data = data.decode(__salt_system_encoding__)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 23076: invalid start byte

Expected behavior I expect the file being transferred correctly under any circumstances.

Screenshots

Versions Report

salt --versions-report (Provided by running salt --versions-report. Please also mention any differences in master/minion versions.) ```yaml Salt Version: Salt: 3005.1 Dependency Versions: cffi: 1.11.5 cherrypy: unknown dateutil: 2.6.1 docker-py: Not Installed gitdb: Not Installed gitpython: Not Installed Jinja2: 2.10.1 libgit2: 0.26.8 M2Crypto: 0.35.2 Mako: Not Installed msgpack: 0.6.2 msgpack-pure: Not Installed mysql-python: Not Installed pycparser: 2.14 pycrypto: Not Installed pycryptodome: Not Installed pygit2: 0.26.4 Python: 3.6.8 (default, Nov 8 2022, 11:32:15) python-gnupg: Not Installed PyYAML: 3.12 PyZMQ: 20.0.0 smmap: Not Installed timelib: Not Installed Tornado: 4.5.3 ZMQ: 4.3.4 System Versions: dist: rocky 8.7 Green Obsidian locale: UTF-8 machine: x86_64 release: 4.18.0-425.10.1.el8_7.x86_64 system: Linux version: Rocky Linux 8.7 Green Obsidian ```

OrangeDog commented 1 year ago

Has the binary attribute been set on the file in your repo?

Either way, it wouldn't hurt for Salt to check for it rather than trying to guess.

onlineque commented 1 year ago

I have tried with binary set in .gitattributes for the given file, but no luck, it's still the same. Zero length. I've marked all my .sh files as binary in .gitattributes this way:

*.sh binary
Ch3LL commented 1 year ago

@onlineque would you mind submitting your fix as a PR alongside test coverage?

onlineque commented 1 year ago

I'd be happy to do that ! :-)