rogerbinns / apsw

Another Python SQLite wrapper
https://rogerbinns.github.io/apsw/

Output JSON doesn't handle 0 byte objects #482

Closed by fzakaria 9 months ago

fzakaria commented 10 months ago

I have the following table:

❯ sqlelf /usr/bin/ruby --sql "select * from elf_sections;"
┌───────────────┬────────────────────┬────────┬──────┬─────────────┬───────────────┐
│     path      │        name        │ offset │ size │    type     │    content    │
│ /usr/bin/ruby │                    │ 0      │ 0    │ NULL        │ [ 0 bytes ]   │
│ /usr/bin/ruby │ .interp            │ 792    │ 28   │ PROGBITS    │ [ 28 bytes ]  │

Hitting this issue:

❯ sqlelf /usr/bin/ruby --sql ".mode json" --sql "select * from elf_sections;"
Traceback (most recent call last):
  File "/nix/store/z6s3zv40jypbf3a1p361l94m1n8k15ay-python3.10-poetry2nix-env-scripts/bin/.sqlelf-wrapped", line 13, in <module>
    sys.exit(start())
  File "/usr/local/google/home/fmzakari/code/github.com/fzakaria/sqlelf/sqlelf/cli.py", line 87, in start
    shell.process_complete_line(sql)
  File "/nix/store/d5fyfc6p83l4zf4kcxifwkakzjz6800d-python3-3.10.12-env/lib/python3.10/site-packages/apsw/shell.py", line 2960, in process_complete_line
    self.process_sql(command)
  File "/nix/store/d5fyfc6p83l4zf4kcxifwkakzjz6800d-python3-3.10.12-env/lib/python3.10/site-packages/apsw/shell.py", line 976, in process_sql
    self.output(False, row)
  File "/nix/store/d5fyfc6p83l4zf4kcxifwkakzjz6800d-python3-3.10.12-env/lib/python3.10/site-packages/apsw/shell.py", line 641, in output_json
    out = ["%s: %s" % (self._fmt_json_value(k), fmt(line[i])) for i, k in enumerate(self._output_json_cols)]
  File "/nix/store/d5fyfc6p83l4zf4kcxifwkakzjz6800d-python3-3.10.12-env/lib/python3.10/site-packages/apsw/shell.py", line 641, in <listcomp>
    out = ["%s: %s" % (self._fmt_json_value(k), fmt(line[i])) for i, k in enumerate(self._output_json_cols)]
  File "/nix/store/d5fyfc6p83l4zf4kcxifwkakzjz6800d-python3-3.10.12-env/lib/python3.10/site-packages/apsw/shell.py", line 640, in <lambda>
    fmt = lambda x: self.colour.colour_value(x, self._fmt_json_value(x))
  File "/nix/store/d5fyfc6p83l4zf4kcxifwkakzjz6800d-python3-3.10.12-env/lib/python3.10/site-packages/apsw/shell.py", line 440, in _fmt_json_value
    if o[-1] == "\n":
IndexError: string index out of range

Here is the line of code causing the problem:

        elif isinstance(v, bytes):
            o = base64.encodebytes(v).decode("ascii")
            if o[-1] == "\n":
                o = o[:-1]
            return '"' + o + '"'

I think the problem is that encoding 0 bytes produces an empty string, so accessing its last character with o[-1] raises the IndexError. The code should probably guard against the empty case.
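
To make the failure concrete, here is a minimal sketch that reproduces it outside the shell and shows one possible guard. The function names are hypothetical and are not part of apsw; only the body of the first function mirrors the snippet quoted above.

```python
import base64


def fmt_bytes_unguarded(v: bytes) -> str:
    """Mirrors the quoted snippet: indexes the last character unconditionally."""
    o = base64.encodebytes(v).decode("ascii")
    if o[-1] == "\n":  # raises IndexError when o is the empty string
        o = o[:-1]
    return '"' + o + '"'


def fmt_bytes_guarded(v: bytes) -> str:
    """Hypothetical guard: str.endswith is safe on an empty string."""
    o = base64.encodebytes(v).decode("ascii")
    if o.endswith("\n"):
        o = o[:-1]
    return '"' + o + '"'


if __name__ == "__main__":
    # base64.encodebytes(b"") returns b"", so there is no last character.
    try:
        fmt_bytes_unguarded(b"")
    except IndexError as exc:
        print("unguarded:", exc)
    print("guarded:", fmt_bytes_guarded(b""))
```

Using endswith rather than indexing avoids a separate length check, since it simply returns False for an empty string.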

Let me know if you want me to take a stab at the PR.

rogerbinns commented 10 months ago

Some random old version of Python used to append newlines to the base64 output. I think base64 requires newline termination. The json module rejects binary data. The code was also constructing the JSON manually, because it was written before Python had a json module! In any event, I have a commit that strips whitespace from the base64 output and uses the json module to encode all other values. The ticket remains open because it still needs test cases.
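
The approach described in this comment could look roughly like the sketch below. This is not the actual commit: the function name fmt_json_value is hypothetical, and removing all whitespace (rather than only trailing whitespace) is an assumption made here so that long blobs, which base64.encodebytes wraps across several lines, still come out as a single-line string.

```python
import base64
import json


def fmt_json_value(v) -> str:
    """Hypothetical formatter: base64 for bytes, json.dumps for everything else."""
    if isinstance(v, bytes):
        encoded = base64.encodebytes(v).decode("ascii")
        # Assumption: drop every whitespace character, including the newlines
        # that encodebytes inserts after each 76 characters of output.
        encoded = "".join(encoded.split())
        return json.dumps(encoded)
    # None, int, float and str are handled by the json module directly.
    return json.dumps(v)


if __name__ == "__main__":
    for value in (b"", b"abc", None, 3, 'say "hi"'):
        print(repr(value), "->", fmt_json_value(value))
```

Because the empty bytes object encodes to an empty string, no indexing is involved and the 0 byte case yields an empty JSON string.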