ncruces / go-sqlite3

Go bindings to SQLite using wazero
https://pkg.go.dev/github.com/ncruces/go-sqlite3
MIT License
499 stars 16 forks source link

panic: sqlite3: missing NUL terminator #45

Closed Danlock closed 10 months ago

Danlock commented 10 months ago

Using the array extension, I got a panic when stepping through a statement.

panic: sqlite3: missing NUL terminator

goroutine 1 [running]:
github.com/ncruces/go-sqlite3/internal/util.ReadString({0x7a1470?, 0xc0000b0f00?}, 0x31560, 0x200)
        /home/dan/go/src/github.com/danlock/searmage/vendor/github.com/ncruces/go-sqlite3/internal/util/mem.go:119 +0x11c
github.com/ncruces/go-sqlite3.(*sqlite).error(0xc000139180, 0x14, 0x27410, {0x0, 0x0, 0x1?})
        /home/dan/go/src/github.com/danlock/searmage/vendor/github.com/ncruces/go-sqlite3/sqlite.go:129 +0x18e
github.com/ncruces/go-sqlite3.(*Conn).error(...)
        /home/dan/go/src/github.com/danlock/searmage/vendor/github.com/ncruces/go-sqlite3/conn.go:294
github.com/ncruces/go-sqlite3.(*Stmt).Step(0xc0000ac120)
        /home/dan/go/src/github.com/danlock/searmage/vendor/github.com/ncruces/go-sqlite3/stmt.go:89 +0xf3
github.com/danlock/searmage/db.FilterParsedImages(0x7ffc0a70d339?, {0xc0000b2c00, 0x18, 0x20})
        /home/dan/go/src/github.com/danlock/searmage/db/db.go:48 +0x186
github.com/danlock/searmage/ocr.Parse({0x79e450, 0xc00002a180}, {{0x7ffc0a70d339, 0x13}, 0xc000202420, {0xc0000a6018, 0x15}, 0x0, {0x0, 0x0}})
        /home/dan/go/src/github.com/danlock/searmage/ocr/ocr.go:33 +0xcb
main.main()
        /home/dan/go/src/github.com/danlock/searmage/cmd/searmage/main.go:41 +0x265

The offending function is here https://github.com/Danlock/searmage/commit/b5cda984db9ddf1064e6328746b558b46a2d3450#diff-30c25f2a7d8ae335d2aada4b0a6013ad9d159595bb6a2a14b1d0022451b6753fR31.

Note that I am actually using stmt.BindPointer here, not sure what else would need to be done.

Danlock commented 10 months ago

It may actually be an error reading in the SQL error string, strangely enough. github.com/ncruces/go-sqlite3.(*sqlite).error(0xc000139180, 0x14, 0x27410, {0x0, 0x0, 0x1?}) /home/dan/go/src/github.com/danlock/searmage/vendor/github.com/ncruces/go-sqlite3/sqlite.go:129 +0x18e

        if r := sqlt.call("sqlite3_errmsg", uint64(handle)); r != 0 {
            err.msg = util.ReadString(sqlt.mod, uint32(r), _MAX_NAME)
        }
ncruces commented 10 months ago

Yes, the error message is larger than 512 bytes long, which with custom errors (i.e. Go errors from a virtual table, or custom functions) becomes likely. When I wrote that code I assumed SQLite error messages which tend not to be large (100 bytes, at most), so I used the 512 limit.

Remains to be seen if it makes sense to copy these large error messages from Go->C->Go, or if I should truncate the string somewhere. But a panic just hides the underlying error, so that I need to fix.

If you want to collect more info about the underlying error, changing _MAX_NAME to something like 65536 (64K) should help, and be harmless.

Thanks!

Danlock commented 10 months ago

One question. Isn't the entire WASM memory already completely inside Go's memory?

Why use a maxLen in util.ReadString to cap the mem.Read(), when mem.Read is returning a slice, just a view on top of the WASM memory, and not even copying it?

Essentially, why not just mem.Read(ptr, mem.Size()), and error out only if you can't find the null byte?

Danlock commented 10 months ago

Also I truncated the array I was passing into the statement via sqlite3.Pointer, and it was able to print the error.

2023/12/19 19:30:57 ERROR ocr.Parse db.FilterParsedImages stmt.Err path query sqlite3: datatype mismatch: array: unsupported argument: {[/home/dan/Pictures/aksamqperrorsjuly11.png /home/dan/Pictures/aws_perm_error.png]}

so it seems like you're right and with the array extension the entire array gets copied into the error string. That will easily blow past 512 bytes.

Are there really no other SQLite error messages that include parts of the client query within the returned error message?

ncruces commented 10 months ago

Interesting. Several points:

  1. The array should not be copied, just the type name. That's bad for privacy in logs, etc.
  2. SQLite error messages tend to be short and mostly fixed/constant/literals. When you see SQL in error messages the message from SQLite is actually just (e.g.) "parse error", but then SQLite provides an offset into the SQL, and we print "parse error @" and then concatenate the SQL in our end.
  3. In C these APIs search for the NUL, yes, but only when the string is ultimately used (e.g. printed), and they (mostly) don't involve copies. We need to look for NUL and copy the string several times, from C to Go, convert between byte slices and strings and errors.

There's various things to fix. This should not panic. Limits should be higher. But I also want to avoid a situation where some less trustworthy SQL may cause multiple GBs of data to be allocated and copied repeatedly, putting pressure on the GC, etc. This is also used for things like column names, declared types, things that you call potentially millions of times in a long running process and which should really be cheap.

Danlock commented 10 months ago

Great. I don't get a panic anymore. However, I am still getting this error,

2023/12/21 13:32:08 ERROR ocr.Parse db.FilterParsedImages stmt.Err path query sqlite3: datatype mismatch: array: unsupported argument: sqlite3.pointer[[]string]

Why is a []string an unsupported type? I also don't see it handled in the switch statement at https://github.com/ncruces/go-sqlite3/blob/main/ext/array/array.go#L64.

Or if the issue is on my end, how is this code wrong?

func FilterParsedImages(db *sqlite3.Conn, images []string) ([]string, error) {
    stmt, _, err := db.Prepare(`
        SELECT path FROM images
        WHERE path IN array(?)
    `)
    if err != nil {
        return images, errors.Errorf("DB.Prepare path query %w", err)
    }

    err = stmt.BindPointer(1, sqlite3.Pointer(images))
    if err != nil {
        return images, errors.Errorf("stmt.BindPointer path query %w", err)
    }
    ... etc
}

If I pass in &images instead of sqlite3.Pointer(images), I get another error.

Danlock commented 10 months ago

OK, nevermind, I spoke too soon. I realized if I just pass in the []string directly instead of wrapping it with sqlite3.Pointer, it works fine. The array.Register comment mentions that it must be wrapped with a sqlite3.Pointer, but since it works without, this is just a minor doc issue.

ncruces commented 10 months ago

Ah, yes, I get the confusion.

The extension works with a slice (like []int), an array (like [4]int, but this is copied, or passed by value) or a pointer to an array (like *int[4]). That's what you should bind with BindPointer().

If you use the database/sql driver, you need to wrap those in an sqlite3.Pointer to pass them to Exec(), Query(), QueryRow(), etc. That's because you need to tell the driver this is a “pointer” to bind with BindPointer(). You could also bind the same array as (e.g.) JSON using sqlite.JSON, which would call BindJSON().

I'll try to improve the documentation.