dolthub / driver

Golang database/sql driver for Dolt
Apache License 2.0
22 stars 5 forks source link

Multi statement mode doesn't handle procedures correctly #28

Closed arvidfm closed 2 months ago

arvidfm commented 3 months ago

The driver naively splits multi-statement queries by ;, which doesn't match how Vitess and GMS process the queries: In particular, procedure definitions are not parsed as a single statement, and DELIMITER statements are not respected.

Additionally, the driver processes multi-statement queries by executing the statements directly in Prepare, which is unexpected behaviour.

Demonstration of the issue; the below yields the error Error 1105: syntax error at position 42 near '1' when running the last query, but works correctly if connecting to a Dolt server using the MySQL driver:

package main

import (
    "context"
    "database/sql"
    "net/url"
    "os"

    _ "github.com/dolthub/driver"
)

func main() {
    dir := must(os.MkdirTemp("", "dolt-test-db-*"))
    defer os.RemoveAll(dir)

    params := url.Values{
        "database": []string{"testdb"}, "multistatements": []string{"true"},
        "commitname": []string{"a"}, "commitemail": []string{"a@a.com"}}
    dsn := url.URL{Scheme: "file", Path: dir, RawQuery: params.Encode()}
    db := must(sql.Open("dolt", dsn.String()))

    ctx := context.Background()
    must(db.ExecContext(ctx, "CREATE DATABASE testdb"))
    must(db.ExecContext(ctx, "COMMIT")) // workaround for implicit transactions not being committed

    must(db.ExecContext(ctx, `
CREATE PROCEDURE myproc()
BEGIN
    SELECT 1;
END;

CALL myproc();
`))
}

func must[T any](t T, err error) T {
    if err != nil {
        panic(err)
    }
    return t
}

I had a quick look at it myself, but it wasn't obvious what the best approach would be - perhaps the easiest would be to implement ExecContext and QueryContext and move the multi-statement processing there, maybe using MysqlParser from GMS to split the queries and calling QueryWithBindings for each, if prepared multi-statement queries aren't a requirement? (They don't seem to be supported by MySQL anyway.) Would be nice if the result also implemented driver.RowsNextResultSet.

jycor commented 3 months ago

Hey @arvidfm, looks like using the MySQLParser worked out nicely for properly parsing stored procedures and triggers. The PR for that has been merged to main.

I'm currently working on implementing RowsNextResultSet and will keep you posted.

fulghum commented 3 months ago

Hi @arvidfm – James is out of the office for some time off for the next few weeks, so I'm picking up the rest of this one. I've got a very rough prototype working that adds support for the driver.RowsNextResultSet interface and allows you to iterate through multiple result sets. I still need to do a lot of cleanup and testing, but wanted to let you know this work is moving forward. I'll keep you updated and I'm hoping to get a PR out by the end of this week (we're off on holiday on Thursday, btw).

Thanks for the great feature request on this one!

arvidfm commented 3 months ago

@fulghum Thanks for the update! Looking forward to it, it'll make Dolt integration a lot easier for us. Let me know if/once you have something you'd like me to try out and I'd be happy to jump on it.

fulghum commented 2 months ago

Hi @arvidfm, sorry for the delay on the update here... as I did more testing with my initial implementation, I realized there were some behavioral differences between it and the MySQL driver implementation, so I've spent some time digging in to the MySQL driver and I've pushed a new implementation that should behave more similarly.

I'm still wrapping up some new tests for some of the MySQL driver behavior I found and I'm cleaning up some sample code, but if you'd like to give this implementation an early test drive, you can pull it from the fulghum/dev branch. If you do happen to get time to try it out, feel free to send along any feedback or issues you have and I'll be happy to dig in.

arvidfm commented 2 months ago

@fulghum I gave it a spin, and from my preliminary testing it seems to work perfectly with our migrations, with the exception of one minor issue: If the query string contains a comment that's not followed by any further statements, you get an empty statement error:

SELECT 1;
# comment not followed by statements

This doesn't seem to be an issue when connecting to a Dolt server using the MySQL driver.

fulghum commented 2 months ago

Awesome! Thanks for such a fast turn around on some early testing help. 🙏 Good find on the case with the trailing comment. I will add that to my testing today and get that case working, too.

fulghum commented 2 months ago

Yesterday I added support for skipping over empty statements and some test coverage for that, too. I added a bunch more tests, and also hooked up the tests so that they can run against a MySQL DB using the MySQL driver, so that we can easily test that the Dolt driver has the same behavior as the MySQL driver.

There was one last behavior difference that was bugging me... the MySQL driver will automatically skip over non-query result sets (e.g. if a multi-statement query contains an INSERT statement or DDL). I added support for that to the Dolt driver today and asked a teammate for a review.

When you get a chance, feel free to try out the latest code on the fulghum/dev branch and let me know if you find any more issues. Once I get it through review, we'll release a new version, with a minor version bump, since there is a behavior change from previous versions (previous multistatement queries would return only the last result set, and now we start with the first query result set and customers will have to use NextResultSet() to iterate through them).

Thanks again for the great feature request here. This was a fun one to work on, and will make the Dolt driver easier to use and more compatible with the MySQL driver.

arvidfm commented 2 months ago

@fulghum Gave it a whirl, I also had it run against a MySQL server using the MySQL driver in tandem and compared the results (number of result sets returned, number of results, column names, result values etc), and everything seems to work perfectly!

The only difference my comparison program even picked up is that the MySQL driver apparently returns strings as []byte while the Dolt driver returns strings as string, but either way they scan into both []byte and string variables just fine, so that's not an issue. Might be good at some point to do a comprehensive comparison of the drivers on every single data type supported by MySQL and verify that they're passed as the same type from the driver, but that's most definitely not within the scope of this issue. As far as I'm concerned, it all looks good!

fulghum commented 2 months ago

Awesome! 🎉 Thanks again for the help with the acceptance testing.

I've merged in support for querying multiple result sets through the driver.RowsNextResultSet interface and created the 0.2.0 release of the driver.

Feel free to let us know if you hit any snags or have other feature requests for us. Thanks for being a Dolt customer! 🙏

arvidfm commented 2 months ago

Amazing, thank you so much!