golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.88k stars 17.52k forks source link

database/sql: missing escape functions #18478

Open nefthy opened 7 years ago

nefthy commented 7 years ago

What version of Go are you using (go version)?

go version go1.7.4 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64" GOBIN="" GOEXE="" GOHOSTARCH="amd64" GOHOSTOS="linux" GOOS="linux" GOPATH="/home/nefthy/go-test/" GORACE="" GOROOT="/usr/lib/go" GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64" CC="x86_64-pc-linux-gnu-gcc" GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/home/nefthy/go-test/tmp/go-build451484149=/tmp/go-build -gno-record-gcc-switches" CXX="x86_64-pc-linux-gnu-g++" CGO_ENABLED="1"

What did you do?

There are situations when strings need to be escaped in queries that can not be done with placeholders. An example the following queries cannot be expressed with ? placeholders:

SELECT id, ? FROM table
-- Must be escaped as an identifier
SELECT id FROM ?
-- Also identifier quoting
SELECT id FROM table WHERE ? LIKE ?
-- With either the first or second parameter being a column reference

Using Sprintf is no option, since the identifiers need to be properly quoted. The quoting and escaping is inherently vendor specific and may even depend on configuration on a per database/connection basis (hello there MySql...).

What did you expect to see?

The driver must export Quoting which are passed along by the database/sql Api. As far as I can tell the folling functions are needed

What did you see instead?

No escaping/quoting functions

kardianos commented 7 years ago

There isn't anything here that must live in database/sql to be useful. What you are talking about are primitives needed for a query builder. I also build queries programmatically and agree, that QuoteIdentifier and QuoteValue are two functions which are useful in something like this.

If you are interested in this, I would create a new public go package with something like this:

package sqlutil

type Quoter interface {
    QuoteIdentifier(name string) string
    QuoteValue(v interface{}) string // operates on both string and []byte and int or other types.
}

func DBQuote(d driver.Driver) Quoter {
    if q, is := d.(Quoter); is {
        return q
    }
    // TODO: hard-code some known driver types that you want to support
}

At that point you could suggest to drivers that they upstream your Quoter interface.

You would use the Quoter interface like:

var q Quoter = // Get from db type.
sql := fmt.Sprintf(`select %[1]s  from %[2]s where Cats = %[3]s;`,
    QuoteIdentifier("Local ID"), QuoteIdentifier("My Table"),
    QuoteValue("mice's"),
)
// sql = `select [Local ID] from [My Table] where Cats = 'mice''s';` // SQL Server
// sql = `select "Local ID" from "My Table" where Cats = 'mice''s';` // Postgresql

In other words, it would always be a pre-processor step to the actual DB.Query methods. I do agree MySQL is especially bad here.

I can't really see this going into the std library. Have you talked to any sql driver maintainers about this or do you have any issues you can link here?

nefthy commented 7 years ago

I would argue to the contrary. Escaping primitives are needed for working securely with databases and they are inherently dependent on the SQL-server behind the connection. A package using database.sql might not even know what server it is talking to and how to properly quote and escape for that server, if it just gets passed a reference.

As is, database.sql can only be used for queries known ahead of time and under the constraint, that the only dynamic entities in the query are value and never columns (think ORDER BY <user input>) or tables

kardianos commented 7 years ago

database/SQL abstracts the database interface, not the query text. You can build up the query text today with go packages that exist today.

nefthy commented 7 years ago

quoting query text is part of the database interface.

nefthy commented 7 years ago

placeholders are also abstracted. Native postgres uses $<number> not ?, oracle :<name>

dominikh commented 7 years ago

Placeholders are not abstracted. database/sql itself doesn't care at all, it just passes the query to a driver. And most drivers I know of don't abstract it away, either, requiring you to use $1 for PostgreSQL and ? for SQLite.

Similarly, quoting arguments would depend on the specific DBMS.

nefthy commented 7 years ago

So proper quoting is impossible, since neither database.sql nor database.sql.driver export quoting functions and when I am sitting behind those apis my best bet is type-switching the driver to find out what I am talking to and worst case implementing that my self? That does not look solid to me.

bradfitz commented 7 years ago

Maybe the current situation is not ideal, but there have been no solid proposals yet that consider all popular databases. If the driver community wants to put together a proposal, we're all ears.

ghost commented 7 years ago

Perl/DBI does this right. Golang needs to do the same.

bradfitz commented 7 years ago

@caseyallenshobe, see https://golang.org/wiki/NoMeToo

If you have a proposal, please propose.

ghost commented 7 years ago

I didn't say "me too", actually. I referenced a good implementation. Check out DBI and quote_identifier / quote_literal functions. Just like database/sql, DBI leaves platform-specific implementation up to the drivers.

bradfitz commented 7 years ago

That's more helpful. For what it's worth, I lived and breathed Perl and Perl's DBI for a decade+, and that's what inspired my writing the database/sql package.

I'll leave this to @kardianos, who's taken over maintenance of the package.

kardianos commented 7 years ago

I'm generally in favor of doing this. However, I'm not sure if or where in the std repo it would live. For now I've implemented a first pass https://godoc.org/github.com/golang-sql/sqlexp .

nefthy commented 7 years ago

MySQL has configuration options for quoting, so the quoting is also dependent on the connection.

As a user I would expect the function to be on the sql.DB object, since I probably already have a reference to it when I am constructing queries and sql.DB already has all the information to set up the quoting. Thats also where you find them in similar packages for other languages like Perl/DB, PHP/PDO, PHP/Mysqli.

kardianos commented 7 years ago

@nefthy Good point about MySQL especially. I've updated the calls to be network friendly.

You can get the driver from sql.DB.Driver() so while it isn't a direct method from it in this shim, you can use in in a similar way right now.

Here's what I'd like to see:

  1. evolve and verify the API that is in sqlexp,
  2. get a few drivers to implement it (they usually appreciate PRs),
  3. then if it makes sense (which I think it would at that point), bring it into the std lib.

Perhaps it would be good to combine the escaper functions with the database name function. We'd probably want to make it easy to expand it in the future as well, which would mean changing it into a struct with methods.

hareesh-blippar commented 7 years ago

This is especially problematic for batch inserts, since database/sql doesn't currently support batch inserts, and the only way to do them today is by constructing the raw query string.

ghost commented 7 years ago

Likewise for database-side arrays and variable-amount IN clauses.

kardianos commented 7 years ago

@hareesh-blippar Could you give an example of what you mean by batch inserts, like COPY IN or Bulk Copy?

@caseyallenshobe database side arrays will be handled in a different way, by adding support for them.

hareesh-blippar commented 7 years ago

@kardianos For example, INSERT INTO mytable (a,b,c) VALUES (a1,b1,c1), (a2,b2,c2), (a3,b3,c3), ...... (an,bn,cn);

ghost commented 7 years ago

Batch inserts would be inserting multiple rows with a single statement, I believe.

And yes, Go 1.9 promises something better, as does sqlx today. My company is still stuck on Go 1.4.3 and upgrades slowly, so it will be quite a while for us yet...

Forced needs for implementing something to properly quote to overcome some shortcoming aren't the only reason either...just a good example. :)

kardianos commented 7 years ago

@hareesh-blippar I don't understand why database/sql doesn't understand multi-row value insert today. You would need to add those as parameters (which would be great for a little 10 line function to do, but other then that, it should be easy enough... Are you expecting a different API to support this?

@caseyallenshobe Well, your situation would give additional weight to develop this out of tree first, then anyone can use it regardless of go version.

hareesh-blippar commented 7 years ago

@kardianos is that possible with database/sql today when the value of n in my example is variable and in the order of thousands?

kardianos commented 7 years ago

I would think so. SQL Server can only insert 1000 rows with that syntax, so built a little helper around it to break it up into statements of 1000. So I don't see why not.

hareesh-blippar commented 7 years ago

@kardianos What I meant is that you won't be able to use prepared statements to insert multiple rows today. A solution I found was http://stackoverflow.com/a/25192138/2210093, but it would be awesome if the API were to directly support bulk inserts with prepared statements similar to http://docs.oracle.com/javase/8/docs/api/java/sql/PreparedStatement.html#addBatch--

kardianos commented 7 years ago

I don't think that will ever happen. multi-row value statements don't really need to be done in re-used baches, unless you can benchmark and tell me otherwise. But without resorting to really database specific functionality, most won't support it. Just construct the structure on the fly and add in the params and send at once. If performance is an issue, open an issue dedicated to that.

titpetric commented 7 years ago

Ignoring multi-row statements for now (off topic), what's the current state of getting a quoter into the stdlib? Is this one of those issues which might be marked as help-wanted? If I have some free time on my hands (and some experience with mentioned prior art like PHP PDO and mysqli), what are the best steps I could make to help move this along (write code, put together a proposal,...?)

akamajoris commented 6 years ago

A temporary solution for strings from the Internet:

func escape(source string) string {
    var j int = 0
    if len(source) == 0 {
        return ""
    }
    tempStr := source[:]
    desc := make([]byte, len(tempStr)*2)
    for i := 0; i < len(tempStr); i++ {
        flag := false
        var escape byte
        switch tempStr[i] {
        case '\r':
            flag = true
            escape = '\r'
            break
        case '\n':
            flag = true
            escape = '\n'
            break
        case '\\':
            flag = true
            escape = '\\'
            break
        case '\'':
            flag = true
            escape = '\''
            break
        case '"':
            flag = true
            escape = '"'
            break
        case '\032':
            flag = true
            escape = 'Z'
            break
        default:
        }
        if flag {
            desc[j] = '\\'
            desc[j+1] = escape
            j = j + 2
        } else {
            desc[j] = tempStr[i]
            j = j + 1
        }
    }
    return string(desc[0:j])
}
kardianos commented 6 years ago

@akamajoris That might work for some databases in some configurations. That won't work for systems such as MS SQL Server for sure. My SQL can configure escape parameters, so that must be done with case on that system.

Also, you typically need to escape two different types, identifiers and strings.

ahammond commented 5 years ago

I ran into this while looking to convert some SQL commands from dynamic to static. Painful.

ghost commented 5 years ago

One case not yet discussed: Generating SQL without executing. For a patch generator I am in dire need of the equivalent mysql_real_escape in Go (no, mysql_dump is insufficient).

As it stands, I have two options:

  1. Write a module that interfaces with libmysql.so.
  2. Write my own implementation of mysql_real_escape that worksformebutnotanyoneelse™.

Both are not very attractive, I guess I'll go with the latter option and keep fixing it till it stops breaking?

methane commented 5 years ago

Even though MySQL behavior is different ANSI_QUOTES, "You can still use to quote identifiers with this mode enabled." We can use backslash always. It meansQuoteIdentifier` doesn't need connection. It can be added to Driver directly, instead of connection.

On the other hand, QueryString can not chose how to escape backslash without NO_BACKSLASH_ESCAPE variable. But if we want to add it on Driver, I think it's OK to assume NO_BACKSLASH_ESCAPE is disabled.

BTW, if we add QuoteString, why no FormatDate, FormatDecimal, FormatFloat, FormatGeometry, FormatJSON, etc, etc...? I am very unsure about database/sql/driver should provide such helper functions to build SQL. How about API like DB.FormatSQL(sqlWithPlaceHolder string, args... driver.Value) (string, error)?

Note that QuoteIdentifier is special, because we can not use placeholder for it on some RDB including MySQL. So QuoteIdentifier will be worth enough even if DB.FormatSQL is added.

@dbuteyn You can copy escaping code from go-sql-driver/mysql.

ghost commented 5 years ago

@methane Thanks, its actually the first place I went to but I couldn't find what I was looking for at a glance. A string replace was sufficient for my needs and I didn't want to waste any more time. Obviously what I did is in no way correct and won't work for others.

methane commented 5 years ago

@dbuteyn https://github.com/go-sql-driver/mysql/blob/d0a548181995c293eb09c61ef80099ba1cdbe8f5/connection.go#L272-L291

David00 commented 4 years ago

I'm new to Go so my apologies if this is extraordinarily obvious to others, but I have managed to use Sprintf to create a string for the query containing a dynamic table name for Sqlite3.

import (
        _ "github.com/mattn/go-sqlite3"
    )

func main() {

    var result string
    tableName := "testTable"
    value := "10"
    dynamicQuery := fmt.Sprintf("select myField from '%s' where myValue = '%s'", tableName, value)

        database, _ := sql.Open("sqlite3", "/path/to/my/Sqlite3.db")
    rows, err := database.Query(dynamicQuery)
    defer rows.Close()
    if err != nil {
            fmt.Println("Error running db Query.")
        fmt.Println(err)
    } else {
            for rows.Next() {
                err := rows.Scan(&result)
                if err != nil {
                    fmt.Println("Error scanning row")
                }
            fmt.Printf("Table result is %v", result)
        }
    }
}

This has allowed me to query my Sqlite3 database using a dynamic table name which comes from elsewhere in my program (never as user input). Hope this helps someone though I understand this discussion is more tailored around larger DB systems. Who knows, it might work with them too!

cben commented 4 years ago

Reading this, I initially thought only "identifier.quoting" is non-portable. 'string quoting with '' doubling' should be 100% standard, going back at least to SQL-92 standard, right??

Alas, no. For example MySQL by default interprets backlash escapes inside 'strings', which violates the standard :facepalm:, unless one enables NO_BACKSLASH_ESCAPES. (compare to PostgreSQL doing it better, adding a new E'strings with escapes' syntax).

Still, I wonder — do all DBs have a set of flags that make them parse literals according to standard? If yes, maybe database/sql could offer a common interface to "give me a connection that respects standard syntax [as much as possible]"?

kardianos commented 4 years ago

@cben Nope, this isn't a thing. Many databases don't have such a setting, and I wouldn't recommend such a setting if there was one. Many people write SQL that make really important assumptions about quoting.

The interface I would recommend would be something like this: https://pkg.go.dev/github.com/golang-sql/sqlexp?tab=doc#Quoter

@David00 You're on the right track and for your purposes, that might be enough. But almost inevitably, if you are talking about escaping inputs, you'll want check for values within the string as well as placing quotes around it. Also note that SQLite escaping is different then other systems.


I'm honestly not sure the best direction for the quoter interface. Should there be a registry? Probably not. Could there be a per database quoter? Maybe. The problem is easy enough to frame. I'm not sure what the best solution would look like for this.

crussell52 commented 3 years ago

@kardianos

Maybe I'm being naive, but I would expect that quoting would have to be the responsibility of the driver have to be at the connection level to get it right. (MySQL being the obvious case in point.)

It would then be up to the sql package to provide an interface to obtain the quoter from the connection. An error return value allows for compatibility with drivers that have not implemented a quoter. Something like:

// Quoter provides a driver-appropriate utility for quoting values and identifiers.
//
// If the driver does not provide a Quoter, then a nil value is returned with a 
// non-nil error.
func (*sql.Conn) Quoter() (Quoter, error) { ... }

If I'm in a project and I feel good with the assumption that my driver of choice provides a Quoter, then I can just ignore the error case. If I want maximum abstraction, I write a few extra lines to handle that case.

paudley commented 2 years ago

For reference, here is the PGX Sanitize function for identifiers:

https://pkg.go.dev/github.com/jackc/pgx/v4#Identifier

jrfondren commented 2 years ago

I'm new to Go so my apologies if this is extraordinarily obvious to others, but I have managed to use Sprintf to create a string for the query containing a dynamic table name for Sqlite3. ... This has allowed me to query my Sqlite3 database using a dynamic table name which comes from elsewhere in my program (never as user input). Hope this helps someone though I understand this discussion is more tailored around larger DB systems. Who knows, it might work with them too!

We now have very short steps for someone to take to a disaster:

  1. someone tries using placeholders for column names and finds that this fails
  2. this person asks around, finds out that these need special quoting, and looks for this feature in database/sql--and doesn't find it
  3. this person searches the internet, finds this issue, scrolls down, aha! Here is helpful code to use.
  4. this person doesn't notice the bold never or thinks (wrongly) that it's fine
  5. this person has a trivial SQL injection vulnerability! yay!

This person isn't me, because I'm translating code using Perl's DBI and DBI did this right decades ago. We've have decades of trivial SQL vulnerabilities due to string construction of queries from untrusted input, and when placeholders are finally pretty common, people are still pooh-poohing other necessary measures with the exact same arguments used to say that placeholders weren't necessary? This is disgraceful.

kevin-lindsay-1 commented 1 year ago

I'm honestly kinda surprised that after 6 1/2 years there doesn't seem to be a standard way of escaping a user-input string in sql using go.

leandrocurioso commented 1 year ago

For: github.com/go-sql-driver/mysql

I couldn't stand to keep mexing quotes with + concatenation with fmt.Sprintf so I tried to create a standard.

I know this is far from perfect but I will share what I did and fixed for me:

Code

package repository

import (
    "fmt"
    "reflect"
    "strings"
)

type Repository struct{}

type BuildQueryParams struct {
    Placeholders []BuildQueryPlaceholder
}

type BuildQueryPlaceholder struct {
    Key    string
    Params []interface{}
}

// NewRepository get a new instance
func NewRepository() *Repository {
    return &Repository{}
}

func (r *Repository) BuildQuery(rawSql string, buildQueryParams BuildQueryParams) (string, []interface{}) {
    finalQuery := strings.ReplaceAll(rawSql, "%q", "`")
    finalParams := []any{}
    for _, placeholder := range buildQueryParams.Placeholders {
        pMask, params := r.getPlaceholders(placeholder.Params...)
        finalParams = append(finalParams, params...)
        finalQuery = strings.ReplaceAll(finalQuery, fmt.Sprintf(":%s", placeholder.Key), pMask)
    }
    return finalQuery, finalParams
}

func (r *Repository) SpreadSlice(slice any) []interface{} {
    s := reflect.ValueOf(slice)
    if s.Kind() != reflect.Slice {
        panic("InterfaceSlice() given a non-slice type")
    }

    // Keep the distinction between nil and empty slice input
    if s.IsNil() {
        return nil
    }

    ret := make([]interface{}, s.Len())

    for i := 0; i < s.Len(); i++ {
        ret[i] = s.Index(i).Interface()
    }
    return ret
}

func (r *Repository) getPlaceholders(values ...interface{}) (placeholders string, parameters []interface{}) {
    n := len(values)
    p := make([]string, n)
    parameters = make([]interface{}, n)
    for i := 0; i < n; i++ {
        p[i] = "?"
        parameters[i] = values[i]
    }
    placeholders = strings.Join(p, ",")
    return placeholders, parameters
}

Usage

sql, params := r.repository.BuildQuery(`
SELECT  
        ch.%qname%q,
    ch.client_username
FROM %qcharacter%q ch
INNER JOIN %qclient%q cl ON ch.client_username = cl.username
WHERE ch.%qname%q IN (:p1)
AND cl.client_status_id = :p2
AND ch.character_status_id = 1;`, BuildQueryParams{
Placeholders: []BuildQueryPlaceholder{
    {
        Key: "p1",
        Params: r.repository.BindSlice([]any{
            "character1", "character2", "character3",
        }),
    },
    {
        Key: "p2",
        Params: r.repository.BindSlice([]any{
            1,
        }),
    },
},
})

fmt.Println("Final SQL", sql)
fmt.Println("Final Params", params)

Output

Final SQL 
SELECT  
               ch.`name`,
               ch.client_username
FROM `character` ch
INNER JOIN `client` cl ON ch.client_username = cl.username
WHERE ch.`name` IN (?,?,?)
AND cl.client_status_id = ?
AND ch.character_status_id = 1;

Final Params [character1 character2 character3 1]

Query it

rows, err := connection.QueryContext(executionCtx, sql, params...)

The final sql will be replaced with "sql quotes" for each %q and the placeholders automatically added for each mapped key that starts with semi colon (:).

All placerholder parameters will be combined in sequence to be used for the prepared statement bind.

BindSlice function is used to generically pass any kind of slice to be appended to final parameters.

That's is!

ghost commented 1 year ago

For: github.com/go-sql-driver/mysql

I couldn't stand to keep mexing mixing quotes with + concatenation with fmt.Sprintf so I tried to create a standard.

I know this is far from perfect but I will share what I did and fixed for me:

Your solution does not escape input strings, its a workaround to being unable to use backticks in a multi-line literal.

This issue was opened because string replace isn't a proper solution. It makes the application vulnerable to SQL injection.

What we asking for is to add func (*sql.Conn) Escape(string) (string, err) which calls func (*driver.Driver) Escape(*sql.Conn, string) (string, err) which then does the correct driver-specific escaping by calling mysql_real_escape_string or similar.

Yes, it's possible to roll your own solution by following the docs but the whole point of the database/sql package is that users don't have to deal with driver-specific nonsense.

It's only been 7 years, there is no rush.

nefthy commented 1 year ago

There is also the issue, that quoting rules are database or even connection dependent, so if the drivers do not provide quoting functions, there is no way to write database agnostic code. I fail to see why the maintainers chose not to require drives to provide such functions. Database vendors do provide quoting functions in their C libraries, for good reasons.

ghost commented 11 months ago

In case you need an SQL escape method deadly:

It's Apache-2.0, Copy from: https://github.com/pingcap/tidb/blob/master/pkg/util/sqlexec/utils.go#L97

May want to use a permalink instead as code often changes over time. Regular links will eventually point to nowhere: https://github.com/pingcap/tidb/blob/eafb78a9c6b8ff6f9c00188ca16fc63b41ae4e56/pkg/util/sqlexec/utils.go#L97

This code doesn't appear to provide escaping but rather formats parameters as inline SQL which doesn't really solve anything and is prone to SQL injection. It is also using MySQL-specific syntax per readme.md of the project.

cben commented 4 months ago

What we asking for is to add func (*sql.Conn) Escape(string) (string, err) which ...

Nit: "Escape" sounds like it would do something like don'tdon''t which one might still concatenate with other pieces to form a longer "escaped string".
Such API invites passing around three levels of things — unescaped' stringsescaped'' string fragments"SQL'' fragments". Three is too many IMHO, too easy to mix those up an inject unescaped string directly into SQL.
Plus, we already know some DBs have different escaping & quoting rules for identifiers vs. string values, so even more distinct concepts like escaped""identifier...

I think we can skip escaped level and have a "QuoteString" API which simultaneously escapes and adds the quotes: don't'don''t'. So you only deal with two concepts: Go valuesSQL literal representation of those values.
[in an ideal world I'd wish for a polymorphic "LiteralValue" that can also serialize booleans, arrays of strings etc. but no idea if realistic.]

TLDR compare these two hypothetical styles:

-Sprintf("select myField from \"%s\" where myValue = '%s'", conn.EscapeIdentifier(tableName) conn.EscapeString(value))
+Sprintf("select myField from %s where myValue = %s", conn.QuoteIdentifier(tableName), conn.QuoteString(value))

the former might not even be portable due to hard-coded " and ' choices, but even if it was, I argue the latter is a cleaner API.
[I'm not saying either is great; there is a lot of type safety & ergonomics to add but those might belong in 3rd-party libraries(?) What we need from stdlib is minimum viable interface for DB drivers, and all I'm saying is "Quote" beats "Escape".]