jackc / pgx

PostgreSQL driver and toolkit for Go
MIT License
10.75k stars 838 forks source link

pgx5 connection CopyFrom failing for JSON field where value is string #1922

Open EfraimDim opened 8 months ago

EfraimDim commented 8 months ago

Hello, I am trying to using CopyFrom with pgx5 connection to copy from a table with a JSON field but the json is inserted in the original insert query as '"json_string"'. When using CopyFrom with rows pgx5.Rows and a JSON value of this string, it throws an error: ERROR: invalid input syntax for type json (SQLSTATE 22P02)

Steps to reproduce the behavior:

package main

import (
    "context"
    "fmt"
    "github.com/golang/glog"
    "github.com/jackc/pgx/v5"
    "log"
    "os"
    "strings"
)

func main() {
    var err error

    conn, err := pgx.Connect(context.Background(), os.Getenv("SOURCE_DATABASE_URL"))
    if err != nil {
        glog.Errorln("Error connecting to db: %v", err)
    }
    defer conn.Close(context.Background())

    _, err = conn.Exec(context.Background(), "CREATE TABLE testtable (id SERIAL PRIMARY KEY, json_column JSON)")
    if err != nil {
        glog.Errorln("Error creating table: %v", err)
    }

    _, err = conn.Exec(context.Background(), `INSERT INTO testtable (json_column) VALUES ('"json_string"')`)
    if err != nil {
        glog.Errorln("Error inserting data: %v", err)
    }

    if rows, err = conn.Query(context.Background(), "SELECT * FROM testtable"); err != nil {
        glog.Errorf("Error querying data from testtable: %v", err)
    }
    defer rows.Close()

    noOfColumns := len(rows.FieldDescriptions())

    columns := make([]string, noOfColumns)
    for i, col := range rows.FieldDescriptions() {
        columns[i] = string(col.Name)
    }

    table := "public.testtable"

    schemaTable := strings.SplitN(table, ".", 2)

    connTarget, _ := pgx.Connect(context.Background(), os.Getenv("TARGET_DATABASE_URL"))
    defer connTarget.Close(context.Background())

    _, err = connTarget.Exec(context.Background(), "CREATE TABLE testtable (id SERIAL PRIMARY KEY, json_column 
        JSON)")
    if err != nil {
        glog.Errorln("Error creating table: %v", err)
    }

    if _, err = connTarget.CopyFrom(context.Background(), schemaTable, columns, rows); err != nil {
        glog.Errorln("Error executing COPY SQL for table", err)
    }
}

Version

**Edited code: 1/03/24

jackc commented 8 months ago

The problem is unmarshalling the literal JSON string "foo" into a Go string removes the quotation marks. This is correct behavior.

When encoding a string to a PostgreSQL json type, pgx assumes that it is an encoded JSON document, not a literal string that should be encoded into a JSON value. This is also correct behavior. Or at least the behavior that we would almost always want when using a string as a query parameter.

Unfortunately, these two (correct) behaviors are incompatible.

I don't think there is a general solution that would allow using a pgx.Rows that has JSON documents that consist entirely of a JSON string value as a pgx.CopyFromSource. It might be possible to do a special case, but actually the entire approach of using pgx.Rows as a pgx.CopyFromSource is sub-optimal.

Instead you can directly wire CopyTo and CopyFrom together. See https://github.com/jackc/pgx/issues/867.