lib / pq

Pure Go Postgres driver for database/sql
https://pkg.go.dev/github.com/lib/pq
MIT License
8.89k stars 906 forks source link

Can't use non-latin symbols with pq.CopyIn for jsonb type #1023

Open wuzzapcom opened 3 years ago

wuzzapcom commented 3 years ago

Description

I'm implementing bulk insert using pq.CopyIn feature. One of column is jsonb type and it contain non-latin symbols, for example:

{ "text": {
  "title": "название"
  }
}

This json can be inserted with regular INSERT. When I've tried to wrap the code with pq.CopyIn, I got the following error:

invalid input syntax for type json

When I insert same json with latin-only strings, all works properly.

code example

func bulk() {
    tx, err := db.Begin()

    stmt, err := tx.Prepare(pq.CopyIn(
        "example", "payload_jsonb",
    ))

    stmt.Exec(Payload{Title: "название"})

        stmt.Exec()
}

what I want to see

Inserted successfully

what I get

invalid input syntax for type json

Same code with Payload{Title: "title"} inserts successfully.

Research

I've discovered that this issue happens because input in pq.CopyIn mode is encoded as text using appendEncodedText. So cyrillic symbols are encoded like this:

{
    "text": {
        "title":"\321\202\320\260\320\271\321\202\320\273"
     }
}

Apparently PostgreSQL does not accept such json and returns the error. This is the reason why no issue appears on same json with latin value of the field title.

On the other hand I've checked how regular INSERT handle this case. It uses encode. So json is just not encoded with encodeBytea.

Are there any workarounds for this issue?

lragnarsson commented 3 years ago

Ran into the same problem, did you find any workarounds? @wuzzapcom

wuzzapcom commented 3 years ago

Unfortunately, I didn’t

drrossum commented 2 years ago

I seem to run into the same problem. I bisected this to commit 4a7d9870ef6a5a247ef532192988b01bf13cc110

When I revert this commit from the latest release this works again.

Does this solve your problem, too?