shuttle-hq / synth

The Declarative Data Generator
https://www.getsynth.com/
Apache License 2.0

postgres encoding for `Value` fails when the "string" generator is a datetime #221

Open vlushn opened 2 years ago

vlushn commented 2 years ago

Describe the bug

When a datetime is stored in a string column (e.g. varchar(10)), the insertion into Postgres fails with the cryptic message: invalid byte sequence for encoding "utf8": 0x00

This seems to happen because the impl Encode<'_, Postgres> for Value always encodes a datetime as a datetime parameter rather than as a string, so the serialized value on the wire can include a null byte.
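To illustrate why a null byte can show up, here is a minimal sketch. It assumes the value is sent in Postgres's binary DATE format, which is a 4-byte big-endian count of days since 2000-01-01. The helper names `days_from_civil` and `pg_binary_date` are hypothetical and are not part of synth or sqlx; whether synth hits exactly this path is not confirmed here.

```rust
/// Days from 1970-01-01 to the given proleptic Gregorian date.
/// This is the well-known "days from civil" calculation.
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Treat January and February as months 11 and 12 of the previous year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let year_of_era = y - era * 400;
    let shifted_month = (month + 9) % 12; // March = 0
    let day_of_year = (153 * shifted_month + 2) / 5 + day - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    era * 146_097 + day_of_era - 719_468
}

/// Hypothetical helper: the bytes of a date in Postgres binary DATE format
/// (assumption: i32 days since 2000-01-01, big-endian).
fn pg_binary_date(year: i64, month: i64, day: i64) -> [u8; 4] {
    let days = days_from_civil(year, month, day) - days_from_civil(2000, 1, 1);
    (days as i32).to_be_bytes()
}

/// True if the payload contains a 0x00 byte, which Postgres rejects in text values.
fn contains_null_byte(payload: &[u8]) -> bool {
    payload.contains(&0)
}
```

Under these assumptions, `pg_binary_date(2000, 1, 1)` is four zero bytes, while the text form `"01-01-2000".as_bytes()` contains none. Interpreting the binary payload as varchar text would therefore trip the UTF-8 check.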

To Reproduce

Steps to reproduce the behavior:

  1. Schema (if applicable)
{
  "type": "string",
  "date_time": {
    "format": "%d-%m-%Y",
    "subtype": "naive_date",
    "begin": "01-01-2000",
    "end": "01-01-2020"
  }
}

Expected behavior

The insertion should succeed, and the value should be inserted as a string, as the type field specifies.

Additional context

This was tested on a version before 0.6.0, so it might already be fixed.

The workaround is to run the above through a format:

{
  "format": {
    "format": "{date}",
    "arguments": {
      "date": {
        "type": "string",
        "date_time": {
          "format": "%d-%m-%Y",
          "subtype": "naive_date",
          "begin": "01-01-2000",
          "end": "01-01-2020"
        }
      }
    }
  }
}

christos-h commented 2 years ago

I fear that with the current state of sqlx we cannot avoid the admittedly cryptic error: invalid byte sequence for encoding "utf8": 0x00. We might be able to add a bunch of machinery in synth to validate values prior to insertion, but this seems like the wrong approach.

I think @llogiq had looked into this at some point recently. Do you remember what the outcome was?

llogiq commented 2 years ago

I guess we wanted to split the date/time types from the formatting. With that, the format would produce a string node.
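A rough sketch of what that split could look like, assuming a simplified stand-in for synth's value type. `SketchValue` and `format_date` are hypothetical names for illustration only, and only the `%d`, `%m` and `%Y` specifiers are handled. This is not synth's actual design.

```rust
/// Hypothetical, simplified stand-in for a generated value.
#[derive(Debug, Clone, PartialEq)]
enum SketchValue {
    String(String),
    Date { year: i32, month: u32, day: u32 },
}

/// Hypothetical formatting node: consumes a typed date and always yields a
/// string value, so the database encoder only ever sees a string here.
fn format_date(value: &SketchValue, format: &str) -> Option<SketchValue> {
    match value {
        SketchValue::Date { year, month, day } => {
            let text = format
                .replace("%d", &format!("{:02}", day))
                .replace("%m", &format!("{:02}", month))
                .replace("%Y", &format!("{:04}", year));
            Some(SketchValue::String(text))
        }
        // Anything that is not a date is not handled by this node.
        _ => None,
    }
}
```

With this shape, a date generator on its own would keep producing a typed date, while wrapping it in the formatting node would produce a string node that can be encoded as text for a varchar column.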