googleapis / nodejs-bigquery-storage

BigQuery Storage Node.js client
Apache License 2.0

Adapt: Serializing DATE field breaks #344

Closed solamanhuq closed 8 months ago

solamanhuq commented 1 year ago

At the moment, if I use adapt to serialize JSON with a DATE field given as a string (e.g. 2023-01-01), the serialization converts it to the INT value 0. Converting it to an epoch timestamp allows it to serialize, but then the Write API rejects it with Could not convert value to date.

I'm not sure what the expectation is, but as of now there is an incompatibility between the client and the storage write API.

EDIT: It appears that if I pass in number of days since 1970-01-01, it will successfully serialize and save in storage write API.

Bit unfortunate that the type is restricted; since strings work for streaming inserts, it would be nice if they also worked for the Storage Write API. But all the same, this is something we can live with.
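The workaround above can be sketched as a small helper (the helper name is mine, not part of the library):

```javascript
// Convert a 'YYYY-MM-DD' string into the number of days since the Unix
// epoch (1970-01-01), which is the integer encoding that currently
// serializes and is accepted by the Storage Write API for DATE fields.
function dateStringToDays(dateString) {
  const msPerDay = 24 * 60 * 60 * 1000;
  // Date.parse treats a bare 'YYYY-MM-DD' string as UTC midnight,
  // so the division is exact for calendar dates.
  return Math.floor(Date.parse(dateString) / msPerDay);
}

console.log(dateStringToDays('1970-01-01')); // 0
console.log(dateStringToDays('2023-01-01')); // 19358
```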

Environment details

Steps to reproduce

  1. Have a table with a DATE field:

     await bigquery.dataset('mydataset').createTable('mytable', {
       "schema": {
         "fields": [
           { "name": "coolthing", "type": "STRING" },
           { "name": "date", "type": "DATE" },
           { "name": "ts", "type": "DATETIME" }
         ]
       },
       "timePartitioning": {
         "type": "DAY",
         "expirationMS": "7776000000",
         "field": "date"
       }
     })
  2. Use adapt and protobufjs to write JSON events to the table:

     const dataset = bigquery.dataset('mydataset')
     const table = await dataset.table('mytable')
     const [metadata] = await table.getMetadata()
     const storageSchema = adapt.convertBigQuerySchemaToStorageTableSchema(
       metadata.schema
     )
     const protoDescriptor = adapt.convertStorageSchemaToProto2Descriptor(
       storageSchema,
       tableId
     )
     const protoType = Type.fromDescriptor(protoDescriptor)

     const events = [{
       coolthing: 'test close goal',
       date: '2023-01-01',
       ts: '2023-01-01T06:24:56',
     }]
     const serializedRows = events.map(event =>
       protoType.encode(event).finish()
     )

     // error: date field is now 0
     const deserializedRows = serializedRows.map(event =>
       protoType.decode(event)
     )

     const stream = bigQueryWriteClient.appendRows(
       // open connection to stream
     )

     // throws 'error' event
     stream.write({
       // something something serialized rows
     })


  3. This fails. The `adapt` and `protobufjs` libraries appear to expect DATE to be formatted as an int. However, if we convert it to an epoch timestamp, the API _still_ throws an error: `Could not convert value to date.`
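Putting the workaround into the repro above: convert the DATE string to a days-since-epoch integer before encoding. A minimal standalone sketch of just the mapping step (the protobuf encoding is unchanged from step 2):

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Same event shape as the table schema in step 1.
const events = [
  { coolthing: 'test close goal', date: '2023-01-01', ts: '2023-01-01T06:24:56' },
];

// Replace the DATE string with the integer day count the Storage Write
// API accepts; Date.parse reads 'YYYY-MM-DD' as UTC midnight, so the
// division lands exactly on a whole number of days.
const rows = events.map(event => ({
  ...event,
  date: Math.floor(Date.parse(event.date) / MS_PER_DAY),
}));

console.log(rows[0].date); // 19358 (days since 1970-01-01)
```

With this mapping in place, `protoType.encode(row).finish()` serializes the DATE field correctly and the append succeeds.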

Making sure to follow these steps will guarantee the quickest resolution possible.

Thanks!
brantian commented 1 year ago

Is there any possible workaround for this issue while it isn't fixed?

solamanhuq commented 1 year ago

Is there any possible workaround for this issue while it isn't fixed?

@brantian I mentioned a workaround in the description:

EDIT: It appears that if I pass in number of days since 1970-01-01, it will successfully serialize and save in storage write API.

So, if you take the date string and convert it to an int (the number of days since 1970-01-01), it will work as intended.

alvarowolfx commented 8 months ago

#422 adds proper support for the JS Date class