googlecodelabs / iot-data-pipeline

Apache License 2.0

Query of usage of insertErrors logic ... #10

Closed nkolban closed 6 years ago

nkolban commented 6 years ago

I am studying the tutorial in depth and notice the following logic:

  bigquery
    .dataset(datasetId)
    .table(tableId)
    .insert(rows)
    .then((insertErrors) => {
      console.log('Inserted:');
      rows.forEach((row) => console.log(row));

      if (insertErrors && insertErrors.length > 0) {
        console.log('Insert errors:');
        insertErrors.forEach((err) => console.error(err));
      }
    })

When I run the tests, I find that the promise's then() callback is called and that insertErrors contains the following:

[{"kind":"bigquery#tableDataInsertAllResponse"}]

My function is working (the data is being inserted into the table).
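
To see why the error branch runs even on success, here is a minimal sketch that applies the tutorial's check to the value logged above. The helper name `looksLikeErrors` is hypothetical, and the sample value is simply the one observed in my test run:

```javascript
// Hypothetical helper that reproduces the tutorial's check on the resolved value.
function looksLikeErrors(insertErrors) {
  return Boolean(insertErrors && insertErrors.length > 0);
}

// Value observed after a successful insert (copied from the log above).
const resolved = [{ kind: 'bigquery#tableDataInsertAllResponse' }];

// The array has one element (the response object itself), so the check
// passes even though no row failed.
console.log(looksLikeErrors(resolved)); // true
console.log(looksLikeErrors([]));       // false
```

So the check is testing the length of the wrapper array, not whether any row was rejected.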

I have a suspicion something is amiss. I did some background searches and found the following:

https://github.com/GoogleCloudPlatform/google-cloud-node/issues/2149

Might it be the case that the example is out of step with the reality of the current GCP APIs?

I get the impression that the return from an insert is now as documented:

https://developers.google.com/resources/api-libraries/documentation/bigquery/v2/python/latest/bigquery_v2.tabledata.html

(Note the previous link is Python while I am using Node)

sunsetmountain commented 6 years ago

I'm not totally sure why a successful insert triggers the error logic when the insertErrors array appears to be empty. For now, this workaround stops the InsertAllResponse from being logged and cleans up the logging format a bit as well...

  // Inserts data into a table
  bigquery
    .dataset(datasetId)
    .table(tableId)
    .insert(rows)
    .then((insertErrors) => {
      rows.forEach((row) => console.log('Inserted: ', row));

      if (insertErrors && insertErrors.length > 0) {
        insertErrors.forEach((err) => {
          if (err.name === 'PartialFailureError') {
            // Insert partially or entirely failed
            console.log('PartialFailureError: ', err);
          } else if (err.kind === 'bigquery#tableDataInsertAllResponse') {
            // A successful insert returns this response object; nothing to log
          } else {
            // `err` could be a DNS error, a rate limit error, an auth error, etc.
            console.log('OtherError: ', err);
          }
        });
      }
    })
    .catch((err) => {
      console.error('ERROR:', err);
    });
  // [END bigquery_insert_stream]

nkolban commented 6 years ago

Howdy my friend. I'm extremely green on GCP functions so please forgive me if I am all over the place.

If I were to make a guess, the place to look is the call to BigQuery table.insert().

My belief (based on experimenting with the APIs in Cloud Functions on Node.js) is that the object passed through in the Promise resolution looks like the response described on this REST page:

https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll

Now ... IF that is correct ... and again, I am not swearing it is ... only my tests seem to show that to be the case ...

Then when we code:

  bigquery
    .dataset(datasetId)
    .table(tableId)
    .insert(rows)
    .then((response) => {

we get back an array containing structures that look like:

{
  "kind": "bigquery#tableDataInsertAllResponse",
  "insertErrors": [
    {
      "index": unsigned integer,
      "errors": [
        {
          "reason": string,
          "location": string,
          "debugInfo": string,
          "message": string
        }
      ]
    }
  ]
}

I think we may have two areas to examine. The first is that, because we are inserting an array of rows, the insert() call is being translated to an insertAll() call, which means that we don't get one response but an array of responses that are indexed by position to the rows being added. The second puzzle is that the structure of the response appears to differ from what we currently have coded.
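
If the resolved value does follow the REST structure above, the per-row errors could be pulled out along these lines. This is only a sketch: `flattenInsertErrors` is a hypothetical helper, the sample objects are made up for illustration (not real output), and the field names are taken from the structure quoted above.

```javascript
// Hypothetical helper: turns one insertAll-style response into a flat list
// of { index, reason, message } entries, one per reported error.
function flattenInsertErrors(response) {
  const rowErrors = (response && response.insertErrors) || [];
  const flat = [];
  rowErrors.forEach((rowError) => {
    (rowError.errors || []).forEach((e) => {
      flat.push({ index: rowError.index, reason: e.reason, message: e.message });
    });
  });
  return flat;
}

// Made-up sample shaped like the structure above.
const failedResponse = {
  kind: 'bigquery#tableDataInsertAllResponse',
  insertErrors: [
    {
      index: 1,
      errors: [
        { reason: 'invalid', location: 'temp', debugInfo: '', message: 'no such field' },
      ],
    },
  ],
};

// A response without insertErrors, like the one seen on success.
const okResponse = { kind: 'bigquery#tableDataInsertAllResponse' };

console.log(flattenInsertErrors(failedResponse));
console.log(flattenInsertErrors(okResponse));
```

A response with no insertErrors property yields an empty list, which would let the caller treat "no property" and "no errors" the same way.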

Again, MANY thanks for your super quick attention to this area and apologies if I am not being clear. Let me know if I can perform any tests or other work for you.

sunsetmountain commented 6 years ago

It appears that, with a successful insert, the insertErrors array portion of the structure is not returned at all and we get only...

[{"kind":"bigquery#tableDataInsertAllResponse"}]

Therefore, checking whether the insertErrors array exists at all is key to detecting whether there are errors to log. The updated code looks like...

  // Inserts data into a table
  bigquery
    .dataset(datasetId)
    .table(tableId)
    .insert(rows)
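    .then((responses) => {
      rows.forEach((row) => console.log('Inserted: ', row));

      // The promise resolves with an array of responses (see the output above),
      // so insertErrors has to be checked on each response, not on the array.
      responses.forEach((response) => {
        if (response && response.insertErrors !== undefined) {
          response.insertErrors.forEach((err) => console.log('Error: ', err));
        }
      });
    })
    .catch((err) => {
      console.error('ERROR:', err);
    });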