Closed alex-statsig closed 1 month ago
hey @alex-statsig, thanks for the report.
For question 1:
For the type returned by the query
method, for historical/backward compatibility reasons, the method signature shows that it can return both a Promise<QueryRowsResponse>
(which has 3 positional parameters) or Promise<SimpleQueryRowsResponse>
(which has only 2), and the signature has been the same for a while as far as I checked here with git blame.
It is indeed a bit confusion and it can either contain a Query
configuration, which contains a IJob
(return type is QueryRowsResponse
) or it can contains just a Job
directly (SimpleQueryRowsResponse
). The more usual is going to be the QueryRowsResponse
like you are seeing. SimpleQueryRowsResponse
is returned on dryRun
queries and when we have an error creating a query job. Again for historical reason has been like that and I wanted to remove SimpleQueryRowsResponse
all together, but I was afraid of someone depending on that for error handling. Maybe I'll do that in the next coming breaking change scheduled for the next node version upgrade.
Weirdly enough, the 2 or 3 positional returned parameters was broken for some time, until we fixed it on PR #1365. You can see the integration test for the 3 positional parameters return with the query
method here: https://github.com/googleapis/nodejs-bigquery/pull/1365/files#diff-5e68f218ffbdd4cae3bcb93aa62c0515f1fedff0f6c4fe40986d6ee9932b08a0R346
For question 2:
I'm going to investigate the timeout further here to make sure if is working properly. But to give more context, for jobTimeoutMs
: the jobs.query
endpoint basically does a job creation (or not with stateless queries) and get the query results on the same call. But sometimes that query job can take a longer time to run, which is where the timeoutMs
configuration comes in, where it controls how long you want to way for the query job to be done, before the client side has to fallback to jobs.getQueryResults
. And in that path, we are not allowed to control the BQ job timeout. The BQ Job timeout is how long do you want the job run before it is cancelled and this is not even precise, because as it's a distributed system, cancelling the query can take a bit longer than the configured timeout.
You can see details on jobTimeoutMs
here in the docs: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job
and timeoutMs
for jobs.query
here: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query#queryrequest
But I'll investigate it here, because maybe the call to jobs.query
it's indeed getting timed out, but if auto pagination is on, we are fallback'ing to fetch the results using jobs.getQueryResults
. Which is a similar scenario that you reported that might be happening.
Thanks so much for the detailed/quick reply!
For question 1: ...
Got it, yeah I mostly wanted to understand exactly what types I should expect (i.e. are SimpleQueryRowsResponse
and QueryRowsResponse
both expected viable outputs I should handle).
One further question on this though, the types indicate QueryRowsResponse
will be returned if query
is a string, whereas SimpleQueryRowsResponse
is returned when query is a Query
object. In my case, I'm passing in a Query
object but getting back 3 positional arguments (what seems to be a QueryRowsResponse
). So it's not just that the types are ambiguous, they seem to actually be incorrect from what I can tell.
In general my goal is to log statistics about the query after running it, but it's very confusing figuring out where I should grab the information needed to lookup the job statistics. I currently settled on something like const [rows, poorlyTypedJob] = await client.query(queryObj, options)
, followed by const realJob = (poorlyTypedJob as any).job ?? poorlyTypedJob;
(since the types indicate poorlyTypedJob
will be IJob
, but in practice I'm seeing the job under poorlyTypedJob.job
). I don't feel comfortable using such a hack in production though, but maybe I can just cast the original response to SimpleQueryRowsResponse | QueryRowsResponse
(since you seem to be saying that's the expected return shape) and ensure I handle both of those correctly.
And in that path, we are not allowed to control the BQ job timeout.
Yeah my main concern is there's no way to control costs with this function (i.e. a runaway query could keep running for a while right?). But I guess this is independent of this driver, and more related to the underlying feature. I'm unsure why this would be still, but maybe because "it's a distributed system" it requires the overhead of a stateful job to do that?
But I'll investigate it here, because maybe the call to jobs.query it's indeed getting timed out, but if auto pagination is on, we are fallback'ing to fetch the results using jobs.getQueryResults. Which is a similar scenario that you reported that might be happening.
Yeah I think this is what I've observed; it seemed to me like "jobComplete" was false in the initial request due to the timeout, but then it automatically retried without the timeoutMs (logic here)
the types indicate QueryRowsResponse will be returned if query is a string, whereas SimpleQueryRowsResponse is returned when query is a Query object
This is not the expected behavior, you should get a QueryRowsResponse
regardless of using a Query
as object or just a SQL string
. Do you have a repro case where you see a SimpleQueryRowsResponse
that is not an error case like I described before ?
In general my goal is to log statistics about the query after running it
If you planning to use JobStatistics, you should not rely on the query
method, as in some cases, there would not be even a BQ job, like in a case we have a stateless query ( for now is in preview, but we are planning to make it the default behavior). This sample would the right way to always have a Job: https://github.com/googleapis/nodejs-bigquery/blob/main/samples/query.js
This is not the expected behavior, you should get a
QueryRowsResponse
regardless of using aQuery
as object or just a SQLstring
The types here indicate otherwise though right? There is an explicit overload saying if you pass Query
, the return type will be SimpleQueryRowsResponse
(not QueryRowsResponse
). If the type was SimpleQueryRowsResponse | QueryRowsResponse
that would be fine, but it seems explicitly incorrect to me
It seems like I am getting QueryRowsResponse
for a Query object, so I think you're correct about what is actually returned
If you planning to use JobStatistics, you should not rely on the query method, as in some cases, there would not be even a BQ job, like in a case we have a stateless query ( for now is in preview, but we are planning to make it the default behavior). This sample would the right way to always have a Job: https://github.com/googleapis/nodejs-bigquery/blob/main/samples/query.js
Got it, my understanding was you could still look up statistics for a jobless query (ex. the docs mention "Instead, it returns a queryId field which you can use to get insights about the query using the INFORMATION_SCHEMA.JOBS view"). I don't need JobStatistics, but if at all possible for cost-monitoring purposes I would like to track the slot usage from my application still. And it does seem like I'm able to do that still with client.query()
and QUERY_PREVIEW_ENABLED
but maybe I'm mis-using something
edit: seems like the particular query I was testing must have not been eligible due to temp tables being used (I was testing a more expensive one to see how timeouts behaved), I see jobCreationReason "other". Testing on a simpler query, the jobstatistics doesn't work with the queryID.
hey @alex-statsig, I got a bit busy this week with other things and could not check yet the timeout issue yet, but I'll try to get to it today as early next week.
Got it, my understanding was you could still look up statistics for a jobless query (ex. the docs mention "Instead, it returns a queryId field which you can use to get insights about the query using the INFORMATION_SCHEMA.JOBS view"). I don't need JobStatistics, but if at all possible for cost-monitoring purposes I would like to track the slot usage from my application still. And it does seem like I'm able to do that still with client.query() and QUERY_PREVIEW_ENABLED but maybe I'm mis-using something
I think you're right, but I need to double check that, I think queryId
were not showing up yet on that view, but I'll confirm that too.
hey @alex-statsig I've just opened https://github.com/googleapis/nodejs-bigquery/pull/1402 to fix the timeout issue when using jobs.query
. I asked for some internal reviews, but we should get this out this week.
Very excited by https://github.com/googleapis/nodejs-bigquery/pull/1337, but ran into two main issues:
client.query(query, options)
(andclient.query(query)
). The types indicate the return type is[ResultsArray, IJob]
(note in particular that job is not nested under a key "job" for the second value), whereas in practice I'm seeing a return shape of[ResultsArray, {job: IJob}, JobStatistics]
. I'm unsure if the signature will change, so would like to make sure the types/return value is correct before using thistimeoutMs
or some duration near that (perhaps RTT + timeoutMs). In reality, I'm observing it last for much longer if the job takes a while. It seems like the timeout is used for the initial jobsQuery request, then it is removed and the job continues to be awaited. This is critical to having some safety net for the maximum duration the fetch will take (jobTimeoutMs would be the preferred mechanism though).Environment details
@google-cloud/bigquery
version: 7.8.0Steps to reproduce
QUERY_PREVIEW_ENABLED
env variable. I believe the issues reproduce without this too though For the first bug,client.query(jobOptions)
(such asclient.query({query: '<sql>'})
)client.query(jobOptions, { timeoutMs: 1000 })
with a job that would take >10s to execute.Very excited to try out these preview features, but these are the issues blocking me from actually using it currently