Closed stephenplusplus closed 6 years ago
From @lukesneeringer on March 20, 2017 16:15
Hello @c0b. Thanks for reporting. This seems feasible. We will look into it soon.
From @c0b on March 21, 2017 1:47
for better management of different jobs, to use a customized jobid is also something needed, from listJobs
I can get the jobId(s) look like these, I can see bqjob_
are from bq command line, bquijob_
are from bigqueryUI, I want all queries from my nodejs app to start with some special naming pattern,
job_ZhgN2K65s_XT1Yq7ooMJhG0TGcQ
job_xAN6iV-dcfb1XnCbpHkOdJ542JY
bquijob_74a81cc0_15aea476e69
bquijob_3c27ca25_15aea468ab4
bquijob_50d0e7e0_15aea45b2dc
job_qGBPt_F45gotmov1sdKobAgTEjk
job_gqUriVZhciWVHrSokbQ2AGIkGXo
job_oRykICSs6Tc7gm4dOQdt3nyQSpc
bqjob_r1f987f469a3e2a33_0000015
bqjob_r56059b08bb57b6ee_0000015
from the REST Reference, both sync and async job can start query with a specified jobReference
with customized jobId
https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs
I have tried put a jobReference
object on the startQuery but it seems ignored?
// jobReference need to be on top level of POST body object for job insert REST API
jobReference: {
projectId: 'my-project',
jobId: 'myjob_' + Date.now().toString(36) + '_' + Math.random().toString(36).substr(2),
},
this feature request could be another issue if you want me to file a separate issue#
From @lukesneeringer on March 21, 2017 19:32
Yes, please file distinct issues for distinct problems. :-)
From @lukesneeringer on March 21, 2017 19:46
As best as I can tell, the issue is that the labels are not actual query parameters, but they live one level up.
We take arbitrary query parameters as part of options.params
here:
if (options.params) {
options.useLegacySql = false;
options.parameterMode = is.array(options.params) ? 'positional' : 'named';
if (options.parameterMode === 'named') {
// Redacted; not relevant.
} else {
options.queryParameters = options.params
.map(BigQuery.valueToQueryParameter_);
}
...and then it looks like this in a request:
// Create a job.
self.request({
method: 'POST',
uri: '/queries',
json: options
}, responseHandler);
queryParameters
is not the same as configuration
, which is what I see in the JSON block that the command line tool sent. I am a little afraid to just shift the labels up to be a peer of queryParameters
since I am not sure what the effect will be.
@jmuk Please advise.
From @jmuk on March 21, 2017 22:8
@lukesneeringer -- it seems you're referring query
method, but I think the topic here is about startQuery
; and indeed it only accepts query
property for configuration
as https://github.com/GoogleCloudPlatform/google-cloud-node/blob/master/packages/bigquery/src/index.js#L959
But anyhow, the point is about other properties of the configuration
like labels
will not be specified. I have little ideas how to specify labels
through options
-- it will depend on which of the other properties in the configurations
we want to allow. Does anyone have ideas?
If it's labels
specific, we can special-handle options.labels
.
It's probably better to ask @stephenplusplus and @callmehiphop for the help.
From @c0b on March 23, 2017 18:53
ok, so, on the line I made this local changes to make it work, for both #2107 and #2121 to move labels
and jobReference
to their correct location
➸ diff -U7 -rNp -- ./node_modules/@google-cloud/bigquery/src/index.js{.orig,}
--- ./node_modules/@google-cloud/bigquery/src/index.js.orig 2017-03-23 18:48:53.730426569 +0000
+++ ./node_modules/@google-cloud/bigquery/src/index.js 2017-03-23 18:25:54.893332089 +0000
@@ -957,14 +957,23 @@ BigQuery.prototype.startQuery = function
var body = {
configuration: {
query: extend(true, defaults, options)
}
};
+ if (body.configuration.query.labels) {
+ body.configuration.labels = body.configuration.query.labels;
+ delete body.configuration.query.labels;
+ }
+ if (body.configuration.query.jobReference) {
+ body.jobReference = body.configuration.query.jobReference;
+ delete body.configuration.query.jobReference;
+ }
+
this.request({
method: 'POST',
uri: '/jobs',
json: body
}, function(err, resp) {
if (err) {
callback(err, null, resp);
although a full PR should cover query and others as well.
From @lukesneeringer on March 23, 2017 20:12
I would almost be inclined to go the other direction -- iterate over the keys we know should be under configuration
and move them out into a configuration
dict. That approach would be more future-proof.
From @c0b on March 23, 2017 20:17
sure, that should be a better and ultimate solution, an interface like below?
startQuery(queryOptions, configurationOptions, postBodyOptions)
labels would on configurationOptions, and jobReference on the postBodyOptions ?
Let's force a string-first argument for the query, then an options object in the format of a configuration
as described here:
var options = {
destination: bigquery.dataset('higher_education').table('institutions'),
labels: [...],
query: {}
}
bigquery.startQuery('SELECT url FROM ....', options, callback)
options.destination
is the only custom property we have, so just remove it from options
and send everything else straight through, after assigning the query string to the right place. Not tested, but something like:
BigQuery.prototype.startQuery = function(query, options, callback) {
var self = this;
if (!query) {
throw new Error('A SQL query string is required.');
}
options = options || {};
var configuration = extend(true, {}, options);
delete configuration.destination;
var defaultConfiguration = {
query: {
query: query
}
};
if (options.destination) {
if (!(options.destination instanceof Table)) {
throw new Error('Destination must be a Table object.');
}
defaultConfiguration.query.destinationTable = {
datasetId: options.destination.dataset.id,
projectId: options.destination.dataset.bigQuery.projectId,
tableId: options.destination.id
};
}
var body = {
configuration: extend(true, defaultConfiguration, options)
};
this.request({
method: 'POST',
uri: '/jobs',
json: body
}, ...
};
From @lukesneeringer on March 27, 2017 21:59
@stephenplusplus The problem is that the labels should not live in the configuration
object.
Are you sure? That's where it is in the docs: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.labels
I thought the problem is it shouldn't live in configuration.query
.
From @lukesneeringer on March 27, 2017 22:4
You are completely right and I am completely wrong. :-)
From @c0b on April 1, 2017 8:27
before you release the fixed @google-cloud/bigquery
on npm, I'm using this diff as workaround for this #2107 #2150 and #2121 waiting your better approach
--- ./node_modules/@google-cloud/bigquery/src/index.js.orig 2017-03-23 11:48:53.730426569 -0700
+++ ./node_modules/@google-cloud/bigquery/src/index.js 2017-03-28 17:30:51.561750235 -0700
@@ -957,14 +957,27 @@ BigQuery.prototype.startQuery = function
var body = {
configuration: {
query: extend(true, defaults, options)
}
};
+ if (body.configuration.query.dryRun) {
+ body.configuration.dryRun = body.configuration.query.dryRun;
+ delete body.configuration.query.dryRun;
+ }
+ if (body.configuration.query.labels) {
+ body.configuration.labels = body.configuration.query.labels;
+ delete body.configuration.query.labels;
+ }
+ if (body.configuration.query.jobReference) {
+ body.jobReference = body.configuration.query.jobReference;
+ delete body.configuration.query.jobReference;
+ }
+
this.request({
method: 'POST',
uri: '/jobs',
json: body
}, function(err, resp) {
if (err) {
callback(err, null, resp);
That works, and this should work as well, so you don't have to modify the source code:
var bigquery = require('@google-cloud/bigquery')({...})
bigquery.interceptors.push({
request: function(reqOpts) {
if (reqOpts.uri === '/jobs') {
var body = reqOpts.json;
if (body.configuration.query.dryRun) {
body.configuration.dryRun = body.configuration.query.dryRun;
delete body.configuration.query.dryRun;
}
if (body.configuration.query.labels) {
body.configuration.labels = body.configuration.query.labels;
delete body.configuration.query.labels;
}
if (body.configuration.query.jobReference) {
body.jobReference = body.configuration.query.jobReference;
delete body.configuration.query.jobReference;
}
}
return reqOpts;
}
});
From @c0b on April 2, 2017 18:23
thanks for the advanced usage; I just read the Interceptors from https://googlecloudplatform.github.io/google-cloud-node/#/docs/google-cloud
@stephenplusplus I think that BigQuery#createJob
should resolve this issue. WDYT?
Yep, that sounds right to me!
From @c0b on March 19, 2017 22:6
The nodejs API is the most flexible (comparing to other language bindings like https://github.com/GoogleCloudPlatform/google-cloud-go/issues/554#issuecomment-286887199) that allows to pass in any key value pairs in the
configuration.query
object which may be in the REST API support but not in Client libraries, I worked around that in the past months for query parametershttps://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/0.8.0/bigquery?method=startQuery
now am researching the bq command line has a
label
feature, it seems useful to manage when there are too many load / query jobs, I want to tag them into different labels,But from the REST API Reference, the labels should be set on
configuration
object, so that it can apply toquery
load
copy
extract
all kinds of jobs; when I tried to passlabels: { k1: 'v1', ... }
to thestartQuery
it seems passed to within theconfiguration.query
object, then server side seems ignoring that, then the returned job doesn't come with any labelshttps://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.query
An example from bq tool:
This command line bq tool seems the only one support labels on job right now, I have researched all
GoogleCloudPlatform/google-cloud-node
GoogleCloudPlatform/google-cloud-go
GoogleCloudPlatform/google-cloud-python
python still needs GoogleCloudPlatform/google-cloud-python#2931 to add support for labels on dataset / tables, but NodeJS support labels on dataset / table by nature ofsetMetadata
method calling PATCH, doesn't need any special work in the client librariesBut for labels on Job, I believe the NodeJS client still require some changes to support; the Python and Go would require similar changes as well.
Copied from original issue: GoogleCloudPlatform/google-cloud-node#2107