GoogleCloudPlatform / bigquery-utils

Useful scripts, udfs, views, and other utilities for migration and data warehouse operations in BigQuery.
https://cloud.google.com/bigquery/
Apache License 2.0

Issue Accessing Custom JS Libraries specified in js_libs.yaml #337

Closed jonathan-telemetry closed 1 year ago

jonathan-telemetry commented 1 year ago

Hi,

I am trying to package more npm js libraries and use them in UDFs. I have modified js_libs.yaml to include the following:

compromise:
  versions:
    - 11.14.3
js-levenshtein:
  versions:
    - 1.1.6
jstat:
  versions:
    - 1.9.3
    - 1.9.4
languagedetect:
  versions:
    - 2.0.0
cld:
  versions:
    - 2.8.4

Then I deployed from Cloud Shell with:

export BQ_LOCATION=US-CENTRAL1 && bash deploy.sh

After that I tried testing that I could access the new npm libraries, but I didn't have the permissions.

The default ones (js-levenshtein, jstat, compromise) worked, but not the new ones.

$ gsutil ls gs://bqutil-lib/bq_js_libs/js-levenshtein-v1.1.6.js
gs://bqutil-lib/bq_js_libs/js-levenshtein-v1.1.6.js
$ gsutil ls gs://bqutil-lib/bq_js_libs/languagedetect-v2.0.0.min.js
AccessDeniedException: 403 email@email.com does not have storage.objects.list access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist).

For the new ones (languagedetect, cld), I am not able to list or copy the packaged JS, and I am also not able to reference it in place from my BigQuery UDF.

Can you assist? Are there some permissions that need to be set so that I can read the new packages I request?

Also, is there a way to log the explicit path of each custom JS package listed in the YAML file? I eventually found the code that specifies that js-levenshtein should not be minified, but it was not obvious why some files were .min.js and others were not.
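A minimal sketch of such path logging, assuming the naming pattern visible in the two gsutil paths above (name, then -v, then version, then .min.js, or plain .js for libraries that skip minification). The function name and the no_minify set are hypothetical, and the pattern is inferred from this thread rather than taken from the build scripts.

```python
# Hypothetical helper: predicts where each packaged library should land.
# The naming pattern is inferred from the paths in this thread, not from
# the repository's build scripts.

NO_MINIFY = {"js-levenshtein"}  # assumption: the only library known to skip minification


def expected_js_paths(js_libs, bucket_prefix, no_minify=NO_MINIFY):
    """Return the expected object path for every library/version pair."""
    prefix = bucket_prefix.rstrip("/")
    paths = []
    for name, spec in js_libs.items():
        suffix = ".js" if name in no_minify else ".min.js"
        for version in spec["versions"]:
            paths.append(f"{prefix}/{name}-v{version}{suffix}")
    return paths


# Same shape as js_libs.yaml once it has been parsed into a dict.
js_libs = {
    "js-levenshtein": {"versions": ["1.1.6"]},
    "jstat": {"versions": ["1.9.3", "1.9.4"]},
    "languagedetect": {"versions": ["2.0.0"]},
}

for path in expected_js_paths(js_libs, "gs://YOUR_BUCKET/PATH/TO/JS_LIBS"):
    print(path)
```

Printing the list before deploying makes it easier to tell a permissions problem from a file that was never built.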

It would also be helpful to have some logging that tells the user when a JS package name they entered is incorrect and the package was not successfully retrieved from npm.
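A rough sketch of that kind of pre-flight check, assuming the public npm registry answers a GET on /name/version with status 200 for a published version and a non-200 status otherwise. The function names are hypothetical, this is not part of the repository's build, and scoped package names (those starting with @) are not handled.

```python
import urllib.error
import urllib.request

REGISTRY = "https://registry.npmjs.org"  # public npm registry


def registry_url(name, version):
    """Build the registry URL for one unscoped package version."""
    return f"{REGISTRY}/{name}/{version}"


def http_status(url):
    """Return the HTTP status code for a GET request on url."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def missing_packages(js_libs, get_status=http_status):
    """List 'name@version' entries for which the registry does not return 200."""
    missing = []
    for name, spec in js_libs.items():
        for version in spec["versions"]:
            if get_status(registry_url(name, version)) != 200:
                missing.append(f"{name}@{version}")
    return missing
```

Calling missing_packages with the parsed contents of js_libs.yaml before a build would flag a mistyped name or version early. The get_status parameter exists so the check can be exercised without network access.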

Thanks! Jonathan

danieldeleo commented 1 year ago

If you're modifying js_libs.yaml, you need to build from scratch in your own project using the cloudbuild.yaml file in udfs/ directory.

For example, from the udfs/ directory run the following:

gcloud builds submit . --substitutions _JS_BUCKET=gs://YOUR_BUCKET/PATH/TO/JS_LIBS,_BQ_LOCATION=US-CENTRAL1

jonathan-telemetry commented 1 year ago

Thanks! I just tried that command, and it looks like it ran the proper build steps. However, I still got an error. See below.

(Note: earlier I tried the Cloud Build command from the README, but that one didn't work:

gcloud builds submit . --config=deploy.yaml --substitutions _PROJECT_ID=abc,_BQ_LOCATION=US-CENTRAL1,_JS_BUCKET=gs://your_bucket/path/to/libs

The command you provided above started running, but then hit the error below.)

Starting Step #1 - "install_npm_packages"
Step #1 - "install_npm_packages": Already have image (with digest): gcr.io/bqutil/bq_udf_ci
Step #1 - "install_npm_packages": npm ERR! code 1
Step #1 - "install_npm_packages": npm ERR! path /workspace/node_modules/fast-text-v1.0.2
Step #1 - "install_npm_packages": npm ERR! command failed
Step #1 - "install_npm_packages": npm ERR! command sh -c node-gyp rebuild
Step #1 - "install_npm_packages": npm ERR! make: Entering directory '/workspace/node_modules/fast-text-v1.0.2/build'
Step #1 - "install_npm_packages": npm ERR!   CXX(target) Release/obj.target/fasttext/lib/src/args.o
Step #1 - "install_npm_packages": npm ERR!   CXX(target) Release/obj.target/fasttext/lib/src/dictionary.o
Step #1 - "install_npm_packages": npm ERR!   CXX(target) Release/obj.target/fasttext/lib/src/fasttext.o
Step #1 - "install_npm_packages": npm ERR!   CXX(target) Release/obj.target/fasttext/lib/src/matrix.o
Step #1 - "install_npm_packages": npm ERR!   CXX(target) Release/obj.target/fasttext/lib/src/model.o
Step #1 - "install_npm_packages": npm ERR!   CXX(target) Release/obj.target/fasttext/lib/src/productquantizer.o
Step #1 - "install_npm_packages": npm ERR!   CXX(target) Release/obj.target/fasttext/lib/src/qmatrix.o
Step #1 - "install_npm_packages": npm ERR!   CXX(target) Release/obj.target/fasttext/lib/src/utils.o
Step #1 - "install_npm_packages": npm ERR!   CXX(target) Release/obj.target/fasttext/lib/src/vector.o
Step #1 - "install_npm_packages": npm ERR!   CXX(target) Release/obj.target/fasttext/src/nodeArgument.o
Step #1 - "install_npm_packages": npm ERR! make: Leaving directory '/workspace/node_modules/fast-text-v1.0.2/build'
Step #1 - "install_npm_packages": npm ERR! gyp info it worked if it ends with ok
Step #1 - "install_npm_packages": npm ERR! gyp info using node-gyp@7.1.2
Step #1 - "install_npm_packages": npm ERR! gyp info using node@16.9.1 | linux | x64
Step #1 - "install_npm_packages": npm ERR! gyp info find Python using Python version 3.7.3 found at "/usr/bin/python3"
Step #1 - "install_npm_packages": npm ERR! gyp http GET https://nodejs.org/download/release/v16.9.1/node-v16.9.1-headers.tar.gz
Step #1 - "install_npm_packages": npm ERR! gyp http 200 https://nodejs.org/download/release/v16.9.1/node-v16.9.1-headers.tar.gz
Step #1 - "install_npm_packages": npm ERR! gyp http GET https://nodejs.org/download/release/v16.9.1/SHASUMS256.txt
Step #1 - "install_npm_packages": npm ERR! gyp http 200 https://nodejs.org/download/release/v16.9.1/SHASUMS256.txt
Step #1 - "install_npm_packages": npm ERR! (node:27) [DEP0150] DeprecationWarning: Setting process.config is deprecated. In the future the property will be read-only.
Step #1 - "install_npm_packages": npm ERR! (Use `node --trace-deprecation ...` to show where the warning was created)
Step #1 - "install_npm_packages": npm ERR! gyp info spawn /usr/bin/python3
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args [
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '/usr/lib/node_modules/npm/node_modules/node-gyp/gyp/gyp_main.py',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   'binding.gyp',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '-f',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   'make',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '-I',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '/workspace/node_modules/fast-text-v1.0.2/build/config.gypi',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '-I',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '/usr/lib/node_modules/npm/node_modules/node-gyp/addon.gypi',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '-I',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '/builder/home/.cache/node-gyp/16.9.1/include/node/common.gypi',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '-Dlibrary=shared_library',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '-Dvisibility=default',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '-Dnode_root_dir=/builder/home/.cache/node-gyp/16.9.1',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '-Dnode_gyp_dir=/usr/lib/node_modules/npm/node_modules/node-gyp',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '-Dnode_lib_file=/builder/home/.cache/node-gyp/16.9.1/<(target_arch)/node.lib',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '-Dmodule_root_dir=/workspace/node_modules/fast-text-v1.0.2',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '-Dnode_engine=v8',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '--depth=.',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '--no-parallel',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '--generator-output',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   'build',
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args   '-Goutput_dir=.'
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args ]
Step #1 - "install_npm_packages": npm ERR! gyp info spawn make
Step #1 - "install_npm_packages": npm ERR! gyp info spawn args [ 'BUILDTYPE=Release', '-C', 'build' ]
Step #1 - "install_npm_packages": npm ERR! ../lib/src/args.cc: In member function 'void fasttext::Args::parseArgs(const std::vector<std::__cxx11::basic_string<char> >&)':
Step #1 - "install_npm_packages": npm ERR! ../lib/src/args.cc:162:19: warning: catching polymorphic type 'class std::out_of_range' by value [-Wcatch-value=]
Step #1 - "install_npm_packages": npm ERR!      } catch (std::out_of_range) {
Step #1 - "install_npm_packages": npm ERR!                    ^~~~~~~~~~~~
Step #1 - "install_npm_packages": npm ERR! In file included from /builder/home/.cache/node-gyp/16.9.1/include/node/v8.h:30,
Step #1 - "install_npm_packages": npm ERR!                  from /builder/home/.cache/node-gyp/16.9.1/include/node/node.h:63,
Step #1 - "install_npm_packages": npm ERR!                  from ../src/nodeArgument.cc:8:
Step #1 - "install_npm_packages": npm ERR! /builder/home/.cache/node-gyp/16.9.1/include/node/v8-internal.h: In function 'void v8::internal::PerformCastCheck(T*)':
Step #1 - "install_npm_packages": npm ERR! /builder/home/.cache/node-gyp/16.9.1/include/node/v8-internal.h:489:38: error: 'remove_cv_t' is not a member of 'std'
Step #1 - "install_npm_packages": npm ERR!              !std::is_same<Data, std::remove_cv_t<T>>::value>::Perform(data);
Step #1 - "install_npm_packages": npm ERR!                                       ^~~~~~~~~~~
Step #1 - "install_npm_packages": npm ERR! /builder/home/.cache/node-gyp/16.9.1/include/node/v8-internal.h:489:38: note: suggested alternative: 'remove_cv'
Step #1 - "install_npm_packages": npm ERR!              !std::is_same<Data, std::remove_cv_t<T>>::value>::Perform(data);
Step #1 - "install_npm_packages": npm ERR!                                       ^~~~~~~~~~~
Step #1 - "install_npm_packages": npm ERR!                                       remove_cv
Step #1 - "install_npm_packages": npm ERR! /builder/home/.cache/node-gyp/16.9.1/include/node/v8-internal.h:489:38: error: 'remove_cv_t' is not a member of 'std'
Step #1 - "install_npm_packages": npm ERR! /builder/home/.cache/node-gyp/16.9.1/include/node/v8-internal.h:489:38: note: suggested alternative: 'remove_cv'
Step #1 - "install_npm_packages": npm ERR!              !std::is_same<Data, std::remove_cv_t<T>>::value>::Perform(data);
Step #1 - "install_npm_packages": npm ERR!                                       ^~~~~~~~~~~
Step #1 - "install_npm_packages": npm ERR!                                       remove_cv
Step #1 - "install_npm_packages": npm ERR! /builder/home/.cache/node-gyp/16.9.1/include/node/v8-internal.h:489:50: error: template argument 2 is invalid
Step #1 - "install_npm_packages": npm ERR!              !std::is_same<Data, std::remove_cv_t<T>>::value>::Perform(data);
Step #1 - "install_npm_packages": npm ERR!                                                   ^
Step #1 - "install_npm_packages": npm ERR! /builder/home/.cache/node-gyp/16.9.1/include/node/v8-internal.h:489:63: error: '::Perform' has not been declared
Step #1 - "install_npm_packages": npm ERR!              !std::is_same<Data, std::remove_cv_t<T>>::value>::Perform(data);
Step #1 - "install_npm_packages": npm ERR!                                                                ^~~~~~~
Step #1 - "install_npm_packages": npm ERR! /builder/home/.cache/node-gyp/16.9.1/include/node/v8-internal.h:489:63: note: suggested alternative: 'perror'
Step #1 - "install_npm_packages": npm ERR!              !std::is_same<Data, std::remove_cv_t<T>>::value>::Perform(data);
Step #1 - "install_npm_packages": npm ERR!                                                                ^~~~~~~
Step #1 - "install_npm_packages": npm ERR!                                                                perror
Step #1 - "install_npm_packages": npm ERR! ../src/nodeArgument.cc: In member function 'v8::Local<v8::Object> NodeArgument::NodeArgument::mapToObject(std::map<std::__cxx11::basic_string<char>, std::__cxx11::basic_string<char> >)':
Step #1 - "install_npm_packages": npm ERR! ../src/nodeArgument.cc:211:7: warning: ignoring return value of 'v8::Maybe<bool> v8::Object::Set(v8::Local<v8::Context>, v8::Local<v8::Value>, v8::Local<v8::Value>)', declared with attribute warn_unused_result [-Wunused-result]
Step #1 - "install_npm_packages": npm ERR!        );
Step #1 - "install_npm_packages": npm ERR!        ^
Step #1 - "install_npm_packages": npm ERR! In file included from /builder/home/.cache/node-gyp/16.9.1/include/node/node.h:63,
Step #1 - "install_npm_packages": npm ERR!                  from ../src/nodeArgument.cc:8:
Step #1 - "install_npm_packages": npm ERR! /builder/home/.cache/node-gyp/16.9.1/include/node/v8.h:3961:37: note: declared here
Step #1 - "install_npm_packages": npm ERR!    V8_WARN_UNUSED_RESULT Maybe<bool> Set(Local<Context> context,
Step #1 - "install_npm_packages": npm ERR!                                      ^~~
Step #1 - "install_npm_packages": npm ERR! make: *** [fasttext.target.mk:150: Release/obj.target/fasttext/src/nodeArgument.o] Error 1
Step #1 - "install_npm_packages": npm ERR! gyp ERR! build error 
Step #1 - "install_npm_packages": npm ERR! gyp ERR! stack Error: `make` failed with exit code: 2
Step #1 - "install_npm_packages": npm ERR! gyp ERR! stack     at ChildProcess.onExit (/usr/lib/node_modules/npm/node_modules/node-gyp/lib/build.js:194:23)
Step #1 - "install_npm_packages": npm ERR! gyp ERR! stack     at ChildProcess.emit (node:events:394:28)
Step #1 - "install_npm_packages": npm ERR! gyp ERR! stack     at Process.ChildProcess._handle.onexit (node:internal/child_process:290:12)
Step #1 - "install_npm_packages": npm ERR! gyp ERR! System Linux 5.10.0-18-cloud-amd64
Step #1 - "install_npm_packages": npm ERR! gyp ERR! command "/usr/bin/node" "/usr/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "rebuild"
Step #1 - "install_npm_packages": npm ERR! gyp ERR! cwd /workspace/node_modules/fast-text-v1.0.2
Step #1 - "install_npm_packages": npm ERR! gyp ERR! node -v v16.9.1
Step #1 - "install_npm_packages": npm ERR! gyp ERR! node-gyp -v v7.1.2
Step #1 - "install_npm_packages": npm ERR! gyp ERR! not ok
Step #1 - "install_npm_packages": 
Step #1 - "install_npm_packages": npm ERR! A complete log of this run can be found in:
Step #1 - "install_npm_packages": npm ERR!     /builder/home/.npm/_logs/2022-12-01T03_29_21_009Z-debug.log
Finished Step #1 - "install_npm_packages"
ERROR
ERROR: build step 1 "gcr.io/bqutil/bq_udf_ci" failed: step exited with non-zero status: 1

This is my js_libs.yaml file:

compromise:
  versions:
    - 11.14.3
js-levenshtein:
  versions:
    - 1.1.6
jstat:
  versions:
    - 1.9.3
    - 1.9.4
cld:
  versions:
    - 2.8.4
languagedetect:
  versions:
    - 2.0.0
fast-text:
  versions:
    - 1.0.2

danieldeleo commented 1 year ago

cld and fast-text are the problematic libraries. The build succeeds without those two.

Our build process uses webpack to create one JS file for UDFs to reference. Can you try building those two libraries with webpack outside our UDF build process? I think the problem is a compatibility issue between those libraries and webpack.
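One possible explanation, not confirmed in this thread: the log above shows fast-text running node-gyp rebuild, which is how native (C++) Node addons are compiled, and compiled addons generally cannot be bundled into a plain JS file. The sketch below is a heuristic check for that situation; the function name is hypothetical and the signals it looks for (a binding.gyp file, a gypfile flag, or an install script that calls node-gyp) are common conventions rather than a guarantee.

```python
import json
from pathlib import Path


def looks_like_native_addon(package_dir):
    """Heuristic: True if an installed npm package appears to compile native code."""
    package_dir = Path(package_dir)

    # node-gyp projects normally ship a binding.gyp at the package root.
    if (package_dir / "binding.gyp").exists():
        return True

    manifest = package_dir / "package.json"
    if not manifest.exists():
        return False

    meta = json.loads(manifest.read_text(encoding="utf-8"))
    install_script = meta.get("scripts", {}).get("install", "")
    return bool(meta.get("gypfile")) or "node-gyp" in install_script
```

Running this over each directory under node_modules after a local npm install, for example the /workspace/node_modules/fast-text-v1.0.2 directory named in the log, would show which libraries are likely to fail before a full build is submitted.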

jonathan-telemetry commented 1 year ago

Thanks! I was able to get it to work after removing cld and fast-text. I haven't tried manually webpacking them yet, but I will. I may also try adding them to the "minify" exception list to see if that helps, adding one at a time to see what breaks.

Appreciate the help here!

jonathan-telemetry commented 1 year ago

I'm closing this issue. We were able to get the build working properly. (Note: we did test cld, and webpacking it manually did not work; webpack reported an error.)