Right now, GCP permissions, service accounts, and setup are not documented.
We want klay_beam to be documented so that it is agnostic of any specific GCP project (e.g. klay-training and klay-beam-tests).
Job packages should not be open sourced. As a result, it is okay for job package READMEs to include things like the example invocation below, which uses our project klay-beam-tests and our service account dataset-dataflow-worker@klay-beam-tests.iam.gserviceaccount.com:
# This kind of documentation is OK in job packages. It is not OK in the klay_beam package.
python bin/run_job_extract_chroma.py \
--project klay-beam-tests \
--service_account_email dataset-dataflow-worker@klay-beam-tests.iam.gserviceaccount.com \
--machine_type n1-standard-8 \
--region us-central1 \
--max_num_workers 50 \
--autoscaling_algorithm THROUGHPUT_BASED \
--runner DataflowRunner \
--experiments use_runner_v2 \
--sdk_location container \
--setup_file ./setup.py \
--temp_location gs://klay-dataflow-test-000/tmp/extract_chroma/ \
--source_audio_path 'gs://klay-dataflow-test-000/glucose-karaoke/' \
--job_name 'extract-chroma-test-000'
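For documentation that must stay project-agnostic, the same invocation could be written with placeholders in place of the project, service account, and bucket. The following is only a sketch: the variable names `GCP_PROJECT`, `SERVICE_ACCOUNT`, and `BUCKET`, their values, the `source-audio/` path, and the job name are hypothetical, and the flags are copied from the invocation above rather than from any klay_beam reference. It prints the command instead of launching a job.

```shell
#!/bin/sh
# Placeholder values (hypothetical): substitute your own project,
# service account, and bucket. None of these refer to a real project.
GCP_PROJECT="your-gcp-project"
SERVICE_ACCOUNT="your-worker@${GCP_PROJECT}.iam.gserviceaccount.com"
BUCKET="gs://your-bucket"

# Assemble the command as a single string so it can be reviewed first.
# The flags mirror the job-package example; only the values are generic.
JOB_CMD="python bin/run_job_extract_chroma.py \
--project ${GCP_PROJECT} \
--service_account_email ${SERVICE_ACCOUNT} \
--machine_type n1-standard-8 \
--region us-central1 \
--max_num_workers 50 \
--autoscaling_algorithm THROUGHPUT_BASED \
--runner DataflowRunner \
--experiments use_runner_v2 \
--sdk_location container \
--setup_file ./setup.py \
--temp_location ${BUCKET}/tmp/extract_chroma/ \
--source_audio_path ${BUCKET}/source-audio/ \
--job_name extract-chroma-example"

# Dry run: print the command rather than submitting a Dataflow job.
echo "${JOB_CMD}"
```

Once the placeholders have been replaced and the printed command looks right, it can be pasted into a terminal from the job package directory.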
charles-large-b VM). We need this to enable Zach to launch jobs.
klay_beam.run_cuda_test example, removing anything specific to our GCP project