klay-music / klay-beam

Our Apache Beam Transforms and Pipelines

Migrate discogs effnet #49

Closed CharlesHolbrow closed 1 year ago

CharlesHolbrow commented 1 year ago

Update job_discogs_effnet to pin klay-beam (without PyTorch) and to use a modern v1.0-style Docker container by default when running via Dataflow. Ensure that the job has:

  1. a default Docker container (no need to specify --sdk_container_image)
  2. a suitable launch+dev environment in <job_dir>/environment/dev.yml
  3. pinned, compatible versions of klay_beam and apache_beam that match the default Docker container
  4. valid example launch invocations in its README
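
For item 1, here is a minimal sketch of one way a job script could supply a default container, so that --sdk_container_image only needs to be passed when overriding it. The image name in DEFAULT_IMAGE and the parse_args helper are hypothetical placeholders for illustration, not the values or code actually used by job_discogs_effnet.

```python
import argparse

# Hypothetical default image. The real registry path and tag are not given
# in this issue, so treat this value as a placeholder.
DEFAULT_IMAGE = "example-registry/klay-beam-discogs-effnet:0.0.0"


def parse_args(argv=None):
    """Split job arguments from Beam pipeline arguments.

    If --sdk_container_image is not supplied, fall back to DEFAULT_IMAGE.
    The chosen image is always appended to the pipeline arguments, so it
    can be forwarded to Beam along with the other unrecognized flags.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--sdk_container_image",
        default=DEFAULT_IMAGE,
        help="Worker container image (defaults to the job's pinned image).",
    )
    known_args, pipeline_args = parser.parse_known_args(argv)
    pipeline_args.append(f"--sdk_container_image={known_args.sdk_container_image}")
    return known_args, pipeline_args
```

The returned pipeline_args list could then be passed on to the Beam pipeline options, so a launch invocation in the README would not need to mention the image at all.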

One non-trivial change for this job is the new Docker build process. We still build a custom Docker image, but we build it "FROM" a klay_beam:bla-blah image. This results in a MUCH simpler local Dockerfile.