apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[Bug]: Automatic model refresh notebook broken #28773

Closed damccorm closed 1 year ago

damccorm commented 1 year ago

What happened?

As I was working through the automatic model refresh notebook, I found the following bugs:

  1. The main session isn't saved correctly, so dependencies aren't available at runtime. When save_main_session is specified, it fails because it's not able to correctly pickle the file
  2. The notebook relies on the read_image function from an example; it should just inline it
  3. Models are saved in an unusable format and thus can't be loaded

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

damccorm commented 1 year ago

  4. It looks like the model updates aren't actually propagating through either, e.g. https://pantheon.corp.google.com/dataflow/jobs/us-central1/2023-10-02_10_40_00-5726123037803609215

damccorm commented 1 year ago

Fixes I've gotten working so far:

  1. Main session isn't saved correctly, leading to dependencies not being available at runtime. When save_main_session is specified, it fails because it's not able to correctly pickle the file

     1. Add the save_main_session flag
     2. Update requirements.txt to use tensorflow_hub instead of tensorflow-hub
     3. Put the Colab auth and its dependency in a function, then invoke that function, so that it doesn't get automatically imported when the main session is loaded
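
A rough sketch of what fixes 1 and 3 above could look like, not the exact notebook code. The helper names `authenticate_colab` and `build_pipeline_args` are made up for illustration, and the `requirements.txt` path is an assumption; `--save_main_session` and `--requirements_file` are the standard Beam Python pipeline flags.

```python
import sys


def authenticate_colab():
    """Authenticate the Colab user. Call this explicitly from a notebook cell."""
    # Deferred import: this line only runs when the function is called, so
    # defining the function does not import google.colab by itself.
    from google.colab import auth
    auth.authenticate_user()


def build_pipeline_args(requirements_file="requirements.txt"):
    """Return command-line style flags to hand to Beam's PipelineOptions."""
    return [
        "--save_main_session",
        f"--requirements_file={requirements_file}",
    ]


pipeline_args = build_pipeline_args()

# Only true if something else in this process has already imported google.colab.
colab_imported = "google.colab" in sys.modules
```

In the notebook, `pipeline_args` would be passed to `PipelineOptions(pipeline_args)` and `authenticate_colab()` would be invoked once in its own cell.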
  2. Relies on read_image function from an example, should just inline it

     Inlined the function.

  3. Models are saved in an unusable format and thus can't be loaded

     Instead of downloading directly, for each model type do something like:

     import tensorflow as tf
     # Build the Keras model, then save the whole model (architecture and weights).
     model = tf.keras.applications.resnet.ResNet152()
     model.save('/path/to/model/resnet152_weights_tf_dim_ordering_tf_kernels.h5')
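
To repeat the snippet above "for each model type", a small loop helper is one option. This is only a sketch: `save_full_models` is a hypothetical name, and it assumes each builder returns an object with a Keras-style `save(path)` method.

```python
import os


def save_full_models(builders, output_dir):
    """Build each model and save it under output_dir.

    builders maps an output file name to a zero-argument callable that
    returns an object with a Keras-style save(path) method.
    """
    os.makedirs(output_dir, exist_ok=True)
    saved_paths = []
    for file_name, build_model in builders.items():
        model = build_model()
        path = os.path.join(output_dir, file_name)
        # Save the complete model rather than relying on a downloaded file.
        model.save(path)
        saved_paths.append(path)
    return saved_paths
```

With TensorFlow installed, this could be called as `save_full_models({'resnet152_weights_tf_dim_ordering_tf_kernels.h5': tf.keras.applications.resnet.ResNet152}, '/path/to/model')`, adding one entry per model type the notebook uses.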
  4. It looks like the model updates aren't actually propagating through either, e.g. https://pantheon.corp.google.com/dataflow/jobs/us-central1/2023-10-02_10_40_00-5726123037803609215

     Not sure yet