deep-diver / semantic-segmentation-ml-pipeline

Machine Learning Pipeline for Semantic Segmentation with TensorFlow Extended (TFX) and various GCP products
https://blog.tensorflow.org/2023/01/end-to-end-pipeline-for-segmentation-tfx-google-cloud-hugging-face.html
Apache License 2.0

Create/host raw size input images #23

Closed deep-diver closed 2 years ago

deep-diver commented 2 years ago

Currently, we have reduced the original input images down to (256, 256) because of the limited capacity of the default VM in Vertex AI Pipelines.

To overcome this limitation, we should use Dataflow. It has now been tested/verified that the ImportExampleGen component can delegate its job to Dataflow, so we need to experiment with whether Dataflow can handle the larger input images.
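For reference, a minimal sketch (not the repo's actual pipeline code) of how an ImportExampleGen component can delegate its Beam job to Dataflow via `beam_pipeline_args`; the GCP project, region, and bucket values are placeholders:

```python
from tfx import v1 as tfx

# Hypothetical GCP settings -- replace with the project's real values.
GOOGLE_CLOUD_PROJECT = "my-gcp-project"
GOOGLE_CLOUD_REGION = "us-central1"
GCS_BUCKET = "gs://my-bucket"

# Beam args that make the component run its job on Dataflow workers
# instead of the local DirectRunner inside the Vertex AI Pipelines VM.
DATAFLOW_BEAM_ARGS = [
    "--runner=DataflowRunner",
    f"--project={GOOGLE_CLOUD_PROJECT}",
    f"--region={GOOGLE_CLOUD_REGION}",
    f"--temp_location={GCS_BUCKET}/dataflow/tmp",
]

# ImportExampleGen reads pre-built TFRecords from a GCS prefix; with the
# Beam args above, the read/split work is executed on Dataflow.
example_gen = tfx.components.ImportExampleGen(
    input_base=f"{GCS_BUCKET}/tfrecords/full-size"
).with_beam_pipeline_args(DATAFLOW_BEAM_ARGS)
```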

So, please write a script in a separate branch that builds TFRecords from the raw-size images and host the generated TFRecords in a GCS bucket (a rough sketch of such a script is below). After that, I will create another branch to test the data with Dataflow.
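A rough sketch of such a script, assuming pairs of raw-size JPEG images and PNG segmentation masks on local disk; the directory paths, feature names, and GCS output pattern are illustrative, not the repo's actual schema:

```python
import os
import tensorflow as tf

IMAGE_DIR = "data/images"          # hypothetical local input directories
MASK_DIR = "data/annotations"
OUTPUT_PATTERN = "gs://my-bucket/tfrecords/full-size/data-{:03d}.tfrecord"
EXAMPLES_PER_SHARD = 200


def _bytes_feature(value: bytes) -> tf.train.Feature:
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def make_example(image_path: str, mask_path: str) -> tf.train.Example:
    # Store the encoded bytes as-is so no resizing happens at this stage.
    image_bytes = tf.io.read_file(image_path).numpy()
    mask_bytes = tf.io.read_file(mask_path).numpy()
    return tf.train.Example(features=tf.train.Features(feature={
        "image": _bytes_feature(image_bytes),
        "segmentation_mask": _bytes_feature(mask_bytes),
    }))


filenames = sorted(os.listdir(IMAGE_DIR))
writer, shard = None, 0
for i, name in enumerate(filenames):
    if i % EXAMPLES_PER_SHARD == 0:
        if writer:
            writer.close()
        # tf.io.TFRecordWriter can write directly to GCS given a gs:// path.
        writer = tf.io.TFRecordWriter(OUTPUT_PATTERN.format(shard))
        shard += 1
    mask_name = os.path.splitext(name)[0] + ".png"
    example = make_example(os.path.join(IMAGE_DIR, name),
                           os.path.join(MASK_DIR, mask_name))
    writer.write(example.SerializeToString())
if writer:
    writer.close()
```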