tensorflow / models

Models and examples built with TensorFlow

Tensorboard Eval Images with TF-Vision #11270

Open RayanMoarkech opened 1 month ago

RayanMoarkech commented 1 month ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

1. The entire URL of the file you are using

https://www.tensorflow.org/tfmodels/vision/object_detection#load_logs_in_tensorboard

2. Describe the bug

I am following this documentation: https://www.tensorflow.org/tfmodels/vision/object_detection#load_logs_in_tensorboard. When I open TensorBoard and select Images, I get "No image data was found."

I also tried setting EXPERIMENT_CONFIG.task.allow_image_summary = True, but that raises an error, even with the dataset and code given in the documentation.

The error:

ValueError: Expected scalar shape, saw shape: (1, 640, 640, 3).

The code:

model, eval_logs = tfm.core.train_lib.run_experiment(
    distribution_strategy=distribution_strategy,
    task=task,
    mode='train_and_eval',
    params=EXPERIMENT_CONFIG,
    model_dir=paths['MODEL_CHECKPOINT_PATH'],
    run_post_eval=True,
)
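For context on the error above: its wording matches the shape check in tf.debugging.assert_scalar, which tf.summary.scalar applies to its input. The sketch below reproduces the same message by passing an image-shaped batch to that check. Whether this is what happens inside run_experiment when allow_image_summary is enabled is an assumption, not something confirmed in this thread; image_batch and scalar_check_error are hypothetical names used only for illustration.

```python
import numpy as np
import tensorflow as tf

# Stand-in for one evaluation image batch; the shape is taken from the error message.
image_batch = np.zeros((1, 640, 640, 3), dtype=np.float32)


def scalar_check_error(data):
    """Return the message raised by the scalar shape check, or None if it passes."""
    try:
        tf.debugging.assert_scalar(tf.constant(data))
    except ValueError as err:
        return str(err)
    return None


# An image batch fails the scalar check; a plain float passes it.
message = scalar_check_error(image_batch)
print(message)
print(scalar_check_error(1.0))
```

If this is the cause, the image tensor would need to go through tf.summary.image rather than a scalar summary.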

3. Steps to reproduce

Now, try to train again with

Screenshot 2024-10-18 at 1 52 38 AM

4. Expected behavior

I would like the evaluation images for each epoch to be saved to TensorBoard.

5. Additional context

Let me know if you need anything extra

6. System information

bharatjetti commented 3 days ago

Hi @RayanMoarkech, I worked on the problem and reproduced the "No image data was found" issue. I added a piece of code to the existing code: in the show_batch function I added the following lines and made the necessary changes.

with summary_writer.as_default():
    tf.summary.image(f'Image_with_bboxes_{i+1}', np.expand_dims(image, axis=0), step=train_steps)

With that change it works. Here is the notebook that I worked on; please check it. Here is the screenshot.

Screenshot 2024-11-19 at 2 07 29 PM
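A self-contained sketch of the manual logging approach described in this comment, assuming random arrays in place of the notebook's annotated images. The names log_images, fake_images, and log_dir are hypothetical; only tf.summary.create_file_writer, tf.summary.image, and np.expand_dims come from TensorFlow and NumPy.

```python
import glob
import os
import tempfile

import numpy as np
import tensorflow as tf


def log_images(images, log_dir, step, tag_prefix="Image_with_bboxes"):
    """Write each HxWx3 image in `images` as its own TensorBoard image summary."""
    writer = tf.summary.create_file_writer(log_dir)
    with writer.as_default():
        for i, image in enumerate(images):
            # tf.summary.image expects a rank-4 batch [k, height, width, channels],
            # so a single image gets a leading batch axis.
            tf.summary.image(
                f"{tag_prefix}_{i + 1}", np.expand_dims(image, axis=0), step=step
            )
    writer.flush()


# Random uint8 images stand in for the images drawn by show_batch.
log_dir = tempfile.mkdtemp()
fake_images = [
    np.random.randint(0, 255, size=(64, 64, 3), dtype=np.uint8) for _ in range(2)
]
log_images(fake_images, log_dir, step=0)

# The event file written here is what TensorBoard reads for its Images view.
event_files = glob.glob(os.path.join(log_dir, "events.out.tfevents.*"))
print(event_files)
```

Pointing TensorBoard at log_dir should then show the two tags. Note that this only writes images when log_images is called explicitly.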

RayanMoarkech commented 3 days ago

But this will not create an image log at every summary interval while training the model with:

tfm.core.train_lib.run_experiment

Correct me if I'm wrong, but I was not able to get the training run to write an image summary at each summary_interval. Does this mean this approach is manual code that I have to run myself at every train step where I want to stop?

Based on your screenshot, you can see the data is from a "." run.

Screenshot 2024-11-19 at 1 55 21 PM