JerryX1110 / RPCMVOS

[AAAI22 Oral] Reliable Propagation-Correction Modulation for Video Object Segmentation

Custom testing images - empty reference labels #9

Closed. Student204161 closed this issue 1 year ago

Student204161 commented 1 year ago

Hi there,

I have downloaded and trained RPCMVOS on YT-VOS19, but when I try to evaluate custom testing images I get errors, and I have narrowed the problem down to how the reference label is defined.

By adding a new if-statement in eval_manager_rpa.py that uses the existing YOUTUBE_VOS_Test(), I am able to run RPCMVOS on YT-VOS19 examples and on my own images, but only if I replace the corresponding annotated image with a dummy annotated image from YT-VOS19. I am not sure what annotation format the code expects.

The error I get is:

Traceback (most recent call last):
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/tools/eval_rpa.py", line 79, in <module>
    main()
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/tools/eval_rpa.py", line 76, in main
    evaluator.evaluating()
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/engine/eval_manager_rpa.py", line 235, in evaluating
    all_pred, current_embedding,memory_cur_list = self.model.forward_for_eval(memory_prev_all_list[aug_idx], ref_emb, 
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 85, in forward_for_eval
    tmp_dic, _ ,memory_cur_list= self.before_seghead_process(
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 179, in before_seghead_process
    global_matching_fg = global_matching_for_eval(
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/layers/matching.py", line 282, in global_matching_for_eval
    reference_labels_flat = reference_labels.view(-1, obj_nums)
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
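
For what it's worth, the error can be reproduced in isolation (a minimal sketch, assuming both the reference-label tensor and obj_nums end up empty/zero because no object IDs are found in the annotation):

    import torch

    reference_labels = torch.zeros(0)        # empty reference-label tensor (no objects found)
    obj_nums = 0
    try:
        reference_labels.view(-1, obj_nums)
    except RuntimeError as e:
        print(e)  # cannot reshape tensor of 0 elements into shape [-1, 0] ...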

When I debug, reference_labels turns out to be an empty tensor. I have also tried converting my annotated image into an RGB image, which I am not sure I did correctly, but then I get this error:

    main()
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/tools/eval_rpa.py", line 76, in main
    evaluator.evaluating()
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/engine/eval_manager_rpa.py", line 235, in evaluating
    all_pred, current_embedding,memory_cur_list = self.model.forward_for_eval(memory_prev_all_list[aug_idx], ref_emb, 
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 85, in forward_for_eval
    tmp_dic, _ ,memory_cur_list= self.before_seghead_process(
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 176, in before_seghead_process
    seq_ref_frame_label = seq_ref_frame_label.squeeze(1).permute(1,2,0)
RuntimeError: permute(sparse_coo): number of dimensions in the tensor input does not match the length of the desired ordering of dimensions i.e. input.dim() = 4 is not equal to len(dims) = 3
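
My guess about this second error (the shapes below are assumptions, not checked against the repo) is that a single-channel label loads as (batch, 1, H, W), so squeeze(1) leaves a 3-D tensor that permute(1, 2, 0) accepts, whereas a 3-channel RGB "label" stays 4-D after squeeze(1) and the 3-element permutation fails:

    import torch

    gray = torch.zeros(1, 1, 64, 64)               # 1-channel label map
    print(gray.squeeze(1).permute(1, 2, 0).shape)  # torch.Size([64, 64, 1])

    rgb = torch.zeros(1, 3, 64, 64)                # 3-channel "label" map
    try:
        rgb.squeeze(1).permute(1, 2, 0)            # still 4-D after squeeze(1)
    except RuntimeError as e:
        print(e)  # number of dimensions does not match the desired ordering (4 vs 3)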

My annotated image: 00000 (attached). My annotated image after RGB conversion: 00000 (1) (attached).

This leads me to ask: what format is the annotated image expected to be in?

Thanks for your attention.

Student204161 commented 1 year ago

I figured out my problem. The annotation tool I used returned a greyscale (1-channel) image, as it should, but the mask was not a clean single-value mask (if you inspect the PNG I linked above, there are grey pixels that should not be there). By setting all non-zero pixels to the value 1, the model works.
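
In case it helps anyone else, this is roughly what I did to clean the mask (a minimal sketch; the file path is a placeholder and it assumes a single object with ID 1):

    import numpy as np
    from PIL import Image

    path = "Annotations/my_sequence/00000.png"       # placeholder path
    mask = np.array(Image.open(path).convert("L"))   # 1-channel greyscale mask
    mask = (mask > 0).astype(np.uint8)               # grey/anti-aliased pixels -> object ID 1
    Image.fromarray(mask).save(path)

For multiple objects, each object needs its own integer ID (1, 2, 3, ...) rather than a single binary value.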

JerryX1110 commented 1 year ago

Hi, I think you are trying to format your custom dataset into the YouTube-VOS format for evaluation. My suggestions are as follows:
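
As a rough sketch of the YouTube-VOS-style layout the evaluator reads (the directory names below follow the standard YT-VOS 2019 valid split, and the small checker function is only illustrative, not part of this repo):

    # Expected layout (illustrative):
    #   valid/
    #     JPEGImages/<sequence>/00000.jpg, 00005.jpg, ...
    #     Annotations/<sequence>/00000.png   # first-frame mask; pixel value = object ID, 0 = background
    #     meta.json                          # lists each sequence and its object IDs
    import numpy as np
    from PIL import Image

    def check_first_frame_annotation(path):
        """Hypothetical helper: warn if a first-frame mask is not a single-channel
        index/greyscale image whose pixel values are small object IDs."""
        img = Image.open(path)
        if img.mode not in ("P", "L"):
            print(f"{path}: mode {img.mode} (expected palette 'P' or greyscale 'L')")
            return
        ids = np.unique(np.array(img))
        print(f"{path}: pixel values = {ids.tolist()}")  # ideally 0, 1, 2, ...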

JerryX1110 commented 1 year ago

Glad to know that you have solved the problem.