JerryX1110 / RPCMVOS

[AAAI22 Oral] Reliable Propagation-Correction Modulation for Video Object Segmentation

Custom testing images - empty reference labels #9

Closed. Student204161 closed this issue 1 year ago

Student204161 commented 1 year ago

Hi there,

I have downloaded and trained RPCMVOS on YT-VOS19, but when I try to evaluate custom testing images I get errors, and I have narrowed the problem down to how the reference label is defined.

By adding a new if-statement in eval_manager_rpa.py that uses the existing YOUTUBE_VOS_Test(), I am able to run RPCMVOS on YT-VOS19 examples and on my own images, but only if I replace the corresponding annotated image with a dummy annotated image from YT-VOS19. I am not sure what annotation format the code expects.

The error I get is:

Traceback (most recent call last):
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/tools/eval_rpa.py", line 79, in <module>
    main()
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/tools/eval_rpa.py", line 76, in main
    evaluator.evaluating()
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/engine/eval_manager_rpa.py", line 235, in evaluating
    all_pred, current_embedding,memory_cur_list = self.model.forward_for_eval(memory_prev_all_list[aug_idx], ref_emb, 
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 85, in forward_for_eval
    tmp_dic, _ ,memory_cur_list= self.before_seghead_process(
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 179, in before_seghead_process
    global_matching_fg = global_matching_for_eval(
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/layers/matching.py", line 282, in global_matching_for_eval
    reference_labels_flat = reference_labels.view(-1, obj_nums)
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous
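
For what it's worth, the error can be reproduced in isolation (a minimal sketch, assuming both the reference-label tensor and obj_nums end up empty/zero because no object IDs are found in the annotation):

    import torch

    reference_labels = torch.zeros(0)        # empty reference-label tensor (no objects found)
    obj_nums = 0
    try:
        reference_labels.view(-1, obj_nums)
    except RuntimeError as e:
        print(e)  # cannot reshape tensor of 0 elements into shape [-1, 0] ...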

When I debug, reference_labels turns out to be an empty tensor. I have also tried converting my annotated image into an RGB image, which I am not sure I did correctly, but then I get this error:

    main()
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/tools/eval_rpa.py", line 76, in main
    evaluator.evaluating()
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/engine/eval_manager_rpa.py", line 235, in evaluating
    all_pred, current_embedding,memory_cur_list = self.model.forward_for_eval(memory_prev_all_list[aug_idx], ref_emb, 
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 85, in forward_for_eval
    tmp_dic, _ ,memory_cur_list= self.before_seghead_process(
  File "/zhome/a7/0/155527/Desktop/Bachelor-Project/VOS_models/RPCMVOS/./networks/rpcm/rpcm.py", line 176, in before_seghead_process
    seq_ref_frame_label = seq_ref_frame_label.squeeze(1).permute(1,2,0)
RuntimeError: permute(sparse_coo): number of dimensions in the tensor input does not match the length of the desired ordering of dimensions i.e. input.dim() = 4 is not equal to len(dims) = 3
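
My guess about this second error (the shapes below are assumptions, not checked against the repo) is that a single-channel label loads as (batch, 1, H, W), so squeeze(1) leaves a 3-D tensor that permute(1, 2, 0) accepts, whereas a 3-channel RGB "label" stays 4-D after squeeze(1) and the 3-element permutation fails:

    import torch

    gray = torch.zeros(1, 1, 64, 64)               # 1-channel label map
    print(gray.squeeze(1).permute(1, 2, 0).shape)  # torch.Size([64, 64, 1])

    rgb = torch.zeros(1, 3, 64, 64)                # 3-channel "label" map
    try:
        rgb.squeeze(1).permute(1, 2, 0)            # still 4-D after squeeze(1)
    except RuntimeError as e:
        print(e)  # number of dimensions does not match the desired ordering (4 vs 3)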

My annotated image: 00000 (attached). My annotated image after RGB conversion: 00000 (1) (attached).

This leads me to ask: what format is the annotated image expected to be in?

Thanks for your attention.

Student204161 commented 1 year ago

I figured out my problem. The annotation tool I used returned a greyscale (1-channel) image, as it should, but the mask was not a clean single-value mask (if you inspect the PNG I linked above, there are grey pixels that should not be there). By setting all non-zero pixels to the value 1, the model works.
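
In case it helps anyone else, this is roughly what I did to clean the mask (a minimal sketch; the file path is a placeholder and it assumes a single object with ID 1):

    import numpy as np
    from PIL import Image

    path = "Annotations/my_sequence/00000.png"       # placeholder path
    mask = np.array(Image.open(path).convert("L"))   # 1-channel greyscale mask
    mask = (mask > 0).astype(np.uint8)               # grey/anti-aliased pixels -> object ID 1
    Image.fromarray(mask).save(path)

For multiple objects, each object needs its own integer ID (1, 2, 3, ...) rather than a single binary value.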

JerryX1110 commented 1 year ago

Hi, I think you are trying to format your custom dataset into the YouTube-VOS format for evaluation. My suggestions are as follows:
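
As a rough sketch of the YouTube-VOS-style layout the evaluator reads (the directory names below follow the standard YT-VOS 2019 valid split, and the small checker function is only illustrative, not part of this repo):

    # Expected layout (illustrative):
    #   valid/
    #     JPEGImages/<sequence>/00000.jpg, 00005.jpg, ...
    #     Annotations/<sequence>/00000.png   # first-frame mask; pixel value = object ID, 0 = background
    #     meta.json                          # lists each sequence and its object IDs
    import numpy as np
    from PIL import Image

    def check_first_frame_annotation(path):
        """Hypothetical helper: warn if a first-frame mask is not a single-channel
        index/greyscale image whose pixel values are small object IDs."""
        img = Image.open(path)
        if img.mode not in ("P", "L"):
            print(f"{path}: mode {img.mode} (expected palette 'P' or greyscale 'L')")
            return
        ids = np.unique(np.array(img))
        print(f"{path}: pixel values = {ids.tolist()}")  # ideally 0, 1, 2, ...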

JerryX1110 commented 1 year ago

Glad to know that you have solved the problem.