hkchengrex / Tracking-Anything-with-DEVA

[ICCV 2023] Tracking Anything with Decoupled Video Segmentation
https://hkchengrex.com/Tracking-Anything-with-DEVA/

mask problem #101

Open HibaDoi opened 1 week ago

HibaDoi commented 1 week ago

I'm using the evaluation method that involves masking images and using a .json file to track objects. However, I'm facing an issue where the masks I initially apply are not the same as those returned afterward. There's a significant decline in quality, and sometimes the masks don't appear at all. Do you have any suggestions?

python evaluation/eval_with_detections.py --mask_path C:/Workflow_hiba/3_Tracking/source --img_path C:/Workflow_hiba/3_Tracking/images --dataset demo --temporal_setting semionline --output C:/Workflow_hiba/3_Tracking/output222 --chunk_size 1

hkchengrex commented 1 week ago

Can you be more specific? If the input masks are inconsistent, the voting algorithm in the semi-online setting might rule them out.
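To illustrate why an inconsistent input mask can vanish under a voting scheme, here is a toy majority vote over a few frames. It is not DEVA's actual consensus code: the function names, the 0.5 IoU threshold, and the assumption that the neighboring masks have already been aligned to the current frame are all illustrative.

```python
import numpy as np


def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two masks with the same shape."""
    a = a.astype(bool)
    b = b.astype(bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(a, b).sum() / union)


def supported_by_vote(candidate, neighbor_frames, iou_threshold=0.5):
    """Toy vote: keep `candidate` only if a strict majority of frames agree.

    `neighbor_frames` is a list with one inner list of masks per neighboring
    frame (hypothetical layout). The candidate's own frame counts as one vote.
    """
    votes = 1  # the frame that produced the candidate
    for masks in neighbor_frames:
        if any(mask_iou(candidate, m) >= iou_threshold for m in masks):
            votes += 1
    total = 1 + len(neighbor_frames)
    return votes * 2 > total
```

With this rule, a mask that matches nothing in either of two neighboring frames collects one vote out of three and is dropped, which is the kind of behavior that could make a valid-looking input mask disappear.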

HibaDoi commented 1 week ago
  1. Despite inputting a precise mask for the object I wish to track in successive frames, the segmentation performed by DEVA produces masks of poor quality.
  2. Sometimes two masks are visible in the input image, but only one mask is present in the output.
  3. In some instances, although the input shows two masks, the output contains only one object, which appears split into two parts, and the second object is ignored entirely.
  4. Occasionally, objects are tracked incorrectly even when they are clearly present in the input masks, or the tracker incorrectly identifies them as new objects.
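One way to make these symptoms measurable is to compare each input object with the output of the same frame. The sketch below assumes both masks have already been decoded into 2D integer ID maps with 0 as background. That decoding step depends on the mask encoding, which is not stated here, and the function name is hypothetical. Because output IDs need not match input IDs, each input object is paired with its best-overlapping output object.

```python
import numpy as np


def best_match_per_object(input_ids, output_ids, background=0):
    """Map each input object ID to (best output ID or None, IoU)."""
    report = {}
    for obj in np.unique(input_ids):
        if obj == background:
            continue
        in_mask = input_ids == obj
        best_id, best_iou = None, 0.0
        # Only output IDs that overlap the input object can have IoU > 0.
        for out in np.unique(output_ids[in_mask]):
            if out == background:
                continue
            out_mask = output_ids == out
            inter = np.logical_and(in_mask, out_mask).sum()
            union = np.logical_or(in_mask, out_mask).sum()
            iou = float(inter / union)
            if iou > best_iou:
                best_id, best_iou = int(out), iou
        report[int(obj)] = (best_id, best_iou)
    return report
```

An entry of `(None, 0.0)` marks an input object with no counterpart in the output (items 2 and 3), while a low IoU points to degraded mask quality (item 1).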

I believe the challenges stem from the nature of my data. I'm attempting to track lampposts while the camera is in motion, and the size of the lampposts varies from frame to frame.

Do you have any suggestions for improving tracking, especially considering training the model on stationary objects while the camera is in motion, rather than the conventional approach of tracking moving objects?

HibaDoi commented 1 week ago

comparaison.pdf: here is a file that contains the input masks and the output masks from the tracking.

hkchengrex commented 1 week ago

Thank you for the update. Is there only one frame as input per video?

HibaDoi commented 1 week ago

Thank you for responding. No, there are more than 100 frames.

hkchengrex commented 1 week ago

I mean annotated frames with masks.

HibaDoi commented 1 week ago

I created a directory containing: images => 100 => [100 frames]; source => 100 => [100 masks + 100 JSON files].

hkchengrex commented 1 week ago

For debugging, can you try tracking with just one mask (i.e., the first frame)? This isolates the tracking from the detections.

HibaDoi commented 1 week ago

Can you elaborate, please? The first frame can look fine.

hkchengrex commented 4 days ago

DEVA combines detections with propagated masks. Tracking from a single mask helps to identify whether it is a problem of merging/detection or a problem of propagation.
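A minimal sketch for preparing such a single-mask test follows. It assumes the layout described earlier in the thread (one folder holding a mask and a .json per frame, sharing a file stem) and zero-padded file names so that sorted order equals frame order. The helper name and the output folder are hypothetical, and the thread does not confirm which script should consume the result.

```python
from pathlib import Path
import shutil


def copy_first_annotation(src_dir, dst_dir):
    """Copy only the files belonging to the first frame (by sorted name).

    Returns the names of the copied files. Assumes the mask and its .json
    share the same stem, e.g. 0000.png and 0000.json (an assumption).
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    files = sorted(p for p in src.iterdir() if p.is_file())
    if not files:
        raise FileNotFoundError(f"no files found in {src}")
    first_stem = files[0].stem
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for p in files:
        if p.stem == first_stem:
            shutil.copy2(p, dst / p.name)
            copied.append(p.name)
    return copied
```

The original folder is left untouched, so the full-detection run and the single-mask run can be compared side by side.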