hkchengrex / Tracking-Anything-with-DEVA

[ICCV 2023] Tracking Anything with Decoupled Video Segmentation
https://hkchengrex.com/Tracking-Anything-with-DEVA/
Other
1.27k stars 129 forks

U-net with DEVA #79

Closed wkywwds closed 7 months ago

wkywwds commented 7 months ago

How can I generate data in the format of example/vipseg/source/12_1mWNahzcsAc from the output of a U-Net?

hkchengrex commented 7 months ago

Hi. You can follow the data format and create the mask and .json files. I can provide more information if you have more specific questions.

wkywwds commented 7 months ago

Can the "id" in the JSON file be defined manually, or must it be generated by my own detector? Also, when "isthing" is false, does it represent the background? If so, can I leave the background unmasked? Finally, what is the difference between the "vipseg" and "vos" data formats, and how is each used?

wkywwds commented 7 months ago

1

hkchengrex commented 7 months ago
  1. Doesn't matter.
  2. No. See the difference between thing and stuff in the COCO stuff paper.
  3. VOS is for the VOS task. VIPSeg is for the video panoptic/unsupervised/universal segmentation task.
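
As a rough illustration of points 1 and 2, the sketch below turns a per-pixel class map (such as the argmax of a U-Net output) into an ID mask plus a list of per-segment entries carrying "id" and "isthing". The function name `unet_to_mask_and_segments`, the `thing_classes` argument, and the choice of class 0 as "unlabeled" are hypothetical and not part of DEVA. Only the "id" and "isthing" keys come from this thread; the real .json files may contain further fields and a different nesting, so compare against the files under example/vipseg/source before relying on this layout.

```python
import json

import numpy as np
from PIL import Image


def unet_to_mask_and_segments(class_map, thing_classes):
    """Build an ID mask and segment entries from a 2-D class-index map.

    class_map: integer array of per-pixel class indices; 0 is treated as
               unlabeled here (an assumption of this sketch).
    thing_classes: set of class indices that are countable objects ("things");
                   every other class is treated as "stuff".
    """
    # uint8 limits this sketch to 255 distinct ids per frame.
    id_mask = np.zeros(class_map.shape, dtype=np.uint8)
    segments = []
    next_id = 1  # ids are arbitrary; they only need to match the mask pixels
    for cls in np.unique(class_map):
        cls = int(cls)
        if cls == 0:
            continue
        id_mask[class_map == cls] = next_id
        segments.append({"id": next_id, "isthing": cls in thing_classes})
        next_id += 1
    return id_mask, segments


def save_frame(id_mask, segments, mask_file, json_file):
    """Write the mask as a PNG and the segment entries as JSON."""
    Image.fromarray(id_mask).save(mask_file)
    with open(json_file, "w") as f:
        # Hypothetical layout: a bare list of segment entries.
        json.dump(segments, f)
```

A semantic U-Net produces one region per class, so this sketch gives one id per class. Separating individual instances of the same class would need an extra step that is not shown here.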
wkywwds commented 7 months ago

Suppose the input is not a video but a sequence of images, that is, frames extracted one by one from a video. In that case, do I need to run detection on every image in the sequence and assign an id to each object in each image?

wkywwds commented 7 months ago

Is the "isthing" field mentioned in the second point required?

wkywwds commented 7 months ago

How is the "id" associated with the masked objects in the image, such as horses and people?

hkchengrex commented 7 months ago

The id is the pixel value of the corresponding object. If you read the mask file with PIL, you can inspect the pixel values. The algorithm tracks whatever you detect.
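
A minimal sketch of that inspection, assuming the mask is a PNG readable by PIL. The helper name `list_mask_ids` is hypothetical. For single-channel or palette masks it returns the distinct pixel values; for multi-channel masks it returns the distinct colors without assuming how a color maps to an id.

```python
import numpy as np
from PIL import Image


def list_mask_ids(mask_path):
    """Return the distinct pixel values (or colors) found in a mask file."""
    mask = np.array(Image.open(mask_path))
    if mask.ndim == 2:
        # Grayscale or palette image: each pixel value is one id.
        return np.unique(mask).tolist()
    # Multi-channel image: list each distinct color as a tuple.
    colors = np.unique(mask.reshape(-1, mask.shape[-1]), axis=0)
    return [tuple(int(c) for c in color) for color in colors]
```

Each value returned for a mask should have a matching "id" entry in the corresponding .json file.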

wkywwds commented 7 months ago

I want to ask whether "--json_path" should be a directory or a file.

hkchengrex commented 7 months ago

Directory.

wkywwds commented 7 months ago

Hello, when running the demo from EVALUATION.md, why does the following error occur? The script's parameters are set to: python eval_with_detections.py --mask_path "./example/vipseg/source" --img_path "./example/vipseg/images" --dataset demo --temporal_setting semionline --output "D:/Tracking-Anything-with-DEVA-main/example/output" --chunk_size 1

hkchengrex commented 7 months ago

We do not officially support Windows, but I have pushed a fix that might help with your problem. In the future, please open a separate issue for problems unrelated to this topic.