Kitware / dive

Media annotation and analysis tools for web and desktop. Get started at https://viame.kitware.com
https://kitware.github.io/dive
Apache License 2.0

reusing pascal-voc annotation to re-train a VIAME detector #1294

Open epifanio opened 2 years ago

epifanio commented 2 years ago

I am using VIAME desktop on Linux with GPU support. I have a number of annotated seafloor images (~58K HabCam V4). Is there a way to further train a VIAME detector/network based on my annotation? The annotations are available as XML files (one per image) in Pascal VOC format; I could try converting them to YOLO format if that would help. Thanks for any advice!

mattdawkins commented 2 years ago

> is there a way to further train a VIAME detector/network based on my annotation?

Yes

We only support the file formats listed here:

https://viame.readthedocs.io/en/latest/section_links/detection_file_conversions.html

Predominantly VIAME CSV and COCO JSON across the board, in both the DIVE interface and project folders. So you'd need to convert to one of them if you want to train things out of the box, or alternatively write file format parser code for Pascal-VOC and add it to https://github.com/VIAME/VIAME/tree/main/plugins/core
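For the conversion route, a minimal sketch of the first step (reading one Pascal VOC XML file) could look like the following. It assumes the standard VOC layout (`filename`, then one `object` per box with `name` and `bndbox/xmin|ymin|xmax|ymax`); `read_voc_boxes` is a hypothetical helper name, not part of VIAME or DIVE.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_text):
    """Parse one Pascal VOC annotation document.

    Returns (filename, boxes), where boxes is a list of
    (label, xmin, ymin, xmax, ymax) tuples with integer pixel coordinates.
    Assumes the standard VOC element names; objects without a bndbox are skipped.
    """
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename", default="")
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name", default="unknown")
        bndbox = obj.find("bndbox")
        if bndbox is None:
            continue
        # Some tools write coordinates as floats, so go through float() first.
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label, *coords))
    return filename, boxes
```

With one XML file per image, this would be called once per file (for example on `Path(p).read_text()`), and the resulting boxes written out in one of the supported formats.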

epifanio commented 2 years ago

@mattdawkins thank you for your reply -

I guess the reader code is in read_detected_object_set_habcam.cxx, is that correct?

from the sample line:

/// "201503.20150525.102214751.575250.png" 185 "boundingBox" 1195 380 1239 424

the fields are the ones described in the linked docs:

- 1: Detection or Track Unique ID
- 2: Video or Image String Identifier
- 3: Unique Frame Integer Identifier
- 4: TL-x (top left of the image is the origin: 0,0)
- 5: TL-y
- 6: BR-x
- 7: BR-y
- 8: Auxiliary Confidence (how likely is this actually an object)
- 9: Target Length
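As a rough sketch of how one row with the nine fields above could be assembled: `viame_csv_row` is a hypothetical helper, the placeholder values for fields 8 and 9 (`1.0` and `-1`) are assumptions rather than documented defaults, and appending a class label and score after field 9 is my reading of the linked docs, so it should be checked there.

```python
def viame_csv_row(det_id, image_name, frame_id, box,
                  label=None, confidence=1.0, target_length=-1):
    """Build one comma-separated row using the field order listed above.

    box is (tl_x, tl_y, br_x, br_y) with the image origin at the top left.
    confidence and target_length are placeholders (assumed, not confirmed).
    If label is given, a (label, confidence) pair is appended after field 9;
    that trailing pair is an assumption about the format, verify in the docs.
    """
    tl_x, tl_y, br_x, br_y = box
    fields = [det_id, image_name, frame_id,
              tl_x, tl_y, br_x, br_y,
              confidence, target_length]
    if label is not None:
        fields += [label, confidence]
    return ",".join(str(f) for f in fields)


# Box values borrowed from the sample line above, purely for illustration.
row = viame_csv_row(0, "201503.20150525.102214751.575250.png", 0,
                    (1195, 380, 1239, 424), label="scallop")
print(row)
```

Keeping the row builder separate from the parser makes it easy to change the placeholder values once the right defaults are confirmed.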

Q: I guess

- 8: Auxiliary Confidence (how likely is this actually an object)
- 9: Target Length

are optional. Is that correct, or should I put a default value in those fields?

Ideally, I'd like to start from a pre-trained network and add my dataset to fine-tune the detector. I can begin by adding annotations for a pair of classes (scallops and sea anemones). Could you point me to the docs on how to start the training process?