fieldsoftheworld / ftw-datasets-list

Description of FTW dataset and list of field boundary datasets it includes
https://fieldsofthe.world/

detectron2-friendly labels? #2

Open fangzp opened 1 week ago

fangzp commented 1 week ago

Hello! Great work on putting together such a useful and large dataset.

I was wondering whether there has been any talk of providing the vector labels for each image in a format that can be readily ingested by detectron2 and/or mmdet. I'm thinking, for example, of someone who wants to train an instance segmentation model with Mask R-CNN or a more bespoke model, or who otherwise wants to work on a model that isn't already built into torchgeo. Detectron2's guide to custom datasets requires labels to be COCO-like, which in my recent experience is still fairly cumbersome and baroque, requiring a fair amount of data wrangling. If any of you have experience converting GeoParquet to a COCO-like format, pointers to resources would be great; otherwise I will probably need to convert GeoParquet to GeoJSON and then to COCO.

In sum, any suggestions for the least painful way to get the ag field labels into a format that can be plugged into a generic computer-vision framework would be great. Thanks in advance!
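For context, the core of such a conversion is small: map each polygon vertex from map coordinates to pixel coordinates, then flatten the ring into COCO's `segmentation` list and derive `bbox` and `area`. The following is a minimal sketch, not an FTW tool. The function names are hypothetical, and it assumes a north-up image chip with square pixels, polygons already in the same CRS as the chip, and no holes or clipping to the chip bounds. In practice the exterior ring would come from the GeoParquet file (for example via `geopandas.read_parquet`).

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def world_to_pixel(x: float, y: float, origin_x: float, origin_y: float,
                   pixel_size: float) -> Point:
    """Map coordinates -> (col, row), assuming a north-up raster with square
    pixels whose upper-left corner is (origin_x, origin_y)."""
    col = (x - origin_x) / pixel_size
    row = (origin_y - y) / pixel_size  # rows grow downward, northing grows upward
    return col, row


def ring_area(ring: Sequence[Point]) -> float:
    """Area of a simple polygon via the shoelace formula (ring not closed)."""
    total = 0.0
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_to_coco_annotation(exterior: Sequence[Point], origin_x: float,
                               origin_y: float, pixel_size: float,
                               image_id: int, annotation_id: int,
                               category_id: int = 1) -> Dict:
    """Build one COCO-style annotation from a polygon's exterior ring.
    Hypothetical helper: interior rings (holes) are ignored."""
    pixels: List[Point] = [
        world_to_pixel(x, y, origin_x, origin_y, pixel_size) for x, y in exterior
    ]
    # GeoJSON-style rings repeat the first vertex at the end; COCO does not need it.
    if len(pixels) > 1 and pixels[0] == pixels[-1]:
        pixels = pixels[:-1]
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    x_min, y_min = min(xs), min(ys)
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        "segmentation": [[coord for p in pixels for coord in p]],  # [x1, y1, x2, y2, ...]
        "bbox": [x_min, y_min, max(xs) - x_min, max(ys) - y_min],  # [x, y, width, height]
        "area": ring_area(pixels),
        "iscrowd": 0,
    }


# Example: a 100 m x 100 m field on a chip with 10 m pixels (made-up coordinates).
field = [(500100, 3999900), (500200, 3999900), (500200, 3999800),
         (500100, 3999800), (500100, 3999900)]
annotation = polygon_to_coco_annotation(field, origin_x=500000, origin_y=4000000,
                                        pixel_size=10, image_id=1, annotation_id=1)
```

Fields that extend past the chip edge would still need to be clipped to the image bounds before this step, which a geometry library handles more robustly than hand-written code.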

(ETA: if this would be more suited to the ftw-baselines Issues page, please move it there instead. Thanks again!)

cholmes commented 1 day ago

I'm not experienced with vector labels, so I'll wait for others to weigh in, but it sounds interesting to me. I don't have any experience converting from GeoParquet to COCO, but if you can share any links on how to go from GeoJSON to COCO, then we might be able to add a tool to the ftw command line that converts FTW GeoParquet to it (maybe directly, but at the least we could make a little tool that automatically goes to GeoJSON and then to COCO). There's room for a set of commands (ftw processing?) that convert the FTW Fiboa boundaries to labels. We still need to open source the code that was used to make the raster masks from satellite imagery, and we could add COCO export functionality there too.
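As a rough illustration of the GeoJSON-to-COCO step, the sketch below assembles a COCO-style dictionary (`images`, `annotations`, `categories`) for a single image chip from a GeoJSON FeatureCollection. It is an assumption-laden outline rather than an existing `ftw` command: `geojson_to_coco` and its parameters are hypothetical, coordinates are taken to be in pixel space already unless a `to_pixel` callable is supplied, holes are ignored, and every feature is assigned to one "field" category.

```python
import json
from typing import Callable, Dict, List, Tuple


def identity(x: float, y: float) -> Tuple[float, float]:
    """Default coordinate mapping: treat GeoJSON coordinates as pixel coordinates."""
    return x, y


def shoelace_area(flat: List[float]) -> float:
    """Polygon area from a flat [x1, y1, x2, y2, ...] list."""
    pts = list(zip(flat[0::2], flat[1::2]))
    total = 0.0
    for i, (x1, y1) in enumerate(pts):
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def geojson_to_coco(feature_collection: Dict, image_id: int, file_name: str,
                    width: int, height: int,
                    to_pixel: Callable[[float, float], Tuple[float, float]] = identity,
                    category_name: str = "field") -> Dict:
    """Hypothetical converter: one annotation per Polygon/MultiPolygon feature."""
    annotations = []
    for feature in feature_collection["features"]:
        geom = feature["geometry"]
        if geom["type"] == "Polygon":
            polygons = [geom["coordinates"]]
        elif geom["type"] == "MultiPolygon":
            polygons = geom["coordinates"]
        else:
            continue  # points and lines have no instance mask
        segmentation = []
        for rings in polygons:
            exterior = rings[0][:-1]  # drop the repeated closing vertex
            flat: List[float] = []
            for pt in exterior:
                flat.extend(to_pixel(pt[0], pt[1]))  # ignore any z value
            segmentation.append(flat)
        xs = [v for part in segmentation for v in part[0::2]]
        ys = [v for part in segmentation for v in part[1::2]]
        annotations.append({
            "id": len(annotations) + 1,
            "image_id": image_id,
            "category_id": 1,
            "segmentation": segmentation,
            "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
            "area": sum(shoelace_area(part) for part in segmentation),
            "iscrowd": 0,
        })
    return {
        "images": [{"id": image_id, "file_name": file_name,
                    "width": width, "height": height}],
        "annotations": annotations,
        "categories": [{"id": 1, "name": category_name}],
    }


# Example with made-up pixel-space features; the Point is skipped.
fc = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "properties": {}, "geometry": {
        "type": "Polygon",
        "coordinates": [[[0, 0], [4, 0], [4, 3], [0, 3], [0, 0]]]}},
    {"type": "Feature", "properties": {}, "geometry": {
        "type": "Point", "coordinates": [1, 1]}},
]}
coco = geojson_to_coco(fc, image_id=1, file_name="chip.tif", width=256, height=256)
coco_json = json.dumps(coco)  # ready to write to an annotations file
```

Passing the coordinate mapping in as a callable keeps the file-level assembly separate from the georeferencing, so the same function could serve chips with different geotransforms.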

(And I do think this is the right repo vs ftw-baselines, at least for now. Sorry for the delay in anyone seeing this.)