roboflow / supervision

We write your reusable computer vision tools. 💜
https://supervision.roboflow.com
MIT License

Save detection area with CSVSink #1397

Open: robmarkcole opened this issue 1 month ago

robmarkcole commented 1 month ago

Description

I would like to save all the Detections data to CSV. Currently the area is not saved, and my approach below fails. This feature request is to save the area (and potentially other detection attributes that are currently not saved).

with sv.CSVSink(csv_path) as sink:
    for detection in detections:
        sink.append(detection, {"area": detection.area})

AttributeError: 'tuple' object has no attribute 'area'

Use case

I will perform filtering in a separate application

robmarkcole commented 1 month ago

I've also got some functions that convert the output of JSONSink to COCO JSON, for importing as pre-annotations. They currently have to calculate the area and parse the xyxyxyxy coordinates with a regex, so if these were exposed more directly this process would be streamlined.
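The conversion described here could look roughly like the following. This is a minimal sketch, not part of supervision: `polygon_to_coco_annotation` and its parameters are hypothetical names, and it assumes the xyxyxyxy values have already been parsed into a flat list of numbers [x1, y1, x2, y2, ...]. The area is computed with the shoelace formula, and the bbox follows COCO's [x, y, width, height] convention.

```python
from typing import Dict, Sequence


def polygon_to_coco_annotation(
    flat_xy: Sequence[float],
    annotation_id: int,
    image_id: int,
    category_id: int,
) -> Dict[str, object]:
    """Build a COCO-style annotation dict from flat [x1, y1, x2, y2, ...] coordinates."""
    if len(flat_xy) < 6 or len(flat_xy) % 2 != 0:
        raise ValueError("expected an even number of values describing at least 3 points")

    xs = list(flat_xy[0::2])
    ys = list(flat_xy[1::2])
    n = len(xs)

    # Shoelace formula: half the absolute sum of cross products of consecutive vertices.
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    area = abs(twice_area) / 2.0

    x_min, y_min = min(xs), min(ys)
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        "segmentation": [list(flat_xy)],
        # COCO bounding boxes are [x, y, width, height], not [x1, y1, x2, y2].
        "bbox": [x_min, y_min, max(xs) - x_min, max(ys) - y_min],
        "area": area,
        "iscrowd": 0,
    }


# Usage: a 4 x 3 axis-aligned rectangle given as four corner points.
annotation = polygon_to_coco_annotation([0, 0, 4, 0, 4, 3, 0, 3], 1, 7, 2)
print(annotation["bbox"], annotation["area"])
```

If the sinks exposed the area and the polygon coordinates as plain columns, only the dict construction at the end would be needed.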

SkalskiP commented 1 month ago

@robmarkcole, you should pass the whole sv.Detections object as a sink.append argument. No loop is needed.

with sv.CSVSink(csv_path) as sink:
    sink.append(detections, {"area": detections.area})

SkalskiP commented 1 month ago

@robmarkcole, I see you reacted. Did that solve your problem? If so, I'm closing the issue. ;)

robmarkcole commented 1 month ago

I still think it would be a nice feature to have out of the box, along with xyxy for the polygon

robmarkcole commented 1 month ago

That appears to result in the entire area array being saved for every row:

image
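
Until the sink handles this case, one possible workaround is to write the rows directly with Python's csv module. A minimal sketch, assuming the boxes are available as (x_min, y_min, x_max, y_max) sequences, for example from detections.xyxy. `write_boxes_with_area` is a hypothetical helper, and the column names are illustrative, not necessarily the ones CSVSink writes.

```python
import csv
import io
from typing import Iterable, Sequence, TextIO


def write_boxes_with_area(out: TextIO, boxes: Iterable[Sequence[float]]) -> None:
    """Write one CSV row per box, each carrying its own scalar area."""
    writer = csv.writer(out)
    writer.writerow(["x_min", "y_min", "x_max", "y_max", "area"])
    for x_min, y_min, x_max, y_max in boxes:
        # Axis-aligned box area: width times height.
        area = (x_max - x_min) * (y_max - y_min)
        writer.writerow([x_min, y_min, x_max, y_max, area])


# Usage: write to an in-memory buffer; pass an open file to write to disk instead.
buffer = io.StringIO()
write_boxes_with_area(buffer, [(0, 0, 2, 3), (1, 1, 5, 6)])
print(buffer.getvalue())
```

When writing to a real file, open it with newline="" as the csv module documentation recommends.
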
SkalskiP commented 1 month ago

Hi @robmarkcole, unfortunately, I won't have time to dig deeper into this problem this week. I'm not going to lie; JSONSink and CSVSink could use some love.

Let me tag @LinasKo here so he can take a look next week.

LinasKo commented 1 month ago

Hi @robmarkcole 👋

I'll look into this.

LinasKo commented 1 month ago

This goes surprisingly deep.

@robmarkcole, syntax-wise, the cleanest solution is to add the areas to detections.data, which is handled per detection, unlike custom_data.

with sv.CSVSink(csv_path) as sink:
    detections.data["area"] = detections.area
    sink.append(detections)

Note that this modifies the detections object, which matters if you use it again later.

@SkalskiP, I can't find @PawelPeczek-Roboflow's issue where he asked for collection-level storage in sv.Detections. Something like data, but applicable to all images, like 'camera_id'.

The issue here is that custom_data is treated as an even more general version of data, capable of storing collection-level variables. It may hold a scalar, a dict, basically anything. We could restrict it to storing only per-detection lists, as data does, but that would break some existing implementations.

A good solution would be to first solve Pawel's request, and then come back to edit CSVSink (with deprecations).
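
The difference between data and custom_data described above can be modelled in plain Python. This is a simplified illustration of the behaviour reported in this thread, not CSVSink's actual implementation; `build_rows` and its arguments are hypothetical.

```python
from typing import Any, Dict, List, Sequence


def build_rows(
    n_detections: int,
    data: Dict[str, Sequence[Any]],
    custom_data: Dict[str, Any],
) -> List[Dict[str, Any]]:
    """Model per-detection `data` versus collection-level `custom_data`."""
    rows = []
    for i in range(n_detections):
        row: Dict[str, Any] = {}
        # Per-detection values: row i receives element i of each sequence.
        for key, values in data.items():
            row[key] = values[i]
        # Collection-level values: copied unchanged into every row.
        row.update(custom_data)
        rows.append(row)
    return rows


areas = [6.0, 20.0]
rows = build_rows(
    2,
    data={"area": areas},
    custom_data={"all_areas": areas, "camera_id": "cam0"},
)
for row in rows:
    print(row)
```

Each row gets its own area, while all_areas repeats the whole list in every row, which matches the symptom reported earlier in the thread. A value such as camera_id is the kind of collection-level variable that custom_data suits.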

SkalskiP commented 1 month ago

@LinasKo, how about we collect the list of potential improvements for CSVSink and work on v2?

robmarkcole commented 1 month ago

A related feature request: a DataFrameSink, as a convenience to avoid writing a CSV file and reading it back:

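import pandas as pd
import supervision as sv

# image_id and detections are assumed to be defined by the caller
csv_path = f"/tmp/{image_id}.csv"
with sv.CSVSink(csv_path) as sink:
    sink.append(detections)

df = pd.read_csv(csv_path)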