tdwg / camtrap-dp

Camera Trap Data Package (Camtrap DP)
https://camtrap-dp.tdwg.org
MIT License

ObservationLevel: Object? #390

Open ddachs opened 1 month ago

ddachs commented 1 month ago

I am curious whether we are overlooking the most detailed observation level so far: one that pertains to objects within media files. Currently, Camtrap DP supports two levels: media and event. When identifying objects in a media file (e.g., using MegaDetector), each detected object becomes its own observation, so the count in the observation table will always be 1. For this reason, we moved from the media level to a more granular object level. I believe this distinction is crucial when generating count values for events, as having data at the media or object level calls for different approaches.
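To make the aggregation concern concrete, here is a minimal Python sketch. The rows, the IDs, and the "maximum per-media count within an event" rule are all illustrative assumptions: Camtrap DP does not prescribe this heuristic, and the helper names `count_per_media` and `count_per_event` are hypothetical.

```python
from collections import defaultdict

# Hypothetical object-level rows: one row per detected object, so
# individualCount is always 1. All IDs and names here are made up.
object_rows = [
    {"eventID": "e1", "mediaID": "m1", "scientificName": "Vulpes vulpes", "individualCount": 1},
    {"eventID": "e1", "mediaID": "m1", "scientificName": "Vulpes vulpes", "individualCount": 1},
    {"eventID": "e1", "mediaID": "m2", "scientificName": "Vulpes vulpes", "individualCount": 1},
]


def count_per_media(rows):
    """Sum object-level counts into one count per (event, media, taxon)."""
    counts = defaultdict(int)
    for row in rows:
        key = (row["eventID"], row["mediaID"], row["scientificName"])
        counts[key] += row["individualCount"]
    return dict(counts)


def count_per_event(rows):
    """Assumed heuristic: take the largest per-media count within each event.

    Summing across media instead would count the same animal once per file.
    """
    per_event = defaultdict(int)
    for (event_id, _media_id, name), n in count_per_media(rows).items():
        key = (event_id, name)
        per_event[key] = max(per_event[key], n)
    return dict(per_event)


print(count_per_media(object_rows))
print(count_per_event(object_rows))
```

With media-level rows the first step is unnecessary because the count is already per file, which is why the two levels call for different handling.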

peterdesmet commented 1 month ago

Can you clarify your question? Here's an attempt at providing info 😄

  1. It is possible to express object-level observations in Camtrap DP. For that, use a media-based observation (observationLevel = media, mediaID = not NULL) and use bboxX, bboxY, bboxWidth, bboxHeight to indicate where the object was observed. All remaining properties in the observation then apply to that object. Example:

https://github.com/tdwg/camtrap-dp/blob/6a903e9b34e05cbd54f04541161e849a1c3d2108/example/observations.csv?plain=1#L509-L512

This can be used to draw the bounding box: https://camtrap-dp.tdwg.org/example/62c200a9/#7245a2aa

  2. Regarding how individualCount should be summed (without over-summing), we currently state the following in observationLevel:

Level at which the observation was classified. media for media-based observations that are directly associated with a media file (mediaID). These are especially useful for machine learning and don't need to be mutually exclusive (e.g. multiple classifications are allowed). event for event-based observations that consider an event (comprising a collection of media files). These are especially useful for ecological research and should be mutually exclusive, so that their count can be summed.
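The two points above can be sketched in Python. This is an illustration under assumptions, not project code: it assumes the bbox fields are fractions of the image dimensions measured from the top-left corner (check the field definitions before relying on this), and the rows, image size, and helper names `bbox_to_pixels` and `summed_event_count` are hypothetical.

```python
def bbox_to_pixels(row, image_width, image_height):
    """Convert a relative bounding box to pixel corners (left, top, right, bottom).

    Assumes bboxX/bboxY/bboxWidth/bboxHeight are fractions in [0, 1].
    """
    left = round(row["bboxX"] * image_width)
    top = round(row["bboxY"] * image_height)
    right = round((row["bboxX"] + row["bboxWidth"]) * image_width)
    bottom = round((row["bboxY"] + row["bboxHeight"]) * image_height)
    return (left, top, right, bottom)


def summed_event_count(rows):
    """Sum individualCount over event-based observations only.

    Media-based rows are skipped because they need not be mutually exclusive.
    """
    return sum(r["individualCount"] for r in rows if r["observationLevel"] == "event")


# Made-up rows: two media-based rows describing objects, two event-based rows.
rows = [
    {"observationLevel": "media", "individualCount": 1,
     "bboxX": 0.25, "bboxY": 0.5, "bboxWidth": 0.5, "bboxHeight": 0.25},
    {"observationLevel": "media", "individualCount": 1,
     "bboxX": 0.0, "bboxY": 0.0, "bboxWidth": 0.5, "bboxHeight": 0.5},
    {"observationLevel": "event", "individualCount": 2},
    {"observationLevel": "event", "individualCount": 1},
]

print(bbox_to_pixels(rows[0], 1920, 1080))
print(summed_event_count(rows))
```

Filtering on observationLevel before summing is what keeps multiple media-based classifications of the same animal from inflating the total.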

On point 2, we may be moving towards an approach where observations.csv contains the biologically relevant information (one truth, with clear approach on how to create events). While for machine learning, it is useful to have all classifications (multiple truths), which are provided in an annotations.csv. See a very early draft proposal at https://github.com/tdwg/camtrap-dp/pull/389

kbubnicki commented 1 month ago

Hey, as I understand @ddachs suggests to extend the list of possible options for the observationLevel field with object (or sth like sub-media) to better guide users when they do data aggregations by themselves. Correct @ddachs ?

ddachs commented 1 month ago

@kbubnicki correct!

peterdesmet commented 1 month ago

Makes sense to expand it to object, but we should think about terminology:

ddachs commented 1 month ago

  1. An object is a spatial region in a picture or video frame.
  2. A temporal region is, to me, clearly an interval (I got used to lubridate's terminology).
  3. Good question regarding submedia: on the one hand, submedia keeps the hierarchical terminology (event > media > submedia), but on the other hand, it does not say what exactly it is. I prefer the differentiation.
  4. I hope that the combination is possible. Our observation tables will explode if we split all videos into single frames.

peterdesmet commented 1 month ago

  1. Agree, interval is the best term

  2. If combinations are possible, what term to use then? object or interval?