I agree.
Implemented at #204. To be discussed.
Open questions and my preference:
- Should `parentMediaID` be empty when there is no parent? _Yes, even when queries would be easier if it were populated, see: https://github.com/inbo/movepub/blob/71cd323b3b5af0c287c60b22ff1b34f38160054b/inst/sql/camtrap-dp/dwc_multimedia.sql#L58-L68_
- `captureMethod` -> `creationTechnique`? Yes
- `parentMediaID` -> `sequenceID`? No
- `start` and `end` timestamp? No, can be derived from the media files (except video) and would needlessly inflate data.
- `start` -> `startTimestamp`? Yes, ~and also in deployments~
- `end` -> `endTimestamp`? Yes, ~and also in deployments~
- obs:`mediaID` remains http://purl.org/dc/terms/identifier? Yes
- media:`parentMediaID`? Don't know
- media:`deploymentID` = eventID? Yes
- media:`start` is http://rs.tdwg.org/ac/terms/startTimestamp? Tempting, ~but not really a ROI~
- media:`end` is http://rs.tdwg.org/ac/terms/endTimestamp? Tempting, ~but not really a ROI~

I'm in favor of this change. It simplifies things. Personally I need to get used to the term `mediaID`, but if you think of sequences as frames that could just as well have been a video, it's easy to understand.
I also find this intuitive in terms of the class layout and relationships. Sequences are considered media (not unlike videos); they get their own rows in media.csv.

~A video is a piece of media, with a binary serialization in a format, while here it's really just a field to group individual media files. Looking at the possible terms, the only ones you'd anticipate maybe being relevant are the `timestamp` and possibly the `comments` and `captureMethod`. Would they ever exist on the sequence row or in different ways to the image media rows?~

~What I'm wondering is if having a row for the sequence brings any benefit, say to e.g. keeping the `sequenceID` column, which is very intuitive.~ (Answering my own question: it's needed to simplify the observation join.)

Out of curiosity - are images ever manipulated, e.g. cropping out a section and creating a new image? If so, the `parentMediaID` seems very appropriate and intuitive.

I think it might be useful to add a `type` to Media (Image, SequenceOfImages, Video) to remove any assumptions. At the moment, you need to infer that it's a sequence because `parentMediaID=null`, but if people create sub-images (e.g. cropping, adjusting brightness etc.) that may not hold true.
> are images ever manipulated, e.g. cropping out a section and creating a new image? If so, the parentMediaID seems very appropriate and intuitive.

@timrobertson100 not necessarily to create a new physical medium, but subsections (bounding boxes) of images are quite common, e.g. created by AI to indicate where in the image it noticed an animal. That info can currently not be captured in Camtrap DP v1 (needs more thought). Options are to represent those as sub-images (with a `parentMediaID`), but more likely is adding a bounding box field to the observation.
> I think it might be useful to add a type to Media (Image, SequenceOfImages, Video) to remove any assumptions.

I agree. `parentMediaID=null` is not a good filter, because a dataset with image-based observations (only) would only contain images with `parentMediaID=null` too. Options:

- Extend `captureMethod` (currently `time lapse`, `motion triggered`): add `sequence`. Change the definition from "capture" to "how was it created". Also solves the issue of having to assign `time lapse`/`motion triggered` to the sequence level (what if it contains mixed children?).
- Extend `fileMediaType` (currently `image/jpeg`, `video/mp4`): could add something like `application/sequence`. A bit of a hack, can one create mediaTypes? Solves the same issue as `captureMethod` for mixed children.
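Purely as an illustration of the filtering problem described above (the `'sequence'` value is hypothetical, not an existing `captureMethod` term):

```sql
-- In a dataset with image-based observations only, every media row has
-- parentMediaID = NULL, so this does NOT reliably select sequence rows:
SELECT mediaID FROM media WHERE parentMediaID IS NULL;

-- An explicit value removes the ambiguity (hypothetical vocabulary term):
SELECT mediaID FROM media WHERE captureMethod = 'sequence';
```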
Mainly for reasons of keeping things intuitive, and to avoid mixing concepts, I'd favor a `type` (or similarly named) field.

By mixing concepts, I mean that capture is related to what happened in the field to "trigger" the media existing, fileMediaType is about the encoding of the binary stream, and sequence is really just a grouping of items, largely for data management purposes (i.e. allowing you to refer to a grouping of items in an annotation). Those seem like separate concerns to me, each warranting its own field.
Aside: this model implies media would only ever exist in a single sequence unless you duplicate media records with e.g. the same filename (meaning observations are based on an image in a particular sequence and not on the image itself). I don't know enough to comment if that is appropriate.
Regarding the second remark: that looks right to me. An image only exists in one single sequence; however, the same image can be the source for two different observations.
For me it is still a little bit confusing that, if I get it right, in the new data model the media.csv contains some records referring to single images and other records referring to sequences that contain images listed in the same media.csv table. It looks to me like two different levels of information are contained within the same table - not being a data scientist, this is the first time I encounter this kind of mixed-levels table in a data model :-)
It is a common modeling pattern to include multiple subtypes of an entity within a single table and to distinguish them with a type field, to avoid having to create additional tables or hierarchical structures. Here that pattern seems well justified. Another part of that pattern is to name the type field based on the table it is in and the concept it represents, so that it can stand alone without context in a data dictionary (a glossary of terms). Based on these practices, I would recommend the term be adopted and that it be called "mediaType".
This may be a bit overly cautious, but I'd opt for acquisitionType instead of mediaType to avoid confusion/overlap with the common use of mediaType as a reference to the MIME Media Types.
@ben-norton I think this probably arises from the media table serving multiple roles for the sake of simplification. I agree that the mediaType should be limited to media types - digital results. I think that still needs to be there. To me the acquisitionType is a statement about the event (something not explicitly modeled by the Camtrap DP structure) that generated the result. In a model that expresses this activity explicitly, I would indeed include something to specify that. In the GBIF publishing model we're doing in parallel, that would be an eventType.
> @tucotuco: It is a common modeling pattern to include multiple subtypes of an entity within a single table and to distinguish them with a type field to avoid having to create additional tables or hierarchical structures.

I'm not sure mixing (sub)types is that common. To me it is the biggest icky factor in an otherwise elegant proposal (cf. comments by @jimcasaer @timrobertson100). I'd therefore like to suggest an approach that deviates less from the current situation. For clarity, I'm also naming the proposals:
- Suggested change 1: sequences are considered media and get their own rows in media.csv (with `parentMediaID`).
- Suggested change 2:
  - No `deploymentID` or `timestamp` shortcuts in observations.
  - `sequenceID` in observations.csv and media.csv (cf. current situation).
  - Observations reference either `mediaID` or `sequenceID`. The fields should not be populated together. This is a change from the current situation, where `sequenceID` (even when not used) has to be populated, and it gives equal weight to both approaches. It does make joins conditional, but it is a pretty clear `WHERE obs.mediaID IS NOT NULL` vs `WHERE obs.sequenceID IS NOT NULL` (see the sketch below the example tables).
  - media.csv does not mix concepts: it only contains physical files. Sequences are a grouping identifier `sequenceID` (cf. current situation).

Image-based observations:

media.csv
mediaID | sequenceID | deploymentID | timestamp | filePath
------- | ---------- | ------------ | ------------------- | --------
med1 | NULL | dep1 | 2020-01-01T00:00:00 | med1.jpg
med2 | NULL | dep1 | 2020-01-01T00:00:01 | med2.jpg
med3 | NULL | dep1 | 2020-01-01T00:00:02 | med3.jpg
observations.csv
observationID | mediaID | sequenceID | observationType | scientificName | count | countNew
------------- | ------- | ---------- | --------------- | -------------- | ----- | --------
obs1 | med1 | NULL | animal | Sus scrofa | 1 | 1
obs2 | med2 | NULL | animal | Sus scrofa | 1 | 0
obs3 | med3 | NULL | blank | NULL | NULL | NULL
Sequence-based observations:

media.csv
mediaID | sequenceID | deploymentID | timestamp | filePath
------- | ---------- | ------------ | ------------------- | --------
med1 | seq1 | dep1 | 2020-01-01T00:00:00 | med1.jpg
med2 | seq1 | dep1 | 2020-01-01T00:00:01 | med2.jpg
med3 | seq1 | dep1 | 2020-01-01T00:00:02 | med3.jpg
observations.csv
observationID | mediaID | sequenceID | observationType | scientificName | count | countNew
------------- | ------- | --------- | --------------- | -------------- | ----- | --------
obs1 | NULL | seq1 | animal | Sus scrofa | 1 | NULL
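To make the conditional joins concrete, a rough sketch against the example tables above (assuming they are loaded as SQL tables `media` and `observations`):

```sql
-- Image-based observations: join on mediaID
SELECT o.observationID, o.scientificName, m.filePath
FROM observations AS o
JOIN media AS m ON m.mediaID = o.mediaID
WHERE o.mediaID IS NOT NULL;

-- Sequence-based observations: join on sequenceID
-- (one observation can match several media files)
SELECT o.observationID, o.scientificName, m.filePath
FROM observations AS o
JOIN media AS m ON m.sequenceID = o.sequenceID
WHERE o.sequenceID IS NOT NULL;
```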
@peterdesmet I understand what you are trying to do, and even why. It only makes me cringe from a database modeling perspective where in SQL databases one tries to achieve the highest reasonable Normal Form (https://en.wikipedia.org/wiki/Database_normalization#Normal_forms) to protect against redesign problems with changes that might come in the future.
In Suggested Change 2 you are treating sequences as properties (albeit properties of two distinct entities), not as identifiers of an entity to use in the role of a key. The reason you can "get away with that" is that sequences have no non-identifying properties. So the thing that worries me (the "cringe factor") is that you are painting yourself into a corner. If you ever do add non-identifying properties to sequences in the future, you will have to repeat that information in media.csv or observations.csv or both, or add a sequence.csv with relationships to media and observations, and thereby change the structure in a way that will break existing implementations. Suggested change 1 doesn't solve the future-proofing of sequences either, by the way; it treats them as one of the types of media with no properties of their own.
For demonstration only, a model that would future-proof sequences (and be in 5th normal form - 5NF) would be something like the following:
sequence.csv
sequenceID | deploymentID | startTimestamp
---------- | ----------- | -------------------
seq1 | dep1 | 2020-01-01T00:00:00
seq2 | dep1 | 2020-02-01T00:00:00
media.csv
mediaID | sequenceID | timestamp | filePath
------- | ---------- | ------------------- | --------
med1 | seq1 | 2020-01-01T00:00:00 | med1.jpg
med2 | seq1 | 2020-01-01T00:00:01 | med2.jpg
med3 | seq1 | 2020-01-01T00:00:02 | med3.jpg
observations.csv
observationID | observationType | scientificName | count | countNew
------------- | --------------- | -------------- | ----- | --------
obs1 | animal | Sus scrofa | 1 | 1
obs2 | animal | Sus scrofa | 1 | 0
obs3 | blank | NULL | NULL | NULL
obs4 | animal | Sus scrofa | 1 | NULL
mediaobservation.csv
mediaID | observationID
------- | -------------
med1 | obs1
med2 | obs2
med3 | obs2
sequenceobservation.csv
sequenceID | observationID
---------- | -------------
seq2 | obs4
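For comparison, a sketch of the joins this 5NF layout would require, assuming the files above are loaded as SQL tables `observations`, `media`, `sequences` (from sequence.csv), `mediaobservation` and `sequenceobservation`:

```sql
-- Media-based observations, via the mediaobservation junction table
SELECT o.observationID, o.scientificName, m.filePath, s.deploymentID
FROM observations AS o
JOIN mediaobservation AS mo ON mo.observationID = o.observationID
JOIN media AS m ON m.mediaID = mo.mediaID
JOIN sequences AS s ON s.sequenceID = m.sequenceID;

-- Sequence-based observations, via the sequenceobservation junction table
SELECT o.observationID, o.scientificName, s.sequenceID, s.deploymentID
FROM observations AS o
JOIN sequenceobservation AS so ON so.observationID = o.observationID
JOIN sequences AS s ON s.sequenceID = so.sequenceID;
```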
Commenting here as a relative outsider to the project. Overall I think this goes in the right direction: deployments create media, media lead to observations. In my opinion sequences are an artificial add-on without any real benefits, but I never used them myself and also don't really know how sequences are meant to be used in this standard, so I may be missing important points. Below are some general notes, concerns and questions to consider, and a suggestion for a somewhat different database system that may help accommodate sequences and other things. Apologies for a long post ahead.
`sequence_interval` as a workaround: I understand the motivation, but it is arbitrary, and hence deviates from the actual observation process already. It may accidentally join independent trigger events as a single sequence, or may separate images taken within a single trigger event (if e.g. 5 images are taken per trigger event and one happens to be a bit late). This may not be a big issue in practical terms, but conceptually it is in my opinion. It makes sequences arbitrary and artificial. I see three possible cases (with their data relationships):
A: easiest option. No sequences needed at all. If for some compatibility reason it is necessary to always have a sequence table, each media item can be considered a separate sequence and data structure would be identical to B (it would be redundant and a bit silly though).
B: can be created automatically from image-based annotation in A using `sequence_interval` (see below). It would only introduce an intermediate sequence table and sequence IDs in the observation table.
If B is created from A, then B still implies A (as long as observations in B retain their mediaID). Not sure if that is relevant.
C: is this even necessary (can media.csv be missing)? Maybe relevant for old data sets?
The only real difference is:
- A: observations refer to media ID
- B: sequence table exists; observations refer to sequence ID, sequences refer to media.csv
- C: observations refer to sequence ID, which directly refers to deployments.
Would it be possible to set a flag in the project metadata as to which case it is (and thus, which key to use)?
If sequences are defined by a `sequence_interval`, and `sequence_interval` is a user-defined time difference between media items, then sequences can be assigned automatically using the mediaID and media timestamps. I imagine a simple function that takes the deployment, media and observation csvs as input, the user defines `sequence_interval`, and the function calculates time differences between media items and automatically assigns media items to sequences (see the sketch after the list below).

The points above are for images only. Video support in this scheme may lead to additional complications:

- How do videos relate to the `sequence_interval`?
- If two videos are recorded close together and the `sequence_interval` is e.g. 2 seconds, would the two videos be one or two sequences?
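Purely as an illustration of that idea (not something defined by Camtrap DP), such a grouping could be sketched in SQL with window functions, here assuming a `media` table with `mediaID`, `deploymentID` and `timestamp`, a `sequence_interval` of 120 seconds and PostgreSQL-flavoured syntax:

```sql
-- Flag media items that start a new sequence: the first item per deployment,
-- or any item more than 120 s after the previous item.
WITH flagged AS (
  SELECT
    mediaID,
    deploymentID,
    timestamp,
    CASE
      WHEN LAG(timestamp) OVER w IS NULL
        OR timestamp > LAG(timestamp) OVER w + INTERVAL '120 seconds'
      THEN 1 ELSE 0
    END AS starts_new_sequence
  FROM media
  WINDOW w AS (PARTITION BY deploymentID ORDER BY timestamp)
)
-- A running sum of the flags yields a sequence identifier per deployment.
SELECT
  mediaID,
  deploymentID || '-seq-' ||
    SUM(starts_new_sequence) OVER (PARTITION BY deploymentID ORDER BY timestamp)
    AS sequenceID
FROM flagged;
```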
I suggest having a look at the database structure of digiKam for inspiration. I find it very clear, logical and extensible, but different from the current Camtrap DP scheme. If you have digiKam installed, you can open its database in R with:

```r
camtrapR:::accessDigiKamDatabase(db_directory = "C:/Users/YOURUSERNAME/Pictures",
                                 db_filename = "digikam4.db")
```
In short, it contains 5 items (AlbumRoots, Albums, Images, Tags and ImageTags; "ImageTags" being the assignment of tags to images).
This is the content of each of these items as used by digiKam (not all of which would be needed for camera trapping data):
```
$AlbumRoots [1] "id" "label" "status" "type" "identifier" "specificPath"
$Albums [1] "id" "albumRoot" "relativePath" "date" "caption" "collection" "icon"
$Images [1] "id" "album" "name" "status" "category" "modificationDate" "fileSize" "uniqueHash" "manualOrder"
$Tags [1] "id" "pid" "name" "icon" "iconkde"
$ImageTags [1] "imageid" "tagid"
```
This scheme can be expanded nicely, e.g. with a separate table for sequences (which assigns sequences to the file ids in the "Images" table - can maybe be created automatically as mentioned above). This would allow easy gathering of image tags (species IDs etc) and image information (timestamps etc) for sequences.
It would also allow easy linking to AI / deep learning methods, e.g. with a separate table containing bounding box coordinates for object detection. This would work both for model training and model deployment, and can maybe be based on the COCO camera traps format. It would also remove the need to crop / duplicate images.
Then there can be another table containing the labels and confidence values for these bounding boxes. For model training this second table only needs one label, for predictions it can either contain the top label and probability only, or top k labels, or all labels with their probabilities.
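As a purely hypothetical illustration of those two extra tables (names and columns are not taken from any existing standard):

```sql
-- Bounding boxes detected on individual images (one row per detected object)
CREATE TABLE boundingboxes (
  bboxID  TEXT PRIMARY KEY,
  mediaID TEXT NOT NULL,        -- references the image record
  x REAL, y REAL,               -- box origin (coordinate convention to be agreed)
  width REAL, height REAL
);

-- Labels for those bounding boxes: one row for a manual/training label,
-- or several rows with probabilities for top-k model predictions
CREATE TABLE bboxlabels (
  bboxID     TEXT NOT NULL,     -- references boundingboxes.bboxID
  label      TEXT NOT NULL,
  confidence REAL               -- prediction probability; NULL for manual labels
);
```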
Also, all these deep learning methods for image classification / object detection that I'm aware of use images, not sequences. Sequences can actually be harmful in this respect, especially for image classification (when the animal walked out of the frame during the sequence, but the entire sequence is labelled as a species). In object detection, bounding boxes for sequences also don't make sense. They need to be image-specific. *
* EDIT: COCO camera trap format allows both image and sequence-specific bounding boxes, which may not be precise at image-level (see link above). I find the statement that 'sequences are the "atom of interest" in most ecological applications' questionable though.
Video annotation at the file level should be no different than image annotation. I don't know how to annotate at the frame level.
Thanks @tucotuco and @jniedballa! I had some time to digest this information and discussed it with @damianooldoni. We think the following suggestion would be a model that answers the issues. It will not solve - but can represent - the fact that some systems make observations at the level of "sequences/groups of images" (which restricts creating smaller events at the analysis stage).
- A 4th table `evidence` ~ `mediaGroup`. For image-based observations, it will contain 1-to-1 relations; for sequence-based observations, it will contain 1-to-many relations.
- `mediaGroupID` and `mediaID` can use the same identifiers (see examples below). On the user side, it simplifies joins and allows using a single model to represent different use cases. It also avoids the "paint yourself into a corner" problem @tucotuco pointed out with the more succinct representation in suggested change 2.
- ~`evidence` over e.g. `mediaGroup`.~ That conflicts somewhat with the name `mediaGroup`, but I find it still a more intuitive name.
- The term `sequence` is avoided altogether, because it has different meanings. Here we use `mediaGroup` as the group, media file or part of media file that was used as the basis for an observation.
- A `level` (see below) could be added (with a controlled vocabulary) to more easily filter certain observations. For easier discovery, the metadata term classificationLevel could be updated to contain a list of all the levels a dataset contains.

Example:
- `obs1`, `obs2`, `obs3` are image-based observations. In `med3` no animal was seen.
- `obs4` is a group-based observation. Media files `med1`, `med2`, `med3` were assessed as a whole (a disadvantage for later analyses, but often occurring).
- `obs5` is made on a part of `med3`, i.e. a specific bounding box. It is considered a separate mediaGroup.
- `obs6` is an observation based on a part of a video, i.e. a specific duration with start and end timestamp.

media.csv
mediaID | deploymentID | timestamp | filePath
------- | ------------ | ------------------- | --------
med1 | dep1 | 2020-01-01T00:00:00 | med1.jpg
med2 | dep1 | 2020-01-01T00:00:01 | med2.jpg
med3 | dep1 | 2020-01-01T00:00:02 | med3.jpg
med4 | dep1 | 2020-01-04T08:00:00 | med4.mov
mediagroups.csv
mediaGroupID | mediaID | level | boundingBox | timeRange
------------ | ------- | -------- | ------------------ | ---------
med1 | med1 | file | |
med2 | med2 | file | |
med3 | med3 | file | |
seq1 | med1 | group | |
seq1 | med2 | group | |
seq1 | med3 | group | |
bbox1 | med1 | bbox | [x,y,width,height] |
duration1 | med4 | duration | | start/end
observations.csv
observationID | mediaGroupID | observationType | scientificName | count
------------- | ------------ | --------------- | -------------- | -----
obs1 | med1 | animal | Sus scrofa | 1
obs2 | med2 | animal | Sus scrofa | 1
obs3 | med3 | blank | NULL | NULL
obs4 | seq1 | animal | Sus scrofa | 1
obs5 | bbox1 | animal | Sus scrofa | 1
obs6 | duration1 | animal | Vulpes vulpes | 1
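A rough sketch of how a data consumer would resolve these observations back to files under this proposal (assuming the three tables above are loaded as SQL tables `observations`, `mediagroups` and `media`):

```sql
-- One uniform join works for file-, group-, bbox- and duration-based
-- observations; a group-based observation simply returns several files.
SELECT o.observationID, o.scientificName, g.level, m.filePath
FROM observations AS o
JOIN mediagroups AS g ON g.mediaGroupID = o.mediaGroupID
JOIN media AS m ON m.mediaID = g.mediaID;
```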
@jniedballa `sequence_interval` is currently saved in the project metadata. But maybe we should allow more flexible ways to indicate how "mediaGroups" were created.
@peterdesmet as a relative outsider I like the look of this new "suggested change 3" better than previous ones. It seems correct to me that sequences are not considered media files.
Your bounding box example is clear; I can see that the format also allows for an observation which is based on a bounding-box that moves/changes shape over the duration of a sequence (this is one "tricky case" we discuss sometimes). But then would the `level` be `bbox` or `group`? My solution to that would be to forget having `bbox` as an explicit type: it can be implicit from the fact that the `boundingBox` column is non-null. (I'm dubious about the need for the `level` column at all, but I presume you're suggesting it for ease of data consumption.)
> I'm dubious about the need for the level column at all, but I presume you're suggesting it for ease of data consumption.
Yes indeed. It doesn’t necessarily need to be there.
Alternative name for `evidence`: `observationUnit`.
@danstowell could we consider that 4th table a "region of interest" (Section 7.11 of https://ac.tdwg.org/termlist/)?

> Regions of Interest (ROI) designate specific parts of media items.

Could a region of interest also be larger than a single image file?
> @danstowell could we consider that 4th table a "region of interest" (Section 7.11 of https://ac.tdwg.org/termlist/)?
> Regions of Interest (ROI) designate specific parts of media items.
> Could a region of interest also be larger than a single image file?
We always intended that an ROI could cover multiple frames, but we have not worked out the details. In practice I think the AC definition of ROI is all about a hyper-rectangular box (e.g. imagine a box confined in the x, y, z and time axes), whereas what's nice about your proposal is that an observation is composed of a sequence of different* ROIs, one per frame. A sequence of different ROIs is not a hyper-rectangular box. Thus: I think the 4th table is not equivalent to an ROI.
I would say that your columns `timeRange` and `boundingBox` are closely tied to AC's notion of ROI.
FWIW I'm OK with `mediaGroups`. (I prefer it over `observationUnit`.)
Hi all and sorry for this late feedback! Great discussion! I have spent some time in recent days thinking about the last proposal and had a meeting with @peterdesmet this morning. Here is the outcome; below you will find two new proposals that (hopefully) still add something to our discussion:
Suggested change 4

media.csv
| mediaID | deploymentID | timestamp | filePath |
|-------------|--------------|---------------------|----------|
| med1 | dep1 | 2020-01-01T00:00:00 | med1.jpg |
| med2 | dep1 | 2020-01-01T00:00:01 | med2.jpg |
| med3 | dep1 | 2020-01-01T00:00:02 | med3.jpg |
| med4 | dep1 | 2020-01-04T08:00:00 | med4.mov |
mediagroups.csv
| mediaGroupID | mediaID |
|--------------|---------|
| med1 | med1 |
| med2 | med2 |
| med3 | med3 |
| med4 | med4 |
| seq1 | med1 |
| seq1 | med2 |
| seq1 | med3 |
observations.csv
| observationID | mediaGroupID | observationLevel | observationType | scientificName | count | individualID | boundingBox | timeRange |
|---------------|--------------|------------------|-----------------|----------------|-------|--------------|-------------------------------------------------|-------------|
| obs1 | med1 | file | animal | Sus scrofa | 1 | | | |
| obs2 | med2 | file | animal | Sus scrofa | 2 | | [[x1,y1,width1,height1],[x2,y2,width2,height2]] | |
| obs2a | med2 | file | animal | Sus scrofa | 1 | ind1 | [[x1,y1,width1,height1],] | |
| obs2b | med2 | file | animal | Sus scrofa | 1 | ind2 | [[x2,y2,width2,height2],] | |
| obs3 | med3 | file | blank | NULL | NULL | | | |
| obs4 | seq1 | sequence | animal | Sus scrofa | 2 | | | |
| obs5 | med4 | file | animal | Sus scrofa | 1 | | [[x,y,width,height],] | start1/end1 |
| obs6 | med4 | file | animal | Sus scrofa | 1 | | [[x,y,width,height],] | start2/end2 |
| obs7 | med4 | file | animal | Sus scrofa | 1 | | | start/end |
| obs8 | med4 | file | animal | Sus scrofa | 1 | | | |
We keep the mediagroups.csv table. The advantages are that we can mix sequence- and file-based observations in one package and that this table can be easily extended when needed in the future.

The attributes `boundingBox` (spatial window; now a 2D array) and `timeRange` (temporal window) are moved to the observations.csv table. I find both attributes more related to the observation than to the media-grouping process. Think about a two-stage observation process: i. detection of animals (or other objects such as humans, vehicles etc.) in space (`boundingBox`) and/or time (`timeRange`) -> ii. classification (`observationType`, `scientificName` etc.). The advantage of this change is also that the mediagroups.csv table will be more "compressed", as it will not have rows for each single detected object (bounding box) and/or video frame; e.g. imagine 10k videos * 60 1-s frames classified by AI, each containing 1-10 wild boar. In the previous proposal both the mediagroups.csv and observations.csv tables would quickly grow enormously in similar scenarios.

In the observations.csv table there is a new attribute `observationLevel` - this is just for the user's convenience (e.g. quick selection of file-based observations only).
This proposal (as well as the next one) supports the following cases:
a) file-level observations -> `obs1` (image) and `obs8` (video)
b) file-level & object-based image observations -> `obs2` (multiple objects of the same type on 1 image), `obs2a` & `obs2b` (different objects on 1 image, separate rows)
c) file-level & object-based video observations -> `obs5` & `obs6` (same or different objects detected on separate video frames; both spatial and temporal window defined), `obs7` (only temporal window of an observation defined); please note that a similar logic can be applied to audio files
d) sequence-based observations -> `obs4`
Maybe a trivial comment, but an interesting side-effect of having `mediaGroupID` for file-based observations is that one can define a `mediaGroupID` for pairs of images from 2-camera deployments, e.g. when monitoring lynx, tigers or some other "marked" animal species, where both cameras typically record media of the same individual (e.g. left & right side of an animal passing a forest path):
| mediaID | deploymentID | timestamp | filePath |
|-------------|--------------|---------------------|----------|
| med1a | dep1a | 2020-01-01T00:00:00 | med1.jpg |
| med1b | dep1b | 2020-01-01T00:00:00 | med1.jpg |
| mediaGroupID | mediaID |
|--------------|---------|
| med1 | med1a |
| med1 | med1b |
Suggested change 5

Sequence-based example:

media.csv
| mediaID | mediaGroupID | deploymentID | timestamp | filePath |
|---------|--------------|--------------|---------------------|----------|
| med1 | seq1 | dep1 | 2020-01-01T00:00:00 | med1.jpg |
| med2 | seq1 | dep1 | 2020-01-01T00:00:01 | med2.jpg |
| med3 | seq1 | dep1 | 2020-01-01T00:00:02 | med3.jpg |
| med4 | seq2 | dep1 | 2020-01-04T08:00:00 | med4.mov |
observations.csv
| observationID | mediaGroupID | observationType | scientificName | count | individualID | boundingBox | timeRange |
|---------------|--------------|-----------------|----------------|-------|--------------|-------------|-----------|
| obs1 | seq1 | animal | Sus scrofa | 2 | | | |
| obs2 | seq2 | animal | Sus scrofa | 1 | | | |
File-based example:

media.csv
| mediaID | mediaGroupID | deploymentID | timestamp | filePath |
|---------|--------------|--------------|---------------------|----------|
| med1 | med1 | dep1 | 2020-01-01T00:00:00 | med1.jpg |
| med2 | med2 | dep1 | 2020-01-01T00:00:01 | med2.jpg |
| med3 | med3 | dep1 | 2020-01-01T00:00:02 | med3.jpg |
| med4 | med4 | dep1 | 2020-01-04T08:00:00 | med4.mov |
observations.csv
| observationID | mediaGroupID | observationType | scientificName | count | individualID | boundingBox | timeRange |
|---------------|--------------|-----------------|----------------|-------|--------------|-------------------------------------------------|-------------|
| obs1 | med1 | animal | Sus scrofa | 1 | | [[x,y,width,height],] | |
| obs2 | med2 | animal | Sus scrofa | 2 | | [[x1,y1,width1,height1],[x2,y2,width2,height2]] | |
| obs2a | med2 | animal | Sus scrofa | 1 | ind1 | [[x1,y1,width1,height1],] | |
| obs2b | med2 | animal | Sus scrofa | 1 | ind2 | [[x2,y2,width2,height2],] | |
| obs3 | med3 | blank | NULL | NULL | | | |
| obs4 | med4 | animal | Sus scrofa | 1 | | [[x,y,width,height],] | start1/end1 |
| obs5 | med4 | animal | Sus scrofa | 1 | | [[x,y,width,height],] | start2/end2 |
| obs6 | med4 | animal | Sus scrofa | 1 | | | start/end |
| obs7 | med4 | animal | Sus scrofa | 1 | | | |
1) There is no mediagroups.csv table. Basically, we go back to the original model (v0.1.7, https://github.com/tdwg/camtrap-dp/tree/0.1.7) but there are some critical differences.
2) There are new attributes `boundingBox` and `timeRange` in the observations.csv table (described above).
3) There is no `deploymentID` in the observations.csv table, which makes the entire model more linear.
4) There is a new attribute `mediaGroupID` in the media.csv table.
5) The Camtrap DP packages should be either file-based or sequence-based (as indicated in the package-level metadata). It is not necessarily a limitation of this proposal; Camtrap DP has been designed as a standard for data exchange/publishing at the level of a single camera trapping project, where typically people do not mix both annotation approaches.
6) The biggest advantage of this proposal, as I see it, is the simplicity of the model (no 4th table) and its human-user-friendliness. The flexibility is also still there; I believe most of the use-cases (as listed above) are covered with this design.
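A sketch of the resulting join under this proposal (tables as in the examples above); because a package is either file-based or sequence-based, the same query covers both cases:

```sql
-- Resolve each observation to its media file(s) via mediaGroupID in media.csv.
-- In a file-based package this returns exactly one file per observation,
-- in a sequence-based package all files of the sequence.
SELECT o.observationID, o.scientificName, m.mediaID, m.filePath, m.timestamp
FROM observations AS o
JOIN media AS m ON m.mediaGroupID = o.mediaGroupID;
```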
@peterdesmet Please edit this comment if you find that I have missed sth (or if sth is not clear enough)!
Best, K
Thanks @kbubnicki, great summary of our discussion. I just want to add that in suggestion 4 the number of records in `mediagroups` is always going to be the same as the number of records in `media` (given you never mix a file- and sequence-based approach, which is a good limitation in my opinion). Knowing that, we can simplify things, which resulted in suggestion 5:
I’m all in favour of suggestion 5. Feedback welcome, especially from those that commented already @tucotuco @danstowell @jniedballa …
I'm not so excited by the idea of moving the bboxes into the `observations` table, for the reason that it then fails to support one of the important use cases we have here: objects detected in image-sequences, with a different bbox in each image, and then one overall identification applied to that sequence of bboxes. This is a real example from our insect-cameras, and probably occurs in plenty of other systems with bboxes tracked over time.
A workaround would be to repeat multiple rows in `observations` for each frame in this sequence, but that's tricky because we then wouldn't want users to sum the `count` column and over-count.
I can't comment on the file-size implications.
You write "Think about two-stage observation process" (detect, then identify) but to me that doesn't motivate the change.
A separate and minor comment: I suggest that the arrays-of-bboxes format might be a bit troublesome for data consumers - it's starting to look like structured data inside a CSV cell.
I don't have a lot of time to comment in detail (i.e., offer alternative solutions) right now.
- `mediagroups` do not have distinct attributes.
- An observation can not be linked to a single `media` in a `mediagroup` that has more than one `media`. Changing the column `mediaGroupID` in `observations` to point to `media` rather than `mediagroups` would take care of that problem, but would not make it possible to allow `observations` on `mediagroups` directly. Solving both simultaneously takes me back to suggested change 1, where photos, photo sequences, and videos are all just media instances, distinguished by a type.
- Ids for `mediagroups` should not overlap with the ids for `media` (e.g., put medgr1 in place of med1 when med1 refers to a `mediagroup`).

Just had a chat with @peterdesmet about my most recent comments. If it will be a rule that data sets must contain either observations from media or observations from mediagroups, but never both, then my second concern doesn't really apply. Similarly, if data sets are never mixed, then the mediaID could act as a mediaGroupID for the sake of practicality (not having to mint another identifier). I cringe in terms of semantics (it was rejected that mediagroups were just a type of media), but that shouldn't matter until/unless these data start to be linked semantically.
I think the stipulation that a dataset is either sequence-based (observation - mediaGroup) or image-based (observation - media) is a fair stipulation that solves a number of problems. Since most datasets don't utilize multiple observation techniques (e.g., expert identification and computer vision model), adoption shouldn't be overly problematic for most providers. Several projects arrived at this same conclusion (after months of debate). To my knowledge, field testing this solution hasn't resulted in any significant problems. One important note: aside from the logistics and organization of the model, the impact of this resides in the analysis. To combine sequence- and image-based observations for modelling purposes, the calculation technique for the number of unique individuals over a given period of time is critical. The irony is that the image-based observations will be grouped over a specific time interval for modeling purposes. In other words, it's all sequences in the end.
> A workaround would be to repeat multiple rows in observations for each frame in this sequence, but that's tricky because we then wouldn't want users to sum the count column and over-count.
@danstowell That's why we have this field in Camtrap DP: https://tdwg.github.io/camtrap-dp/data/#observations.countnew
We use this field when annotating our camera trap records to track information about a "real" group size of animals staying for a while in front of a camera trap (or just passing it by). This applies to image-level annotation and prevents over-counting when aggregating data for analysis.
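A minimal sketch of how `countNew` would typically be used when aggregating image-level observations (grouping is illustrative; column names as in the Camtrap DP observations table):

```sql
-- Summing count would count the same animals on every image of a sequence;
-- summing countNew only counts individuals the first time they appear.
SELECT scientificName, SUM(countNew) AS total_individuals
FROM observations
WHERE observationType = 'animal'
GROUP BY scientificName;
```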
Hi all, I picked up this dormant issue with John Wieczorek (@tucotuco) in an effort to reach a recommendation. We mainly discussed the pros and cons of two of the main proposals suggested above.
I also compared how one would query data using either model, at https://github.com/peterdesmet/camtrap-dp-query-test (repository likely to be deleted at some point).
Our conclusion is that the mediaGroupID approach (Suggested change 5) is a reasonable simplification of the model. It is an improvement over the current model (where information is needlessly repeated) and plays well with the unified common model. It allows expressing bounding boxes (at the level of observations). If I read the comments above, this proposal is something that @kbubnicki @ben-norton @jniedballa and now @tucotuco could get on board with. I will create a pull request with the suggested changes. Thank you all for your patience and for participating in this discussion!
@danstowell you liked the possibilities of the 4th table approach - maybe especially as a model for Audubon Core - but for Camtrap DP we believe it would needlessly complicate things as an exchange format. Hope you understand.
One change we suggest is to rename `mediaGroupID` to `eventID`. As in, this is the event the data publisher chose to group their observations by. For image-based observations (the recommended approach), the selected events are the duration of the media file (image or video); for sequence-based observations, the selected events are the sequences. In software you can always create larger events (by grouping), but never smaller events.
Image-based (if we reuse identifiers):
media.csv
mediaID | eventID
------- | -------
med1 | med1
med2 | med2
observations.csv
observationID | eventID
------------- | -------
obs1 | med1
obs2 | med2
Sequence-based:
media.csv
mediaID | eventID
------- | -------
med1 | seq1
med2 | seq1
observations.csv
observationID | eventID
------------- | -------
obs1 | seq1
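A sketch of the resulting join (tables as in the small examples above); the same query works for image-based packages (event = one media file) and sequence-based packages (event = a sequence of files):

```sql
SELECT o.observationID, m.mediaID
FROM observations AS o
JOIN media AS m ON m.eventID = o.eventID;
```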
- `mediaGroupID` as a name: `eventID` is more neutral in observations.csv and doesn't imply the media will be grouped (i.e. they aren't in file-based observations).
- `sequenceID` as a name: `sequenceID` is a confusing term in observations.csv for file-based observations.

Quick update: we are still working on restructuring the model. The current approach is to abandon trying to capture image- vs event-based annotation in a single `observations` table, but to work with an `eventobservations` and an `imageobservations` table (in addition to a `media` and a `deployments` table).
The main advantage is clarity: easier for the user to understand and easier for us to document. Additionally, it allows exporting both approaches in a single package, e.g. AI image-level observations that underpin event-level consensus observations.
We are currently testing this approach and hammering out the details.
The suggested change (splitting the observation table) has been implemented in #289. All who participated here are welcome to review the changes.
Fixed in Camtrap DP 0.6 #297.
Congrats. That's a very challenging task.
In a discussion with @tucotuco on how to better align Camtrap DP with a common model for biodiversity data, a proposal came up on how to better structure sequences in Camtrap DP.
Preamble
For the purpose of this discussion, I want to clarify what we mean by a sequence here:
- `sequence interval`: "Maximum number of seconds between timestamps of successive media files to be considered part of a single sequence". As a result, a sequence can contain multiple triggers/bursts.
- The `sequence interval` is not a camera setting, but one set by the programme used to manage the images afterwards.
- Sequence-based observations are tied to the `sequence interval` settings that were chosen. With image-based observations you can choose yourself how to group images together in logical events based on their timestamp.

This proposal is not about whether image-based observations are better than sequence-based observations. The current situation is that both approaches exist (and likely will for a while) and Camtrap DP wants to support both.
The examples show how the data would look for 3 images, using image-based vs sequence-based observations. In the first 2 images a wild boar (Sus scrofa) can be seen.
Current situation 0
- Sequences are only represented as a grouping identifier (`sequenceID`), in both media and observations.
- Observations have `sequenceID` and `mediaID`, which are both foreign keys to the media table. Image-based observations need to populate both, sequence-based observations only `sequenceID`. As a result, joins between observations and media are conditional: you kinda need to know what key to use to make a join that will yield results. That is not great.
- `deploymentID` and `timestamp` are added to observations, so that they can be easily joined with deployments - without having to go over media - to get useful biological data (location, time, species).

Image-based observations
Sequence-based observations
Suggested change 1
In media.csv
- Sequences get their own rows; images get a `parentMediaID` to associate them with sequences. That allows joins to find the images that belong to a sequence.
- `filePath` and `fileMediaType` become optional fields. They are typically not populated for sequence rows.

In observations.csv

- All observations are linked to a single `mediaID`. That media row can be a single image (image-based observations) or a sequence. This is a huge benefit, as it no longer requires conditional joins.

Most importantly, we think this model better represents the actual situation with camera traps: deployments → generate media → generate observations
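A sketch of that single join under suggested change 1 (assuming tables `observations` and `media`; `seq1` is a hypothetical sequence identifier):

```sql
-- One join from observations to media, regardless of annotation level.
SELECT o.observationID, m.mediaID, m.filePath
FROM observations AS o
JOIN media AS m ON m.mediaID = o.mediaID;

-- If the matched media row is a sequence, its images are found via parentMediaID.
SELECT child.mediaID, child.filePath
FROM media AS child
WHERE child.parentMediaID = 'seq1';
```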
Image-based observations
Sequence-based observations
Suggested change 2 (a less drastic update to the current situation)
This was suggested in https://github.com/tdwg/camtrap-dp/issues/203#issuecomment-1046656754. Comments above that are about suggested change 1 only.