inbo / camtraptor

Camtraptor is an R package to read, explore and visualize Camera Trap Data Packages (Camtrap DP)
https://inbo.github.io/camtraptor/
MIT License

Parsing issue with UTC offset `read_camtrap_dp()` #187

Closed lrdijkhuis closed 3 months ago

lrdijkhuis commented 1 year ago

Camtraptor version 0.19.2. When a UTC offset has been specified for only part of the deployments, reading the Agouti export file with read_camtrap_dp() returns a parsing issue on dttm variables. Dttm values without a UTC offset are returned as <_NA_>, whereas the ones with a UTC offset are handled correctly. This is especially relevant for older projects without a project UTC offset and without deployment-specific UTC offsets. Would it be an option to build in a workaround, e.g. an extra argument that adds a user-specified offset to deployments without one?

damianooldoni commented 1 year ago

Thanks @lrdijkhuis for pointing this out! As mentioned in inbo/camtraptor#188, camtraptor uses frictionless's read_resource() function to read deployments, media and observations. So, read_camtrap_dp() should allow the user to make full use of the args of frictionless's read_resource() function.

Can you provide me with an example datapackage to reproduce your error? That way, I will also be able to test whether your deployments.csv is readable by frictionless's read_resource(). If not, then I will ping @peterdesmet to check it in frictionless.

My first impression is that you cannot mix timestamp formats within a resource (deployments, observations, media), but I would like to have something for testing. And I hope to be wrong 😄

lrdijkhuis commented 1 year ago

Hi @damianooldoni, thanks! I'm not in a position to share a datapackage. I've sent you an email regarding this issue.

For now, this is an example that gives the parsing error in the function:

DeploymentID | start | end
-- | -- | --
-- | 2020-10-16T10:45:49 | 2020-10-22T20:31:38
-- | 2020-02-26T12:59:48 | 2020-03-12T22:04:48
-- | 2020-12-13T17:12:33+01:00 | 2021-01-16T17:02:25+01:00
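
The symptom can be reproduced outside camtraptor with the start values from the example above. This is only a minimal sketch: it uses base R's `strptime()` as a stand-in for the reader that camtraptor relies on (an assumption made so the example has no dependencies), and it assumes the schema declares the single format `"%Y-%m-%dT%H:%M:%S%z"`. The variable names are illustrative.

```r
# Timestamps copied from the example above: two without an offset, one with.
start <- c(
  "2020-10-16T10:45:49",
  "2020-02-26T12:59:48",
  "2020-12-13T17:12:33+01:00"
)

# Write the offset as +hhmm instead of +hh:mm, the form that base R's %z
# is documented to accept on input.
start_normalised <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", start)

# Parse every value with one format that requires an offset.
parsed <- as.POSIXct(
  strptime(start_normalised, format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")
)

# TRUE marks the values that could not be parsed and became NA.
missing_offset <- is.na(parsed)
```

With a single offset-requiring format, the two values without an offset end up as NA, which matches the behaviour described in this issue.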

damianooldoni commented 1 year ago

While working on inbo/camtraptor#188 I came to the conclusion that I was actually doing it right: the camtraptor reading function uses the frictionless specs described in the datapackage.json file, so there is no need to expose the args of read_resource() to users.

Moreover, even with basic frictionless we could not solve this issue properly, as the datetime format MUST be unique.

@lrdijkhuis: the first solution is to change the format in the schema of deployments from "%Y-%m-%dT%H:%M:%S%z" to "". You should probably do the same for the schemas of media and observations as well. So, it's a change in datapackage.json that you can make manually.
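
As a rough illustration of such a manual schema change, the sketch below edits the `format` of every datetime field of one resource in an in-memory descriptor. It assumes the schema is stored inline in the descriptor (it may instead be referenced by URL, in which case this does not apply). The helper `set_datetime_format()`, the toy `package` list and the value in `placeholder_format` are all hypothetical; the placeholder is only there to make the example run and is not the value meant in the comment above.

```r
# Toy descriptor mimicking the relevant part of a datapackage.json
# (assumption: the schema is inline rather than a URL).
package <- list(
  resources = list(
    list(
      name = "deployments",
      schema = list(fields = list(
        list(name = "deploymentID", type = "string"),
        list(name = "start", type = "datetime", format = "%Y-%m-%dT%H:%M:%S%z"),
        list(name = "end", type = "datetime", format = "%Y-%m-%dT%H:%M:%S%z")
      ))
    )
  )
)

# Hypothetical helper: set `format` on all datetime fields of one resource.
set_datetime_format <- function(package, resource_name, new_format) {
  for (i in seq_along(package$resources)) {
    if (!identical(package$resources[[i]]$name, resource_name)) next
    fields <- package$resources[[i]]$schema$fields
    for (j in seq_along(fields)) {
      if (identical(fields[[j]]$type, "datetime")) {
        fields[[j]]$format <- new_format
      }
    }
    package$resources[[i]]$schema$fields <- fields
  }
  package
}

# Placeholder value, chosen for illustration only.
placeholder_format <- "%Y-%m-%dT%H:%M:%S"
patched <- set_datetime_format(package, "deployments", placeholder_format)
```

The same call could be repeated for the media and observations resources. Note that any single format still has to match every value in the column.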

Second solution could be a subcase of inbo/camtrapdp#105, something like:

correct_datetime(package,
  deploymentID = c("aaa", "bbb"),
  wrong_datetime = c(NA, NA),
  right_datetime = c(lubridate::as_datetime("2020-10-16T10:45:49"), lubridate::as_datetime("2020-02-26T12:59:48"))
)

Let me know what you think about it.

lrdijkhuis commented 1 year ago

Thanks for your reply. Regarding the first solution you propose: I cannot edit the deployments, observations, or media schema formats. These are retrieved within the resources in the datapackage.json and are inaccessible to users. The second solution you propose is highly impractical: we would have to change all datetimes throughout the three hierarchically structured files, hence a manual dttm argument for all photos, observations and deployments, and maybe even the classificationTimestamp as well. This issue seems best solved in the Agouti export tool, so that all projects can profit and users don't have to apply bandages to every datapackage export. I suggest a default offset of 00:00h if none has been specified in the project settings, which is then overruled by the deployment-specific offset (which could be the same offset).

damianooldoni commented 1 year ago

@yliefting: I think @lrdijkhuis is right. The Agouti export routine is the best place to fix such an issue.

peterdesmet commented 1 year ago

Thanks, I have reported this as an issue in the Agouti repository (link requires login).

peterdesmet commented 3 months ago

This is now resolved in Agouti. 4 months ago, @yliefting commented:

This has been solved by setting all deployments to a +00:00 offset if they didn't have any offset set (NULL). This mainly affected very old deployments, which were uploaded before we enforced adding a UTC offset.

One will have to create a new export to make sure all timestamps have offsets. Closing issue.
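
For an older export that cannot be regenerated, the same rule (treat a missing offset as +00:00) could be applied to the raw timestamp strings before they are parsed. The sketch below is not part of camtraptor or Agouti: `add_default_offset()` is a hypothetical helper, and appending +00:00 assumes the recorded clock times can be treated as UTC, as in the fix described above.

```r
# Hypothetical helper: append a default UTC offset to timestamp strings
# that do not already end in "Z" or an offset such as +01:00 or +0100.
add_default_offset <- function(x, offset = "+00:00") {
  has_offset <- grepl("(Z|[+-][0-9]{2}:?[0-9]{2})$", x)
  needs_offset <- !is.na(x) & nzchar(x) & !has_offset
  x[needs_offset] <- paste0(x[needs_offset], offset)
  x
}

# Values from the example earlier in this issue, plus a missing value.
start <- c(
  "2020-10-16T10:45:49",
  "2020-12-13T17:12:33+01:00",
  NA
)
start_fixed <- add_default_offset(start)
```

Values that already carry an offset, as well as missing values, are left untouched, so a deployment-specific offset still takes precedence over the default.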