matiasandina / fed3

An R package to read and analyze data from FED3 devices
https://matiasandina.github.io/fed3/

Consider Checking for duplicated data #10

Open matiasandina opened 1 year ago

matiasandina commented 1 year ago

We have currently implemented deduplicate_datetime(). Its goal is to fix duplicate timestamps that arise when events occur within the same second (timestamps have one-second precision).
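A minimal sketch of the kind of tie-breaking this involves (not the actual deduplicate_datetime() implementation; the function name and offset are illustrative only):

```r
library(dplyr)

# Hypothetical sketch: when several events share the same one-second
# timestamp, spread them out by adding small sub-second offsets so
# every datetime becomes unique.
fix_same_second_ties <- function(df) {
  df |>
    group_by(datetime) |>
    mutate(datetime = datetime + (row_number() - 1) * 0.1) |>
    ungroup()
}
```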

Duplicated data (real duplicated events) can appear when binding FED data from a single session. For example, you can generate this issue by copying the data from the SD card onto the computer multiple times without turning on the device. If you overwrite the data, duplicates will not happen, but if you keep the copies in separate folders (e.g., habituation, day1, day2), then you have duplicates, and we are currently not checking for that.

Because we don't check, this affects every step in the pipeline. One complication is that read_fed() would never encounter the duplicated data; it would only appear after we bind_rows(). So the user would need to remember to check themselves.
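One way to address this would be a check at the binding step. A rough sketch, assuming a hypothetical wrapper (bind_fed_sessions() is not an existing function in the package):

```r
library(dplyr)

# Hypothetical helper: bind session data frames and warn if fully
# duplicated rows are present (e.g., the same SD-card file was
# copied into more than one folder).
bind_fed_sessions <- function(...) {
  out <- bind_rows(...)
  n_dup <- sum(duplicated(out))
  if (n_dup > 0) {
    warning(n_dup, " fully duplicated rows found after binding; ",
            "consider dplyr::distinct() to drop them.")
  }
  out
}
```

Warning rather than silently dropping rows seems safer here, since identical rows could in principle also come from the same-second timestamp issue above.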

It's possible that this is such an edge case that it rarely happens in practice, but I am setting the record here to explain how it could occur and to suggest that maybe we should check for it.