inbo / bird-tracking

🛰🐦 Bird tracking - GPS tracking network for large birds
MIT License

Check Movebank summary numbers #119

Closed peterdesmet closed 4 years ago

peterdesmet commented 5 years ago

MH_WATERLAND

| animal-id | tag-id | records | start | end | file | tag | animal |
| --- | --- | --- | --- | --- | --- | --- | --- |
| L143473 | 6059 | 950 | 2016-06-01 13:45:36 | 2016-06-08 08:37:44 | ok | ok | ok |
| L143472 | 6058 | 85924 | 2016-05-02 13:12:02 | 2017-07-15 19:07:46 | ok | ok | ok |
| L143467 | 630 | 31070 | 2015-05-26 09:37:20 | 2016-03-25 23:32:02 | ok | ok (2 animals) | ok |
| L143457 | 623 | 62297 | 2013-07-22 10:36:10 | 2014-09-01 12:03:19 | ok | ok | ok |
| L143451 | 610 | 183985 | 2013-06-25 09:47:16 | 2018-07-28 08:51:38 | ok | ok | ok |
| H185298 | 630 | 475 | 2016-06-03 11:31:07 | 2016-06-13 12:06:54 | ok | ok (2 animals) | ok |
| H173481 | 586 | 13209 | 2013-05-16 20:01:27 | 2013-08-02 09:33:37 | ok | ok | ok |

MH_ANTWERPEN

| animal-id | tag-id | records | start | end | file | tag | animal |
| --- | --- | --- | --- | --- | --- | --- | --- |
| H171693 | 1603 | 2690(2) | 2018-07-18 10:45:39 | 2018-07-27 14:07:48 | ok (2688) | ok (2688 + 2) | ok (2688 + 2) |
| H197169 | 6330 | 28181 | 2019-04-18 09:32:15 | 2019-07-30 20:10:37 | ok | ok | ok |
| L177801 | 6329 | 17046 | 2019-05-16 07:41:08 | 2019-07-13 20:14:41 | ok | ok | ok |

H_GRONINGEN

| animal-id | tag-id | records | start | end | file | tag | animal |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 5325667 | 6064 | 781906 | 2014-07-04 15:17:46 | 2018-07-11 11:10:40 | dates/records off | dates/records 0 | dates/records 0 |
| 5327085 | 6065 | 21337 | 2016-06-18 07:59:05 | 2016-09-03 10:35:04 | dates off | dates off (deploy issue) | dates/records off (deploy issue) |
| 5336455 | 634 | 5420 | 2014-06-04 22:03:20 | 2014-06-23 18:37:02 | dates off | dates off (deploy issue) | dates/records off (deploy issue) |
| 5446465 | 291 | 178914 | 2012-05-10 15:39:00 | 2016-08-11 17:08:34 | dates/records off | dates/records 0 | dates/records 0 |

peterdesmet commented 5 years ago

@sarahcd, above is a summary of the discrepancies in the records and dates in the source data and how these are presented on Movebank for the H_GRONINGEN study (the other two studies are fine). I think the issue might be related to the deployments and how to resample records based on wider deployments. Could you let me know how to solve this?

peterdesmet commented 5 years ago

@sarahcd, see also #113

sarahcd commented 5 years ago

It looks like the study has been updated since you posted these numbers, is the issue resolved? These summary numbers are calculated based on the timestamps of events in the study, the deployments and deploy on/off timestamps if set, and number of flagged outliers (summarized in visible=false). So if deployment times or the imported timestamps are wrong, you'll want to correct those. Currently the study contains deploy-off dates from 2021 and 2024, and you also mention future dates in #118. If this is not intended, probably the months and dates were swapped in mapping the timestamps.
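As a rough illustration of how such summary numbers depend on those inputs, here is a minimal sketch. It is not Movebank's actual calculation: the function `summarize`, the dictionary keys `timestamp` and `visible`, and the sample events are all hypothetical and chosen only for this example.

```python
from datetime import datetime


def summarize(events, deploy_on=None, deploy_off=None):
    """Return (count, first, last) for visible events inside the deployment window.

    Hypothetical helper: events are dicts with a "timestamp" (datetime) and an
    optional "visible" flag; a missing deploy_on/deploy_off means "no bound".
    """
    kept = [
        e["timestamp"]
        for e in events
        if e.get("visible", True)  # flagged outliers (visible=False) are excluded
        and (deploy_on is None or e["timestamp"] >= deploy_on)
        and (deploy_off is None or e["timestamp"] <= deploy_off)
    ]
    if not kept:
        return 0, None, None
    return len(kept), min(kept), max(kept)


# Made-up events, not taken from the study.
events = [
    {"timestamp": datetime(2014, 6, 4, 22, 3, 20)},
    {"timestamp": datetime(2014, 6, 10, 12, 0, 0), "visible": False},  # outlier
    {"timestamp": datetime(2014, 6, 23, 18, 37, 2)},
    {"timestamp": datetime(2014, 7, 1, 0, 0, 0)},  # after deploy-off
]
summary = summarize(
    events,
    deploy_on=datetime(2014, 6, 1),
    deploy_off=datetime(2014, 6, 30),
)
print(summary)
```

In this sketch, a wrong deploy-off date or a shifted event timestamp changes both the record count and the first/last times, which is why both are worth checking when the numbers look off.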

Let me know if any of that sets you on the right track!

peterdesmet commented 5 years ago

> Currently the study contains deploy-off dates from 2021 and 2024

Yes, or undefined. These are deploy-off dates as stored for the tags in the UvA-BiTS database (and I'm fine with having them set to a date in the future) and uploaded as such as movebank_ref_data.csv. This is reflected correctly in the Movebank deployments.

The problem is that the Movebank summary numbers for the gps data are incorrect. I've looked again at the files I've uploaded (downloaded the original files) and they are correct. They do not contain dates in the future. Despite that, Movebank reports the Time of First Location and Time of Last Location for those files incorrectly (and, probably as a result, the number of records).

See left (correct numbers) and right (Movebank summary numbers):

5325667

- records: 781906 vs 774873
- start: 2014-07-04 15:17:46 vs 2014-01-08 00:57:42.000
- end: 2018-07-11 11:10:40 vs 2020-07-05 23:57:37.000

5327085

- records: 21337 vs 21337 OK
- start: 2016-06-18 07:59:05 vs 2016-01-07 00:03:48.000
- end: 2016-09-03 10:35:04 vs 2018-07-08 23:58:37.000

5336455

- records: 5420 vs 5420 OK
- start: 2014-06-04 22:03:20 vs 2014-04-06 22:03:20.000 (looks like a month switch, but since it only occurs here I don't think it's the actual bug)
- end: 2014-06-23 18:37:02 vs 2015-11-06 18:37:02.000

5446465

- records: 178914 vs 178913
- start: 2012-05-10 15:39:00 vs 2012-01-06 00:09:40.000
- end: 2016-08-11 17:08:34 vs 2018-07-07 23:52:10.000
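As a side check on the month switch noted for 5336455, the sketch below swaps day and month and, as an extra assumption, rolls a month number above 12 over into the following year(s). Nothing in this thread confirms that Movebank parses timestamps this way; the helper `swap_day_month_lenient` is hypothetical and only shows that the arithmetic reproduces both values reported for that animal.

```python
from datetime import datetime


def swap_day_month_lenient(ts):
    """Read the day as the month and the month as the day.

    Assumption for illustration only: a "month" above 12 rolls over into
    later years, as a lenient date parser might do.
    """
    month, day = ts.day, ts.month            # swapped fields
    year = ts.year + (month - 1) // 12       # carry whole years out of the month
    month = (month - 1) % 12 + 1             # bring the month back into 1..12
    return ts.replace(year=year, month=month, day=day)


start = swap_day_month_lenient(datetime(2014, 6, 4, 22, 3, 20))
end = swap_day_month_lenient(datetime(2014, 6, 23, 18, 37, 2))
print(start)  # 2014-04-06 22:03:20
print(end)    # 2015-11-06 18:37:02
```

This only shows that the two values for 5336455 are consistent with such a swap; it does not establish the cause of the wrong summary numbers.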

Should I remove all data and start over, or is there something you can do on your side? 🤷‍♂

sarahcd commented 5 years ago

Looks like the study has changed since you posted this (the notification was in my spam folder!), so I'm not sure what the current status is. In any case, I can see that the number of deployed locations in the summary statistics is very clearly wrong, so I will report this and CC you.

For general reference, since usually unexpected statistics are not caused by a problem with the calculations, here are things to consider:

peterdesmet commented 5 years ago

Hi @sarahcd, just a heads up that I could upload the acceleration data without any problem and that the summary stats for those are correct. The issue with the gps summary data remains.

peterdesmet commented 4 years ago

The remainder of this issue is described in #113, closing here