Closed leothomas closed 1 year ago
I have come to face, yet again, every developer's mortal enemy: timezones.
The recordings span several different timezones. The best practice for storing data which spans timezones (eg: credit card purchases) is to store the UTC timestamp in the database, and convert the timestamp to the user's local timezone in the frontend code.
This works well for most cases, since it references all local times against a single timezone (UTC). For example, if I run an online shop from New York and have customers all around the world, and I want to query all orders made between 2020-05-30T00:00:00 and 2020-05-31T00:00:00, I just have to convert the start and end datetimes to UTC (ie: 2020-05-30T04:00:00 - 2020-05-31T04:00:00, since New York is 4hrs behind UTC in late May, during daylight saving time) and compare that to all items in the database, which are also stored in UTC! I won't accidentally miss orders from Australia and New Zealand that already carry the next local date (eg: 2020-05-31T10:00:00 in Sydney, which is 2020-05-30T20:00:00 in New York), or include orders made in California at 2020-05-30T22:00:00 local time, which is after the New York cutoff.
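As a minimal sketch of the shop example above, the snippet below converts a New York local date range into UTC bounds using Python's standard `zoneinfo` module. The helper name `local_range_to_utc` is made up for illustration, and the sketch assumes IANA timezone data is installed on the system. Note that late May falls in daylight saving time, so the library applies a 4 hour offset rather than the 5 hour standard-time offset.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_range_to_utc(start_local: str, end_local: str, tz_name: str):
    """Convert two naive ISO local datetimes in tz_name to aware UTC datetimes.

    Hypothetical helper, not part of any existing codebase.
    """
    tz = ZoneInfo(tz_name)
    start = datetime.fromisoformat(start_local).replace(tzinfo=tz)
    end = datetime.fromisoformat(end_local).replace(tzinfo=tz)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


start_utc, end_utc = local_range_to_utc(
    "2020-05-30T00:00:00", "2020-05-31T00:00:00", "America/New_York"
)
print(start_utc.isoformat())  # 2020-05-30T04:00:00+00:00
print(end_utc.isoformat())    # 2020-05-31T04:00:00+00:00
```

The resulting bounds can then be compared directly against UTC timestamps stored in the database.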
However that doesn't work to find bird calls at sunrise, since sunrise in New York doesn't occur at the same time as sunrise in California. So in order to find all bird calls at sunrise we need to store the time of each recording in its local timezone.
If it weren't for the requirement to store each recording in its own timezone, we would be able to use a modulus operation directly on the UTC timestamp.
In the context of the online shop I would be able to find all orders received between 8am and 9am (UTC) every day with the condition: 60*60*8 <= TIMESTAMP % (60*60*24) <= 60*60*9.
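The modulus condition can be sketched in Python as follows. The function name `in_utc_hour_window` is hypothetical, and the window is expressed in UTC hours, which is exactly why this approach does not carry over to recordings that need local time of day.

```python
from datetime import datetime, timezone

SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def in_utc_hour_window(timestamp: int, start_hour: int, end_hour: int) -> bool:
    """Return True if a Unix timestamp falls between two UTC hours on any date."""
    # Unix time has no leap seconds, so the remainder is seconds since UTC midnight.
    seconds_into_day = timestamp % SECONDS_PER_DAY
    return (
        start_hour * SECONDS_PER_HOUR
        <= seconds_into_day
        <= end_hour * SECONDS_PER_HOUR
    )


ts_0830 = int(datetime(2020, 5, 30, 8, 30, tzinfo=timezone.utc).timestamp())
ts_1230 = int(datetime(2020, 11, 14, 12, 30, tzinfo=timezone.utc).timestamp())
print(in_utc_hour_window(ts_0830, 8, 9))  # True
print(in_utc_hour_window(ts_1230, 8, 9))  # False
```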
Possible metadata configurations:
2020-01-01T00:00:00+1000
)+10
) --> NOTE: this is a pretty bad idea since many timezones don't keep the same offset throughout the year (with daylight savings)
@leothomas noting that timezone conversion (and even worse, Australian daylight savings time!!!) is apparently the bane of most eco-audiologists as well.
Many Australian regions do observe daylight savings, but some do not. No animals observe daylight savings.
This looks like so much fun, right? From Wikipedia
Is there any sense of the data already being normalized for UTC?
It looks like the A2O acoustic workbench does have some consideration here for audio upload: https://github.com/QutEcoacoustics/audio-analysis/blob/e5756e14227b98d84c8f560333e4160e90a9e1c6/docs/basics/dates.md?plain=1#L26
TODO: Add seconds-past-civil-(twilight/dawn) metadata fields to all recordings. Don't forget the leap seconds! (j/k)
This does sound like a good question for Anthony + Paul.
Found the message from A2O I was looking for, from the "Filter by Time of Day" option available on any recording (e.g. https://data.acousticobservatory.org/projects/1/regions/72/points/285/audio_recordings)
Awesome! Thanks @LanesGood. In our case, I only have access to the UTC offset in the filename rather than the timezone itself, so I wouldn't be able to know whether or not that timezone is currently observing DST. For example, timezone A is UTC+10 when not in DST and UTC+11 when observing DST, while timezone B is UTC+11 and does not observe DST. If I have a filename with UTC+11, I can't know whether it was recorded in timezone A during DST or in timezone B. I think we should document that limitation.
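To illustrate the limitation, here is a small sketch using two real IANA zones as stand-ins for "timezone A" and "timezone B": `Australia/Sydney` (observes DST) and `Pacific/Noumea` (UTC+11 all year). These zones are chosen purely as an example and are an assumption, not the actual recording locations. The sketch also assumes IANA timezone data is installed.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

zone_a = ZoneInfo("Australia/Sydney")  # stand-in for timezone A (observes DST)
zone_b = ZoneInfo("Pacific/Noumea")    # stand-in for timezone B (no DST)

summer = datetime(2020, 1, 15, 6, 0)   # southern-hemisphere summer
winter = datetime(2020, 7, 15, 6, 0)   # southern-hemisphere winter


def offsets(naive: datetime):
    """Return the UTC offsets of both zones for the same wall-clock time."""
    return (
        naive.replace(tzinfo=zone_a).utcoffset(),
        naive.replace(tzinfo=zone_b).utcoffset(),
    )


summer_offsets = offsets(summer)
winter_offsets = offsets(winter)

# In January both zones report +11:00, so the offset alone cannot tell them apart.
print(summer_offsets[0] == summer_offsets[1])  # True
# In July the offsets differ (+10:00 vs +11:00).
print(winter_offsets[0] == winter_offsets[1])  # False
```

The offset in the filename is still enough to recover the local wall-clock time of the recording; what it cannot recover is which timezone produced it.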
Time of day is now available as part of the metadata. Thank you @leothomas!
The metadata currently contains a timestamp (number of seconds since 1970-01-01T00:00:00 UTC), which enables date range searches (eg: recordings from Nov 13th to Dec 16th, or recordings from 10:00 to 10:30 on Nov 14th). Adding an offset in seconds from 00:00:00 would allow searches of the type: all recordings that occur between 8:00 and 9:00, regardless of the date.
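One way this could look, as a rough sketch: derive both the UTC timestamp and the seconds since local midnight from a local datetime string that carries its UTC offset. The field names (`timestamp`, `utc_offset_seconds`, `time_of_day`) and the function `recording_metadata` are hypothetical, and the input format is assumed to match the `2020-01-01T00:00:00+1000` style mentioned earlier in the thread.

```python
from datetime import datetime

SECONDS_PER_DAY = 24 * 60 * 60


def recording_metadata(local_iso: str) -> dict:
    """Build illustrative metadata from a local datetime with a UTC offset."""
    # %z accepts offsets written as +1000 (no colon).
    recorded = datetime.strptime(local_iso, "%Y-%m-%dT%H:%M:%S%z")
    offset_seconds = int(recorded.utcoffset().total_seconds())
    timestamp = int(recorded.timestamp())  # seconds since the Unix epoch (UTC)
    return {
        "timestamp": timestamp,
        "utc_offset_seconds": offset_seconds,
        # Shift to local time, then keep only the seconds since local midnight.
        "time_of_day": (timestamp + offset_seconds) % SECONDS_PER_DAY,
    }


meta = recording_metadata("2020-01-01T08:30:00+1000")
print(meta["time_of_day"])  # 30600, i.e. 08:30 local time

# "All recordings between 8:00 and 9:00 local time, regardless of date":
print(8 * 3600 <= meta["time_of_day"] <= 9 * 3600)  # True
```

Keeping `timestamp` in UTC preserves date range searches, while `time_of_day` supports the local time-of-day filter without needing to know the recording's timezone.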