bids-standard / bids-specification

Brain Imaging Data Structure (BIDS) Specification
https://bids-specification.readthedocs.io/
Creative Commons Attribution 4.0 International

add field for timezone #1968

Open JojoVh opened 1 week ago

JojoVh commented 1 week ago

Your idea

Dear BIDS developers

I would like to raise the need for a timezone specification. If the UTC indicator 'Z' is missing, the time is assumed to be local time. This assumption does not necessarily hold when working internationally, and it leads to timestamps being shifted between time zones.

I would like to suggest adding this in the corresponding json file with the field "timezone", which SHOULD be specified.

{ "acq_time": { "Description": "date and time of the scan acquisition in the format YYYY-MM-DDThh:mm:ss", "Units": "seconds", "Timezone": "CET" } }

The time zone abbreviation would be taken from this list: https://www.utctime.net/time-zone-abbreviations

Would that be possible to implement?

I am referring to an adaptation of the part of the current specification that describes dates and timestamps (https://bids-specification.readthedocs.io/en/stable/common-principles.html#units):

Date time information MUST be expressed in the following format YYYY-MM-DDThh:mm:ss[.000000][Z] (year, month, day, hour (24h), minute, second, optional fractional seconds, and optional UTC time indicator). This is almost equivalent to the RFC3339 "date-time" format, with the exception that UTC indicator Z is optional and non-zero UTC offsets are not indicated. If Z is not indicated, time zone is always assumed to be the local time of the dataset viewer. No specific precision is required for fractional seconds, but the precision SHOULD be consistent across the dataset. For example 2009-06-15T13:45:30.
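For illustration, the format quoted above can be checked and parsed with the Python standard library. This is a minimal sketch, not the regular expression used by the BIDS schema: the pattern, the name `parse_bids_datetime`, and the limit of six fractional digits are assumptions made for this example.

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern for YYYY-MM-DDThh:mm:ss[.000000][Z].
# Assumption: at most six fractional digits, which is what strptime's %f accepts.
BIDS_DATETIME = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d{1,6})?Z?")


def parse_bids_datetime(value):
    """Return an aware UTC datetime if 'Z' is present, otherwise a naive one."""
    if not BIDS_DATETIME.fullmatch(value):
        raise ValueError(f"not a BIDS date-time: {value!r}")
    is_utc = value.endswith("Z")
    stamp = value[:-1] if is_utc else value
    fmt = "%Y-%m-%dT%H:%M:%S.%f" if "." in stamp else "%Y-%m-%dT%H:%M:%S"
    parsed = datetime.strptime(stamp, fmt)
    # Without 'Z' the result carries no time zone at all; the reader must guess one.
    return parsed.replace(tzinfo=timezone.utc) if is_utc else parsed
```

A value without `Z` comes back as a naive datetime, which is the case this issue is about: nothing in the value itself says which time zone it was recorded in.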

Remi-Gau commented 1 week ago

@JojoVh this is a bids specification issue, so I will move this to the right repo.

Remi-Gau commented 1 week ago

So this would be a recommended field for the description of acq_time in scans.json or sessions.json files?

effigies commented 1 week ago

What is the use case for specifying time zone? The two use cases we considered are wanting absolute time coordinates and wanting to preserve time of day information. For the latter, the time zone is not generally relevant.

If we do add timezones, I would probably prefer just to use full RFC3339.

JojoVh commented 5 days ago

Dear @Remi-Gau and @effigies

Thank you very much for your enthusiasm and feedback.

I would recommend the field most strongly for scans.json, because the time point of scanning matters most, but it also makes sense to recommend it for sessions.json.

The use-case for this is the following: For a BIDS dataset recorded at 1 center, it is logical to us to keep the local time in the files, as it corresponds to time-sensitive data such as medication intake, artifact annotation, and the curation of the file with the people outside of research, such as our collaborators at the clinic, who are using local time.

However, toolboxes such as mne and mne_bids make a distinction between local time and UTC. If UTC is not specified with the trailing Z, the time is read in the time zone of the computer. We found that in the context of data sharing and international collaboration, this caused significant issues, with shifts in artifact annotations and in chronic data (e.g. when determining day/night differences).
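The shift described above can be reproduced with the standard library alone. This sketch does not call mne or mne_bids. It only shows what happens when the same zone-less timestamp is attached to two different local time zones. The two fixed offsets are assumptions chosen for a mid-June date.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets chosen for illustration (mid-June):
# Berlin is at UTC+02:00, New York is at UTC-04:00.
BERLIN_SUMMER = timezone(timedelta(hours=2))
NEW_YORK_SUMMER = timezone(timedelta(hours=-4))

# Timestamp as stored in the dataset, without 'Z' and without an offset.
naive = datetime(2009, 6, 15, 13, 45, 30)

# The recording site and a collaborator abroad each assume their own local zone.
as_recorded = naive.replace(tzinfo=BERLIN_SUMMER)
as_read_abroad = naive.replace(tzinfo=NEW_YORK_SUMMER)

# The two readings refer to different absolute instants.
shift = as_read_abroad - as_recorded
print(shift)  # 6:00:00
```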

Shifting the entire dataset to UTC is not an option for us, but a standard field for the local time does not exist yet.

(As a maintainer, my colleague aims to have such a field read in by mne_bids when specified)

What is your opinion? Would such a field be possible to take up in the BIDS standard?

Best wishes Jojo Vanhoecke

ICN, Department of Neurology, Charité Universitätsmedizin - Berlin

effigies commented 4 days ago

Ah, thank you for making that clearer. Yes, if you need time-of-day and between-TZ timedeltas, then that's going to matter.

It looks like we specify in text that only Z and the empty string are permitted, but the regular expression we use actually permits /[A-Z]{2,4}/, which are alphabetic timezone codes.

IMO we should simply fully adopt RFC3339, which uses

   date-fullyear   = 4DIGIT
   date-month      = 2DIGIT  ; 01-12
   date-mday       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                             ; month/year
   time-hour       = 2DIGIT  ; 00-23
   time-minute     = 2DIGIT  ; 00-59
   time-second     = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap second
                             ; rules
   time-secfrac    = "." 1*DIGIT
   time-numoffset  = ("+" / "-") time-hour ":" time-minute
   time-offset     = "Z" / time-numoffset

   partial-time    = time-hour ":" time-minute ":" time-second
                     [time-secfrac]
   full-date       = date-fullyear "-" date-month "-" date-mday
   full-time       = partial-time time-offset

   date-time       = full-date "T" full-time

This does not use time zone codes but whole-minute UTC +/- offsets. Would this satisfy, or is the use of named timezone codes in JSON particularly important?
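As a small sketch of what a full RFC3339 value with a numeric offset would carry, the example below parses one with Python's `datetime.fromisoformat`. The timestamp is made up for illustration. Note that `fromisoformat` accepts a trailing `Z` only from Python 3.11 onwards, so the example uses a numeric offset.

```python
from datetime import datetime, timedelta, timezone

# Illustrative value: 13:45:30 local time, recorded two hours ahead of UTC.
stamp = "2009-06-15T13:45:30+02:00"

recorded = datetime.fromisoformat(stamp)

# Time of day as experienced at the recording site.
local_time_of_day = recorded.time()

# Absolute time coordinate, comparable across sites.
absolute = recorded.astimezone(timezone.utc)

print(local_time_of_day)     # 13:45:30
print(absolute.isoformat())  # 2009-06-15T11:45:30+00:00
```

Both use cases mentioned earlier in the thread, time of day and absolute time, can be recovered from the single value.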

JojoVh commented 2 days ago

Dear @effigies Thank you for your reply.

I am happy to read that your regex already permits alphabetic timezone codes. In general, I also like your suggestion to adopt RFC3339.

Yes, I think that documenting the timezone code itself is important. Documenting the UTC +/- offset at the point where each new recording is created is a potential pitfall, because the offset differs between Winter Time and Summer Time. From a practical point of view, I can hardly imagine that the person creating the BIDS dataset would make the effort to document the varying UTC offset separately for each recording. I do not think that is elegant.

If, however, the BIDS specification defines uniformly where to document the timezone with standard abbreviations (https://www.utctime.net/time-zone-abbreviations) in scans.json or sessions.json, a toolbox like mne_bids would be able to read the timezone and interpret the date-time values (from scans.tsv and sessions.tsv) with the correct UTC offset. (This can be different for each recording.)
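A rough sketch of how a reader could apply one documented timezone to naive timestamps is shown below. It is not mne_bids code. It assumes the field holds an IANA zone name such as `Europe/Berlin` rather than an abbreviation from the list above, because `zoneinfo` can then resolve the Winter Time and Summer Time offsets itself. The function name `to_aware` is hypothetical.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+, needs the IANA tz database on the system

# Hypothetical value of a dataset-level timezone field (assumption: IANA name).
BERLIN = ZoneInfo("Europe/Berlin")


def to_aware(naive_stamp, zone=BERLIN):
    """Attach the documented zone to a timestamp that has no 'Z' or offset."""
    return datetime.fromisoformat(naive_stamp).replace(tzinfo=zone)


winter = to_aware("2009-01-15T13:45:30")
summer = to_aware("2009-06-15T13:45:30")

# One documented zone yields the correct offset for each recording date.
print(winter.utcoffset())  # 1:00:00
print(summer.utcoffset())  # 2:00:00
```

With this approach the dataset creator documents the zone once, and the per-recording offset is derived from the date.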

What is your opinion?