nipy / heudiconv

Flexible DICOM conversion into structured directory layouts
https://heudiconv.readthedocs.io
Other
236 stars 126 forks source link

Use AcquisitionTime for acq_time column of scans tsv files #450

Closed tsalo closed 4 years ago

tsalo commented 4 years ago

Summary

It looks like heudiconv uses ContentTime in the scans tsv files instead of AcquisitionTime, although I think that AcquisitionTime would be more appropriate (i.e., closer to the actual start of the scan).

Quoting @pvelasco in https://github.com/physiopy/phys2bids/pull/219#issuecomment-626899271:

In dicoms, SeriesTime, InstanceCreationTime, and ContentTime are very similar, but AcquisitionTime is generally ~20 seconds earlier than the rest. We need to determine which time is closest to the scanner's trigger pulse.

From some documentation I found from Siemens on image reconstruction:

AcquisitionTime: The Time the acquisition of data started.

From an answer given by Chris Roden (dcm2niix):

Instance Creation Time 0008,0013 initially reflects the time an image leaves a parallel processing reconstructor, so it has a variable delay with regards to the acquisition time. Acquisition Time is more meaningful for data.

So, if the reconstruction involves something more than a straight FFT (i.e., GRAPPA, SENSE, etc.), that will explain the ~20 sec delay you find between the AcquisitionTime and the other times. If I remember correctly, for Siemens scanners, the "scanner trigger" is sent at the time of the excitation pulse of the first slice of each (collected) volume. So the very first trigger will be sent just a few milliseconds before the first chunk of data for the first slice for the first volume is acquired (AcquisitionTime).

This issue is related to #447.

yarikoptic commented 4 years ago

Interesting, in 0.4 we "correct date/time in BIDS _scans files", will check later from what other field we switched

tsalo commented 4 years ago

It looks like it was switched from SeriesTime in #73 a few years ago.

yarikoptic commented 4 years ago

looking at a sample (well -- abit atypical - the PD+T1w):

/home/yoh/datalad/dicoms/dartmouth-phantoms/bids_test6-PD+T2w/005-anat-PD+T2w_acq-tse-tra-3mm

$> grep AcquisitionTime __anat-PD+T2w_acq-tse-tra-3mm_20180226092619_5_e*json
__anat-PD+T2w_acq-tse-tra-3mm_20180226092619_5_e1.json: "AcquisitionTime": "09:32:3.255000",
__anat-PD+T2w_acq-tse-tra-3mm_20180226092619_5_e2.json: "AcquisitionTime": "09:32:3.302500",

$> grep -e ContentTime -e AcquisitionTime *dump | head            
000001.dcm.dump:(0008,0032) TM [093203.255000]                          #  14, 1 AcquisitionTime
000001.dcm.dump:(0008,0033) TM [093606.067000]                          #  14, 1 ContentTime
000002.dcm.dump:(0008,0032) TM [093205.107500]                          #  14, 1 AcquisitionTime
000002.dcm.dump:(0008,0033) TM [093606.146000]                          #  14, 1 ContentTime
000003.dcm.dump:(0008,0032) TM [093203.360000]                          #  14, 1 AcquisitionTime
000003.dcm.dump:(0008,0033) TM [093606.072000]                          #  14, 1 ContentTime
000004.dcm.dump:(0008,0032) TM [093205.210000]                          #  14, 1 AcquisitionTime
000004.dcm.dump:(0008,0033) TM [093606.150000]                          #  14, 1 ContentTime
000005.dcm.dump:(0008,0032) TM [093203.462500]                          #  14, 1 AcquisitionTime
000005.dcm.dump:(0008,0033) TM [093606.076000]                          #  14, 1 ContentTime

$> grep -e ContentTime -e AcquisitionTime *dump | tail
000066.dcm.dump:(0008,0032) TM [093204.845000]                          #  14, 1 AcquisitionTime
000066.dcm.dump:(0008,0033) TM [093606.136000]                          #  14, 1 ContentTime
000067.dcm.dump:(0008,0032) TM [093206.697500]                          #  14, 1 AcquisitionTime
000067.dcm.dump:(0008,0033) TM [093606.227000]                          #  14, 1 ContentTime
000068.dcm.dump:(0008,0032) TM [093204.947500]                          #  14, 1 AcquisitionTime
000068.dcm.dump:(0008,0033) TM [093606.140000]                          #  14, 1 ContentTime
000069.dcm.dump:(0008,0032) TM [093206.800000]                          #  14, 1 AcquisitionTime
000069.dcm.dump:(0008,0033) TM [093606.231000]                          #  14, 1 ContentTime
000070.dcm.dump:(0008,0032) TM [093205.052500]                          #  14, 1 AcquisitionTime
000070.dcm.dump:(0008,0033) TM [093606.144000]                          #  14, 1 ContentTime

For syncing phys recordings, as I have mentioned in our chat at AFNI hackathon, I think we need some markers to get clocks in sync, unless phys signal has consistent artifacts from MR so we could identify beginning/end of sequence. Relying on times alone would never be sufficient I am afraid regardless if it is Acquisition or Content Time. any of those could be used to only roughly identify and match sequences. And given that there is no ContentTime in sidecar .json, may be it is even better to keep that one in _scans to complement AcquisitionTime in sidecar? or may be we better make sure to extract and place all Time's (not just min or max of any) into sidecar file?

pvelasco commented 4 years ago
  • discrepancy is quite large (4 minutes?). I wonder if there is also contribution from having multiple clocks involved here

  • dcm2niix stores the first? (smallest) AcquisitionTime in .json

TL;DR You should use the smallest AcquisitionTime

Long answer:

I guess the ContentTime is the time at which the DICOM file was created (I haven't found a definition for it), while the AcquisitionTime is the time the first chunk of data for that DICOM file was acquired (this was from some official Siemens documentation on image reconstruction). In your sample dataset, it is a 2D, non-EPI Turbo Spin-Echo acquisition with two echoes/contrasts. Non-EPI means that you collect the phase-encoding lines one by one and, since you cannot reconstruct the image for that slice until you have all the phase-encoding lines, there will be a significant delay between the acquisition time (collection of the first line) and the generation of the image. (Back of the envelop calculation: it will be TR x N_PE / TurboFactor, with N_PE: number of phase-encoding lines.) It will be close to the lScanTimeSec (234 s) in your sample.

  • dcm2niix stores the first? (smallest) AcquisitionTime in .json

I assume so: the smallest acquisition time for all the DICOM files that go into a given NIfTI image. Because the PD- and the T2-weighted images come from different DICOM files, the AcquisitionTime for each one will be the minimum AcquisitionTime of the respective DICOM files. Since they come from data collected at different echo times, the difference in acquisition times should be the difference in echo times (TE2-TE1 = 94 - 9.4 = 84.6 ms). I'm not sure why you get about half of it...

tsalo commented 4 years ago

For syncing phys recordings, as I have mentioned in our chat at AFNI hackathon, I think we need some markers to get clocks in sync, unless phys signal has consistent artifacts from MR so we could identify beginning/end of sequence. Relying on times alone would never be sufficient I am afraid regardless if it is Acquisition or Content Time. any of those could be used to only roughly identify and match sequences. And given that there is no ContentTime in sidecar .json, may be it is even better to keep that one in _scans to complement AcquisitionTime in sidecar?

I have had success just using the relative times (i.e., working off of the exact times between scans and between physio trigger onsets in their own respective time series to figure out how they match up across time series). You can see in https://github.com/physiopy/phys2bids/pull/219#issuecomment-627418772 a couple of figures showing how the method performs on an example scanning session. That said, I can grab AcquisitionTime from json files if necessary. In fact, that's currently how the method is set up. But it would be easier to grab the scan tsv file. Plus, given what @pvelasco has said, it just seems like AcquisitionTime is the most appropriate time to keep.

tsalo commented 4 years ago

Just wanted to ping this so I can make some progress on physiopy/phys2bids#219. Would folks feel comfortable with me opening a PR?

yarikoptic commented 4 years ago

thank you @tsalo for the ping! I think we

tsalo commented 4 years ago

Okay, I can open an issue/PR to the BIDS specification. Thanks!

tsalo commented 4 years ago

Actually, after looking at the specification I'm wary of trying to prescribe the field used for acq_time in the scans file. The scans file is generic across modalities, so does it make sense to say which field in the raw data of one specific modality corresponds to a field in the file?

pvelasco commented 4 years ago

You could define the field acq_time as the time at which the acquisition of a file started. In the case of Siemens scanners, that corresponds to the smallest AcquisitionTime in all the DICOM files that contribute to a give NIfTI file. I don't know for other vendors.

tsalo commented 4 years ago

The manufacturer is something I haven't considered. @pvelasco do you have any idea who might know about other vendors?

pvelasco commented 4 years ago

I think @neurolabusc, or he might know somebody who knows.

yarikoptic commented 4 years ago

The scans file is generic across modalities, so does it make sense to say which field in the raw data of one specific modality corresponds to a field in the file?

I think there should be some wording in BIDS which would at least say something like "time when the first data point (could be a value in a time series, an MRI slice, etc) was acquired" -- something specific. Now it is not really clear what it corresponds to - beginning or the end... Someone could argue though that "end time" (the last data point) is a better time to keep track of, since then beginning of the data collection could be computed taking the full data time range. With "first data point" it might be a bit unclear since it doesn't account for that data point (e.g. a slice) actual acquisition duration, so it is really the time point AFTER acquisition of the first data sample.

neurolabusc commented 4 years ago

For input, are we talking specifically about Siemens data here?

I wrote some notes on this several years ago:

Siemens physiological files are not recorded precisely at 50Hz, and the physiological recordings store the start and end times for two clocks: the MPCU (physio clock) and MDH (DICOM clock). For our purposes, it is critical that we align the physiological times to the DICOM MRI data (saved with the MDH clock) to the MDH values stored in the physiological log files... tag 0008,0032 from this image, which stores the milliseconds since midnight according to the DICOM clock.

My memory is that the MPCU clock drifts a lot relative to the MDH. So failure to choose the correct clock could have profound consequences.

Siemens fMRI data from systems VA-VE are typically stored as mosaics, with one acquisition time 0008,0032 for the entire volume (the volume start time). The CSA file provides the MosaicRefAcqTimes which reports the slice times relative to this value. I would raise one caveat, which is Siemens Motion Correction (MoCo). I never use this, as I would prefer to model motion parameters offline. However, be aware that MoCo can generate negative slice time values. I am really not sure how the presence of negative slice times in MosaicRefAcqTimes should be interpreted with respect to the Physio clock.

neurolabusc commented 4 years ago

For completeness, here is a public example showing that the DICOM AcquisitionTime (0008,0032) for the first slice of the first volume of a fMRI series is preserved in the BIDS json file:

>dcmdump MR.1.3.12.2.1107.5.2.32.35131.2014031012525641770887330
...
(0008,0032) TM [135252.445000]                          #  14, 1 AcquisitionTime

>more ax_asc_36sl_9.json
...
        "AcquisitionTime": "13:52:52.445000",

>fslhd ax_asc_36sl_9.nii
descrip     TE=30;Time=135252.445;phase=1
neurolabusc commented 4 years ago

I looked at the code for @ostanley physio2bids and @pitchaim physio2bids. Both assume that the sampling rate stored in the .puls and .resp files is precise, with no drift relative to the DICOM clock. Specifically, if the file lists PULS_SAMPLES_PER_SECOND = 50 theses tools assume a precise value of 50 Hz. This was not my experience when I investigated this. To determine sample rate with respect to the DICOM clock, one examines the number of samples (make sure not to include triggers) and then use the formula (assuming first start time is first sample and end time is last sample, we put in the -1 to avoid a Fencepost error):

SampleRate = (Samples -1)/(LogStopMDHTime-LogStartMDHTime)

The times of the triggers must be calculated with respect to this sample rate to avoid drift. Since the MDH times are with respect to the DICOM clock, one only needs to read the AcquisitionTime for the start of the series (saved in the BIDS JSON file) to know the time of all triggers relative to the DICOM images in the series.

tsalo commented 4 years ago

@neurolabusc Thank you for the info about the physio data! On the physio side of things, I'm primarily working with BioPac data, where calculating sample rate like that wouldn't be possible since timestamps aren't included. Still, in the future when phys2bids (hopefully) supports scanner-generated physio data as well, it will be good to account for drift.

For completeness, here is a public example showing that the DICOM AcquisitionTime (0008,0032) for the first slice of the first volume of a fMRI series is preserved in the BIDS json file:

Does that mean that AcquisitionTime is standard for DICOMs from all vendors?

neurolabusc commented 4 years ago

Yes, the tag AcquisitionTime (0008,0032) is specified in the DICOM standard. It has a Value Representation (VR) of TM: A string of characters of the format HHMMSS.FFFFFF. For EPI data like fMRI the convention seems to be storing the time for the first slice in the volume for all slices in the 3D volume. This is unfortunate, as otherwise we could use it to determine slice timing. I think Siemens XA-series is an exception, where each slice gets its own time stamp. However, this new format is changing and time stamps were corrupted for XA data where the user chose to anonymize the images from the console.

tsalo commented 4 years ago

That's good to hear. So then it makes sense to use the minimum AcquisitionTime across dicoms within a scan for the acq_time field in the BIDS scans file.

It's unfortunate that AcquisitionTime lacks slice-level resolution, but getting the time for the first slice is actually what we need for heudiconv, so that's good at least. Thank you for your help!

tsalo commented 4 years ago

@yarikoptic I just opened https://github.com/bids-standard/bids-specification/issues/503. I can open an associated PR in the next couple of days.

pvelasco commented 4 years ago

Still, in the future when phys2bids (hopefully) supports scanner-generated physio data as well, it will be good to account for drift.

@tsalo, I have a tool (bidsphysio) that reads a variety of different physiology files (BioPac, DICOM from CMRR multiband EPIs and PMU (Siemens scanner-generated)) and saves them in BIDS physio format. You might want to take a look at it.

After looking at your PR in phys2bids (physiopy/phys2bids#219), I'm working on the synchronization of the physio files and the images, for our use case: when you have one physio file per functional run (a little bit different from yours). I hope you don't mind if I use your conversion.py function...

tsalo commented 4 years ago

@pvelasco lol I have glanced at bidsphysio before because of #446, but never actually looked through it because the repo's description just says "Converts physio data from a CMRR DICOM file". You should definitely update that to match what's in the README, because I had no idea it worked on BioPac and non-CMRR Siemens files too!

Absolutely, feel free to use conversion.py, although I don't know how effective it will be with only one run. Be forewarned that I'll also be working with the phys2bids devs to try to get that logic supported there too.