Open MichaelUM opened 2 years ago
This issue was discussed in a meeting of 17/04/2024 and it was agreed that this is a deficiency in the specification.
Ideally, we would like to specify that in situations where it does not make sense to index into the wavelengths array, that the index field no longer be required. This is likely limited to situations where the data type is one of a chromophore concetration. Implementation TBC.
I note that when we are in the 'chromophore space' rather than the 'wavelength space', all practical datasets will likely express every channel over the same set of chromophores, and thus a 'chromophore index' (or general spectroscopic index) could be implemented in a future revision which acts in the same way as the wavelength index.
Here is the example I brought up in our discussion. It is a file with processed data from Satori. Shown is the measurement list of a data element which contains both wavelength-dependent (RAW, dOD) and chromophore (HbO, HbR, HbT) data. In such a scenario the wavelengthIndex and datatypeIndex fields may be required for some entries but not for all. In the latter case a default / not used value is needed. I guess this is the scenario @MichaelUM had in mind.
In [20]: cedalion.io.snirf.measurement_list_to_dataframe(s.nirs[0].data[0].measurementList, drop_none=True)[::10]
Out[20]:
sourceIndex detectorIndex wavelengthIndex dataType dataTypeLabel dataTypeIndex
0 1 1 1 99999 RAW 1
10 4 2 1 99999 RAW 1
20 6 6 1 99999 RAW 1
30 9 20 1 99999 RAW 1
40 12 14 1 99999 RAW 1
50 15 15 1 99999 RAW 1
60 3 1 2 99999 RAW 1
70 5 7 2 99999 RAW 1
80 8 8 2 99999 RAW 1
90 11 12 2 99999 RAW 1
100 14 13 2 99999 RAW 1
110 1 16 1 99999 dOD -1
120 4 6 1 99999 dOD -1
130 7 5 1 99999 dOD -1
140 10 10 1 99999 dOD -1
150 13 11 1 99999 dOD -1
160 16 15 1 99999 dOD -1
170 3 4 2 99999 dOD -1
180 6 4 2 99999 dOD -1
190 9 9 2 99999 dOD -1
200 12 10 2 99999 dOD -1
210 14 15 2 99999 dOD -1
220 2 2 -1 99999 HbO -1
230 5 3 -1 99999 HbO -1
240 7 8 -1 99999 HbO -1
250 11 9 -1 99999 HbO -1
260 13 22 -1 99999 HbO -1
270 1 1 -1 99999 HbR -1
280 4 2 -1 99999 HbR -1
290 6 6 -1 99999 HbR -1
300 9 20 -1 99999 HbR -1
310 12 14 -1 99999 HbR -1
320 15 15 -1 99999 HbR -1
330 3 1 -1 99999 HbT -1
340 5 7 -1 99999 HbT -1
350 8 8 -1 99999 HbT -1
360 11 12 -1 99999 HbT -1
370 14 13 -1 99999 HbT -1
Yes @emiddell, I hope this feature was sort of intended in the format. I'm not sure but it allowed us to store all information in one file.
A solution thus seems to be to just clarify in the spec that a wavelength index is only needed for dataTypes for which a wavelength needs to be specified and that for all other dataTypes, it is not required to specify the wavelength index.
So, we need to update the spec to clarify this... and we need to verify that the validator(s) handle this condition.
Following meeting of May 24 it was decided that:
Probably this is already mentioned somewhere but I couldn't find it directly in the specs (only in #40, but not integrated). If required fields are not used they can't be empty and a default / not used value should be set instead. @dboas @jayd1860 suggested -1 for integers for example for wavelengthIndex=-1 and dataTypeIndex=-1 in preprocessed data. Strings could be handled as empty strings in general. It would be great if this could be stated in the spec. directly.