Donders-Institute / data-streamer

A service and web UI for managing lab data flowing to the DCCN project storage and the Donders Repository
https://uploader.dccn.nl
MIT License

Make use of subject and session input text fields instead of dropdown lists. #24

Closed aardkronkel closed 4 years ago

aardkronkel commented 4 years ago

In the streamer-ui, make use of subject and session input text fields instead of dropdown lists.

Note: the cascading selection fields should be the same as in the calendar booking form: [Project, Subject, Session, Modality]

aardkronkel commented 4 years ago

See commit ca5bb88ada3ccd38662847bf3a551f60328cd0e3. We now expect a 1- or 2-digit number for the subject and session labels.

robertoostenveld commented 4 years ago

Let me also ping @marcelzwiers on this.

It should be possible to specify both subject and session manually using the form [a-zA-Z0-9]*, i.e. repeats of characters and/or digits. Right now this is already used for session (which is usually mri01 or meg01). For subjects it is needed so that project-specific identifiers can be used, such as those used in POM.

aardkronkel commented 4 years ago

For your information: I currently use /project/{projectnumber}/raw/sub-{subjectlabel}/ses-{datatype}{sessionlabel} as the project storage destination path. Currently, {subjectlabel} and {sessionlabel} have the form ^[0-9]{1,2}$ (i.e. a 1- or 2-digit input number, which is converted to a 2-digit output number with a leading zero if needed). {datatype} can be one of the following: mri, meg, eeg, ieee, beh, or a user-specified string of the form ^[a-z]+$ (i.e. a lowercase string of one or more letters, no special characters).

See also: https://github.com/Donders-Institute/data-streamer/blob/master/docker/streamer-ui/docs/screenshot-data-streamer-ui.png
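
The destination path scheme described in this comment can be sketched roughly as follows. This is only an illustration, not the streamer-ui implementation: the function name `destination_path` and the project number in the usage line are hypothetical, and only the path template and the two patterns are taken from the comment.

```python
import re


def destination_path(project_number: str, subject: str, session: str, datatype: str) -> str:
    """Build the destination path under the scheme described above (hypothetical helper)."""
    # Subject and session labels: 1- or 2-digit numbers, as in ^[0-9]{1,2}$
    for name, value in (("subject", subject), ("session", session)):
        if not re.fullmatch(r"[0-9]{1,2}", value):
            raise ValueError(f"{name} label must be a 1- or 2-digit number: {value!r}")
    # Data type: a lowercase string, as in ^[a-z]+$ (this also covers mri, meg, eeg, ...)
    if not re.fullmatch(r"[a-z]+", datatype):
        raise ValueError(f"datatype must be one or more lowercase letters: {datatype!r}")
    # zfill(2) adds the leading zero for 1-digit input
    return (
        f"/project/{project_number}/raw/"
        f"sub-{subject.zfill(2)}/ses-{datatype}{session.zfill(2)}"
    )


# The project number below is made up for the example.
print(destination_path("3010000.01", "1", "2", "mri"))
```

Under this scheme a subject label such as POM1783 is rejected, which is the restriction discussed in the rest of the thread.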

robertoostenveld commented 4 years ago

Then that needs to be fixed:

  1. datatype should not be part of the sessionlabel, the user has to specify himself that the data is recorded in the mri01 session (e.g. for presentation files), or in a separate behavioural session (which the user might want to represent as beh01).
  2. subjectlabel and sessionlabel don't have to be numeric (that restriction applies to the calendar, but should not apply to the labstreamer)

Within one session (e.g. a person coming to the mri lab, hence the sessionlabel mri01), multiple types of data can be recorded. On the target disk location those should be merged, for example as

root/sub-POM1783/ses-mri-01/01_localizer/
root/sub-POM1783/ses-mri-01/02_AAHead_Scout_32ch-head-coil/
root/sub-POM1783/ses-mri-01/...
root/sub-POM1783/ses-mri-01/06_t2_haste_sag_p2/
root/sub-POM1783/ses-mri-01/07_cmrr_RS_mbep2d_bold|_2.2x2.2_TR860/
root/sub-POM1783/ses-mri-01/08_diffusion_b1000_68dirs_1.8_slab/
root/sub-POM1783/ses-mri-01/09_t2_swi3d_tra_p2_0.7x0.7x0.7_TE20/
root/sub-POM1783/ses-mri-01/...
root/sub-POM1783/ses-mri-01/13_AAHead_Scout_32ch-head-coil/
root/sub-POM1783/ses-mri-01/...
root/sub-POM1783/ses-mri-01/17_tfl_mp2rage/
root/sub-POM1783/ses-mri-01/...
root/sub-POM1783/ses-mri-01/22_thal_svs_edit_859D-1.8x2.4x1.8_128av/
root/sub-POM1783/ses-mri-01/23_mot_svs_edit_859D-1.8x2.4x1.8_96av/
root/sub-POM1783/ses-mri-01/eeg/
root/sub-POM1783/ses-mri-01/beh/
root/sub-POM1783/ses-mri-01/emg/
root/sub-POM1783/ses-mri-01/eyetracking/
root/sub-POM1783/ses-mri-01/video/

The first part (starting with the scan number from 00 to 23) are the different MRI data types that are recorded and streamed automatically. The BIDScoiner will map them (later) to more meaningful datatypes, such as anat, func, dwi or fieldmap.

The latter ones (eeg, beh, eyetracking, etc) are already recognizable, since they are added by the researcher in the labstreamer webinterface.

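So the user of the webinterface specifies

  • project from list
  • subject=xxx as [a-zA-Z0-9]*
  • session=yyy as [a-zA-Z0-9]*
  • datatype=zzz from list, optionally other as [a-z]*

and the datafiles go to project/raw/sub-xxx/ses-yyy/zzz/...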
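
The project/raw/sub-xxx/ses-yyy/zzz layout proposed in this comment could be sketched as below. The helper name `target_dir` and the project root in the usage lines are hypothetical; the point is only that different data types recorded in one session end up next to each other under the same session directory.

```python
from pathlib import PurePosixPath


def target_dir(project_root: str, subject: str, session: str, datatype: str) -> PurePosixPath:
    """Return project/raw/sub-xxx/ses-yyy/zzz for the given labels (hypothetical helper)."""
    return PurePosixPath(project_root) / "raw" / f"sub-{subject}" / f"ses-{session}" / datatype


# Made-up project root; subject and session labels follow the example in the comment.
for datatype in ("eeg", "beh", "eyetracking"):
    print(target_dir("/project/3010000.01", "POM1783", "mri01", datatype))
```

All three directories share the parent .../sub-POM1783/ses-mri01, so manually uploaded data merges with the automatically streamed MRI series of that session.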

robertoostenveld commented 4 years ago

Please read https://bids-specification.readthedocs.io/en/stable/02-common-principles.html and let me know if anything is unclear.

robertoostenveld commented 4 years ago

Oh, and note that for DCCN raw data, we organize it in a BIDS-like structure, not in a proper BIDS structure (that only comes after bidscoiner or data2bids). That is why the datatypes in the directory names deviate in some places, and why we don't impose file format restrictions. But the directory structure is largely similar to BIDS.

marcelzwiers commented 4 years ago

I think this is a good proposal, and it allows for a simple extension into the current data flow (e.g. for simultaneous data-type recordings, as well as for free-format data types into ses-[free-format] session types).

On Tue 27 Aug 2019 at 17:50, Robert Oostenveld <notifications@github.com> wrote:

Then that needs to be fixed:

  1. datatype should not be part of the sessionlabel, the user has to specify himself that the data is recorded in the mri01 session (e.g. for presentation files), or in a separate behavioural session (which the user might want to represent as beh01).
  2. subjectlabel and sessionlabel don't have to be numeric (that restriction applies to the calendar, but should not apply to the labstreamer)

Within one session (e.g. a person coming to the mri lab, hence the sessionlabel mri01), multiple types of data can be recorded. On the target disk location those should be merged, for example as

root/sub-POM1783/ses-mri-01/01_localizer/
root/sub-POM1783/ses-mri-01/02_AAHead_Scout_32ch-head-coil/
root/sub-POM1783/ses-mri-01/...
root/sub-POM1783/ses-mri-01/06_t2_haste_sag_p2/
root/sub-POM1783/ses-mri-01/07_cmrr_RS_mbep2d_bold|_2.2x2.2_TR860/
root/sub-POM1783/ses-mri-01/08_diffusion_b1000_68dirs_1.8_slab/
root/sub-POM1783/ses-mri-01/09_t2_swi3d_tra_p2_0.7x0.7x0.7_TE20/
root/sub-POM1783/ses-mri-01/...
root/sub-POM1783/ses-mri-01/13_AAHead_Scout_32ch-head-coil/
root/sub-POM1783/ses-mri-01/...
root/sub-POM1783/ses-mri-01/17_tfl_mp2rage/
root/sub-POM1783/ses-mri-01/...
root/sub-POM1783/ses-mri-01/22_thal_svs_edit_859D-1.8x2.4x1.8_128av/
root/sub-POM1783/ses-mri-01/23_mot_svs_edit_859D-1.8x2.4x1.8_96av/
root/sub-POM1783/ses-mri-01/eeg/
root/sub-POM1783/ses-mri-01/beh/
root/sub-POM1783/ses-mri-01/emg/
root/sub-POM1783/ses-mri-01/eyetracking/
root/sub-POM1783/ses-mri-01/video/

The first part (starting with the scan number from 00 to 23) are the different MRI data types that are recorded and streamed automatically. The BIDScoiner will map them (later) to more meaningful datatypes, such as anat, func, dwi or fieldmap.

The latter ones (eeg, beh, eyetracking, etc) are already recognizable, since added by the researcher in the labstreamer webinterface.

So the user of the webinterface specifies

  • project from list
  • subject=xxx as [a-zA-Z0-9]*
  • session=yyy as [a-zA-Z0-9]*
  • datatype=zzz from list, optionally other as [a-z]*

and the datafiles go to project/sub-xxx/ses-yyy/zzz/...


marcelzwiers commented 4 years ago

I can't remember how I constructed the regexp anymore (:-p), and I think the replace command is redundant, but this is what I use to clean up user-entered strings that have to be used as BIDS values:

import re

def cleanup_value(label):
    """
    Converts a given label to a cleaned-up label that can be used as a BIDS label. Remove leading and trailing spaces;
    convert other spaces, special BIDS characters and anything that is not an alphanumeric to a ''. This will for example
    map "Joe's reward_task" to "Joesrewardtask"

    :param label:   The given label that potentially contains undesired characters
    :return:        The cleaned-up / BIDS-valid label
    """

    if label is None:
        return label

    special_characters = (' ', '_', '-', '.')

    # Strip surrounding whitespace and drop the characters that have a special meaning in BIDS names
    for special in special_characters:
        label = str(label).strip().replace(special, '')

    # Drop anything else that is not a word character, dash or dot
    return re.sub(r'(?u)[^-\w.]', '', label)
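
As a side note on whether the replace step is redundant: a quick check suggests it is not, because the character class [^-\w.] leaves dashes, dots and underscores (which are part of \w) untouched. The two helper names below are made up for this comparison and are not part of the code above.

```python
import re


def regex_only(label: str) -> str:
    """Apply only the final re.sub step of the cleanup."""
    return re.sub(r'(?u)[^-\w.]', '', label)


def replace_then_regex(label: str) -> str:
    """Apply the replace loop first, then the same re.sub step."""
    for special in (' ', '_', '-', '.'):
        label = label.strip().replace(special, '')
    return regex_only(label)


print(regex_only("Joe's reward_task"))          # Joesreward_task
print(replace_then_regex("Joe's reward_task"))  # Joesrewardtask
```

So with the regexp as written, the replace loop is what removes the underscore in the docstring example.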

On Tue 27 Aug 2019 at 15:43, Robert Oostenveld <notifications@github.com> wrote:

Let me also ping @marcelzwiers on this.

It should be possible to specify both subject and session manually using the form [a-zA-Z0-9]*, i.e. repeats of characters and/or digits. Right now this is already used for session (which is usually mri01 or meg01). For subjects it is needed so that project specific identifiers can be used, e.g. such as those used in POM.


aardkronkel commented 4 years ago

@robertoostenveld and @marcelzwiers Thank you for the clarification.

I have some questions concerning the xxx, yyy and zzz strings:

  1. Do I understand correctly that the example root/sub-POM1783/ses-mri-01/22_thal_svs_edit_859D-1.8x2.4x1.8_128av/ gives xxx = POM1783, yyy = mri-01 and zzz = 22_thal_svs_edit_859D-1.8x2.4x1.8_128av?

  2. Using regex [a-zA-Z0-9]* allows yyy = mri01 but not yyy = mri-01. Should we make it [a-zA-Z0-9\-]* instead to include dashes (i.e. -)?

  3. In this example, zzz is not of the form [a-z]*?

  4. Using regex [a-zA-Z0-9]* and [a-z]* implies empty strings are also allowed. Should we use [a-zA-Z0-9]+ and [a-z]+ instead (i.e. one or more characters instead of zero or more characters)? If not, how to deal with the empty strings (i.e. in the target destination path)?

  5. Perhaps we should not use a cleanup_value kind of function in the data streamer UI like Marcel suggests? As Robert mentions, this should be dealt with during the conversion using data2bids or bidscoiner at a later stage?
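
The difference between the patterns in questions 2 and 4 can be checked with a small sketch. The constant and function names are made up for the illustration; `re.fullmatch` is used so that the whole input has to match the pattern.

```python
import re

ZERO_OR_MORE = re.compile(r"[a-zA-Z0-9]*")   # pattern from the earlier comments
ONE_OR_MORE = re.compile(r"[a-zA-Z0-9]+")    # variant from question 4
WITH_DASH = re.compile(r"[a-zA-Z0-9\-]+")    # variant from question 2, with + instead of *


def accepts(pattern: re.Pattern, value: str) -> bool:
    """True if the entire value matches the pattern."""
    return pattern.fullmatch(value) is not None


print(accepts(ZERO_OR_MORE, ""))        # True: * allows the empty string
print(accepts(ONE_OR_MORE, ""))         # False: + requires at least one character
print(accepts(ONE_OR_MORE, "mri01"))    # True
print(accepts(ONE_OR_MORE, "mri-01"))   # False: the dash is not in the character class
print(accepts(WITH_DASH, "mri-01"))     # True
```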

robertoostenveld commented 4 years ago

  1. Yes, except that the dash in mri-01 is a typo that I made in a previous comment. The session should be mri01.
  2. No, dashes are not allowed. They are used to separate the key from the value in the key-value list in the filename, e.g. key1-value1_key2-value2. Underscores separate the key-value pairs from each other.
  3. I am no regexp expert; please ask again if this pertains to functional design or implementation choices.
  4. Empty strings are not allowed. The implementation does not have to use regular expressions either; just validate that the input is a non-empty series of characters and digits.
  5. ?? I guess this is a question for @marcelzwiers.
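
Points 2 and 4 could be implemented without regular expressions along the following lines. This is a sketch under assumptions: both function names are hypothetical, "characters and digits" is read as ASCII letters and digits, and the parser is a deliberately strict toy version meant to show why a dash inside a value is a problem.

```python
def is_valid_label(value: str) -> bool:
    """Non-empty and made up of ASCII letters and digits only."""
    # str.isalnum() is False for "" but True for non-ASCII letters; isascii() narrows it down
    return value.isascii() and value.isalnum()


def parse_entities(name: str) -> dict:
    """Split key1-value1_key2-value2 into {key1: value1, key2: value2}."""
    entities = {}
    for pair in name.split("_"):       # underscores separate the key-value pairs
        parts = pair.split("-")        # a dash separates the key from its value
        if len(parts) != 2:
            raise ValueError(f"ambiguous key-value pair: {pair!r}")
        entities[parts[0]] = parts[1]
    return entities


print(is_valid_label("POM1783"), is_valid_label("mri-01"), is_valid_label(""))
print(parse_entities("sub-POM1783_ses-mri01"))
```

With this strict parser, `parse_entities("sub-POM1783_ses-mri-01")` raises a ValueError, because the extra dash makes it unclear where the key ends and the value begins.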

aardkronkel commented 4 years ago

See https://streamer-acc.dccn.nl for demo.

Concerning the 1 to 5 above:

I am closing this issue now.