pulselabteam / PulseDB


Calibration-free performance gap: #subjects #9

Closed JoAllg closed 1 year ago

JoAllg commented 1 year ago

The reasons you give for the performance gap in the calibration-free approach do make sense to me. However, from a computer scientist's perspective, the biggest reason is simply the low number of subjects. With few distinct subjects, the deep learning model overfits to subject-specific characteristics, which is exactly what inflates results on the calibration-based testing set, where the test subjects also appear in training (a minimal illustration of the two split types is sketched below).
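A minimal sketch (not from the paper, synthetic data only) contrasting a calibration-based split, i.e. a random split over sequences so test subjects also appear in training, with a calibration-free split grouped by subject ID, so test subjects are unseen. A model that memorizes subject-specific traits will look better on the first split than on the second.

```python
import numpy as np
from sklearn.model_selection import train_test_split, GroupShuffleSplit

rng = np.random.default_rng(0)
n_subjects, seqs_per_subject, seq_len = 50, 20, 125  # toy sizes, not PulseDB's
X = rng.normal(size=(n_subjects * seqs_per_subject, seq_len))
y = rng.normal(loc=120, scale=15, size=n_subjects * seqs_per_subject)
subject_ids = np.repeat(np.arange(n_subjects), seqs_per_subject)

# Calibration-based: random split over sequences; every test subject
# still has other sequences in the training set.
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.2, random_state=0)

# Calibration-free: split over subjects; no subject appears in both sets.
gss = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(gss.split(X, y, groups=subject_ids))
Xtr_cf, Xte_cf, ytr_cf, yte_cf = X[train_idx], X[test_idx], y[train_idx], y[test_idx]
```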

Having many sequences per subject is helpful, but a few sequences per subject should be enough. They would be especially useful if those sequences were taken in different activity states and therefore exhibit strong blood pressure variability [Mukkamala et al., "Step 2", p. 20 and Figure 8].

Referring to Figure 2 in your paper, where you compare the number of usable subjects against the number of sequences per subject in MIMIC-III and VitalDB: by this reasoning, I would rather have 3500 subjects × 200 sequences than 2700 subjects × 400 sequences, even though the total number of sequences is smaller. Measuring calibration-free performance for different numbers of subjects, given your extraction approach, could be a valuable experiment (a rough sketch of what I mean follows).
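A self-contained sketch of that experiment: keep the number of sequences per subject fixed, vary the number of training subjects, and track calibration-free error on a fixed set of held-out subjects. The synthetic data and the Ridge model are placeholders (scaled down in size), not PulseDB or your network; only the evaluation protocol is the point.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
seq_len, seqs_per_subject = 32, 50          # scaled-down placeholder sizes
n_total_subjects, n_test_subjects = 2000, 500

# Synthetic "subjects": each has a latent offset so subject identity matters.
subject_offsets = rng.normal(scale=10, size=n_total_subjects)

def make_subject_data(ids):
    X = rng.normal(size=(len(ids) * seqs_per_subject, seq_len))
    y = 120 + np.repeat(subject_offsets[ids], seqs_per_subject) + 5 * X[:, 0]
    return X, y

test_ids = np.arange(n_test_subjects)                  # held-out subjects
pool_ids = np.arange(n_test_subjects, n_total_subjects)
X_test, y_test = make_subject_data(test_ids)

for n_train_subjects in (200, 500, 1000, 1500):
    train_ids = rng.choice(pool_ids, size=n_train_subjects, replace=False)
    X_train, y_train = make_subject_data(train_ids)
    model = Ridge(alpha=1.0).fit(X_train, y_train)
    mae = mean_absolute_error(y_test, model.predict(X_test))
    print(f"{n_train_subjects} training subjects -> calibration-free MAE {mae:.2f} mmHg")
```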

Since I have not seen other papers examining or even discussing the calibration-based/calibration-free difference, I thought it would be great to hear your thoughts on this.

WeinanWang-RU commented 1 year ago

There have been several studies on similar topics this year: https://doi.org/10.1109/JSEN.2023.3272921 https://doi.org/10.1038/s41597-023-02020-6