beacon-biosignals / OndaBatches.jl

Local and distributed batch loading for Onda datasets
MIT License

Support sparse/overlapping labels #12

Open ericphanson opened 1 year ago

ericphanson commented 1 year ago

We have tables of labels, which could be sparse throughout a recording, and could be overlapping. That makes them a bad fit for LabeledSignals. Is there another way to use OndaBatches? What needs to happen here to support that?

kleinschmidt commented 1 year ago

I think there are two (possibly related) cases we may want to support here:

If the sparse label starts/stops are aligned with some reasonable sampling rate (edit: and there are no overlaps), we could already support the first one (if we move some internal code into this package). However, in using that code, we're running into trouble when some spans' starts/stops are not aligned with the label sampling rate (e.g., if we assume labels are sampled at 1 Hz and we have spans like 1.4-2.6 s, we're in trouble). In that situation, we have to decide how to convert the spans into a 1 Hz signal. Up to this point we've been able to get by by "snapping" the starts/stops to sample times, which allows us to use the standard mechanism implemented here. An alternative would be to create an all-zero signal and index into it with an AlignedSpan for each label. Dealing with overlaps would be annoying though, unless you do it with "soft labels" (adding up the votes associated with the one-hot encoding of hard labels, or just using soft labels directly).
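To make the "all-zero signal plus soft labels" idea concrete, here is a minimal sketch in plain Python. It is an illustration only and does not use the OndaBatches.jl or AlignedSpans.jl APIs. The function names (`snap_span`, `vote_signal`, `normalize_votes`), the round-to-nearest snapping rule, and the `(start_s, stop_s, class_index)` span layout are all assumptions made for this example. Each span is snapped to the label sampling grid, then adds one vote to its class at every sample it covers, so overlapping spans simply accumulate.

```python
import math


def snap_span(start_s, stop_s, rate_hz):
    """Snap a span in seconds to a half-open sample range [first, last).

    Assumption: each endpoint is rounded to the nearest sample (halves round up).
    Other rounding modes (e.g. always rounding inward) are equally plausible.
    """
    first = math.floor(start_s * rate_hz + 0.5)
    last = math.floor(stop_s * rate_hz + 0.5)
    return first, last


def vote_signal(spans, n_classes, n_samples, rate_hz=1.0):
    """Start from an all-zero (n_samples x n_classes) signal and add one vote
    per span at each sample it covers. Overlaps add up instead of conflicting."""
    votes = [[0.0] * n_classes for _ in range(n_samples)]
    for start_s, stop_s, class_index in spans:
        first, last = snap_span(start_s, stop_s, rate_hz)
        # Clip to the signal bounds so out-of-range spans are ignored safely.
        for i in range(max(first, 0), min(last, n_samples)):
            votes[i][class_index] += 1.0
    return votes


def normalize_votes(votes):
    """Turn vote counts into soft labels; unlabeled samples stay all-zero."""
    soft = []
    for row in votes:
        total = sum(row)
        soft.append([v / total for v in row] if total > 0 else list(row))
    return soft


# Hypothetical example: the 1.4-2.6 s span from the discussion (class 0)
# plus an overlapping 2.0-4.0 s span (class 1), labels sampled at 1 Hz.
example_spans = [(1.4, 2.6, 0), (2.0, 4.0, 1)]
example_votes = vote_signal(example_spans, n_classes=2, n_samples=5, rate_hz=1.0)
example_soft = normalize_votes(example_votes)
```

With these inputs, the 1.4-2.6 s span snaps to samples 1 and 2 and the 2.0-4.0 s span to samples 2 and 3, so sample 2 ends up with an even split between the two classes while samples 0 and 4 remain unlabeled (all zeros). Whether unlabeled samples should stay all-zero or map to an explicit "no label" class is a separate design decision that this sketch leaves open.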