Closed ericphanson closed 2 years ago
just to recall from the docs
load(signal[, span]; encoded::Bool=false)
If span is present, return load(...)[:, span]
julia> load(signal)[:, signal.span] # error
you're indexing relative to the Samples
object itself, so this expression should always induce an error if start(signal.span) > Nanosecond(0)
. in other words, I think you might've wanted TimeSpan(0, duration(signal.span))
instead of signal.span
julia> load(signal, signal.span) # loses 1 second, no error Samples (00:09:59.000000000):
This should probably throw an error. not able to check right now but I bet the implementation is grabbing whatever data is available up until the stop
of the span.
IIRC the underlying API method used here does this explicitly/purposefully, since it tries to follow the familiar Base.read(::IOStream, nb::Integer)
behavior (e.g. "read at most nb
bytes")
Ok, I think if the load
errored the semantics would be more clear.
It seems like what a timespan is relative to changes a lot:
Signal
, start
is relative to the start timestamp of the associated Recording
(and signal.span
is then also relative to the recording)load
ing a Signal
, the span needs to be relative to the start(signal.span)
Samples
, its relative to start(span)
for whatever span
was load
ed.I think the discrepancy between (1) and (2) is what confused me here, but also (2) vs (3) can be confusing as well.
(1) vs (2) comes up if you are trying to use the same “wall clock” timespan for two signals from the same recording; you need to do a manual shift to get the same timespan.
(2) vs (3) I’ve basically got a handle on at this point (it’s a bit easier to understand I think since Samples don’t have any concept of a start position, so it has to work like that), but I think it’s still confusing when onboarding onto Onda and often requires manual shifts.
it’s a bit easier to understand I think since Samples don’t have any concept of a start position, so it has to work like that
yup, this is the concept to remember. Samples
instances are "context agnostic."
Onda actually started out the other way (Samples
were coupled with their origin context), and IMO things got much nicer when we changed to the current approach
when loading a Signal, the span needs to be relative to the start(signal.span)
The easier mental path might be: "Samples are context agnostic, so getindex(x::Samples, :, ::TimeSpan)
is of course relative to x
itself, and then load(signal, span)
is just an optimized fast path alternative to load(signal)[:, span]
in certain situations."
(given that it's supposed to be an easily swappable fast-path alternative, it'd feel weird IMO for span
behave differently in the two different expressions of the same operation)
but I think it’s still confusing when onboarding onto Onda and often requires manual shifts.
If helpful, we can add more content to examples/tour.jl
if it's not already covered there
(1) vs (2) comes up if you are trying to use the same “wall clock” timespan for two signals from the same recording; you need to do a manual shift to get the same timespan.
At least this is explicitly covered in the examples:
Onda actually started out the other way (Samples were coupled with their origin context), and IMO things got much nicer when we changed to the current approach
Ok, good to know that this was already tried!
I can see how the context-agnosticness can be useful for writing generic code.
The easier mental path might be: "Samples are context agnostic, so getindex(x::Samples, :, ::TimeSpan) is of course relative to x itself, and then load(signal, span) is just an optimized fast path alternative to load(signal)[:, span] in certain situations."
I think this makes sense; so instead of seeing it as 3 choices of relativeness, there's only 2 because load
should be the same as indexing.
I think it is still kind of confusing that load(signal, signal.span)
is different from load(signal)
. I get that follows from the "load == indexing" principle, but I still just find it a confusing API. It really just seems like if span
is omitted, then a natural default is the signal's span, but it's actually TimeSpan(0, duration(signal.span))
instead.
Was it like this in Onda v0.11 too? There it seems even more confusing since you are passing a recording UUID to load
: https://beacon-biosignals.github.io/Onda.jl/v0.11/#Onda.load, so I'd expect the span
to be relative to the recording's start time, not the samples object being pulled. Maybe you couldn't have signals with different start times relative to the recording then?
I think it is still kind of confusing that load(signal, signal.span) is different from load(signal). I get that follows from the "load == indexing" principle, but I still just find it a confusing API. It really just seems like if span is omitted, then a natural default is the signal's span, but it's actually TimeSpan(0, duration(signal.span)) instead.
Since the format allows signals to have their own spans relative to their recording, we will deal with "relativity" in the API no matter what, so we'll end up picking a consistent/well-documented convention and sticking with it regardless. Let's say we picked the other possible convention here - then I would make the claim that it feels confusing that load
's span
isn't the same as getindex
's span
, and that I'd expect it to not require offsetting to switch between the two equivalent segment loading paths (load(signal, span)
vs. load(signal)[:, span]
).
More philosophically, since load
is operating at a per-signal level of the Samples
API (it doesn't have any knowledge about recordings or cross-signal operations), it feels more consistent to me keep it recording-agnostic and "internal" to the provided signal. Let the caller do recording-aware/cross-signal transforms, it'd be weird to try to "do that for them" in this one case
One thing we technically could do to make our conventional choice more configurable to callers is to either provide a kwarg like relative=true
or even a RelativeTimeSpan
struct or something. But I feel like it's actually more desirable/readable/minimal for the caller to just do whatever transform they want to do on the span
they provide
Was it like this in Onda v0.11 too? There it seems even more confusing since you are passing a recording UUID to load: https://beacon-biosignals.github.io/Onda.jl/v0.11/#Onda.load, so I'd expect the span to be relative to the recording's start time, not the samples object being pulled. Maybe you couldn't have signals with different start times relative to the recording then?
ngl, I can't recall 😁
Yeah, that makes sense. Agreed it makes sense for the caller to do it, I think the issue is just that they might not realize they need to do it.
Hm, as a small tweak, what if we renamed the positional argument from span
to signal_subspan
or something like that? To indicate it is a relative subspan of the signal, not a span within the recording.
Discussed with @kimlaberinto and @hannahilea and we are kind of confused:
Is this how it is supposed to work? It doesn't seem right.