Open kecnry opened 2 years ago
@kecnry I'm taking up this issue and was wondering if you had further opinions since your original summary. I'm not familiar with how most people use specreduce, but I think it makes sense to rely on methods on Trace
rather than expecting users to check/set attributes directly.
I think that's probably fine myself. FlatTrace
actually has a set_position
method that achieves the same thing. In that case, just making the internal trace
and trace_pos
attributes as readonly properties should avoid the point of confusion, I think.
@kecnry Is there any reason to stick with using dataclasses
in tracing.py
? I've already spent a few hours working on making immutable attributes on a dataclass, but it's all pretty miserably convoluted. I think we/I would save a lot of effort by just writing the class out fully. Any opinions?
For broader discussion of this problem, see this blog post and its never-ending discussion in the comments.
After some discussion in #147, it has become clear that this is a more general "problem" than just FlatTrace
, so I'm copying some thoughts from that thread here so that we can have a discussion in context of other specreduce operations (tracing, background subtraction, extraction, etc) and copying @eteq @ojustino for their thoughts.
In general, do we want to support the user setting any attributes across any of the classes (background and extraction included) and have the internal information in those objects update accordingly? Currently in most or maybe all cases, the user technically can change those attributes after initializing the object, but that does not update the internal arrays or results, which causes confusion. We can either make all of those attributes read-only properties that can only be set when initializing or we need to explicitly define the behavior for setting (or disable setting) for each of the attributes (in which case we might want to consider migrating away from dataclass in favor of just defining properties and setters).
I agree that this should be addressed package-wide and not just in the tracing operations. At first glance, I don't see a problem with our operations' current attributes being read-only after initialization. It might be good to leave space for exceptions in case development takes us in another direction.
Is the discussion about the necessity of dataclass
still on the table? My impression is that the draw of dataclass
is greater brevity, but many of our __post_init__()
methods are long enough that I think the class definitions would look complicated to an outsider regardless. Our operations also don't currently carry too many attributes, so a getter/setter approach may not be much more unwieldy. Either way, I'm for whichever approach is most intuitive.
For example, shifting edits the internal trace, but not
trace_pos
, but changes totrace_pos
andtrace
do not affect each other:Updating the trace_pos during a shift (or addition) can be done easily by overloading
Trace.shift
. Do we also want to implement a setter on thetrace_pos
attribute to updatetrace.trace
? What about vice-versa? Or do we maketrace.trace
read-only and require any edits to be by settingtrace_pos
or shifting?