EIT-ALIVE / eitprocessing

Software for electrical impedance tomography data loading, visualization and processing
https://eit-alive.github.io/eitprocessing/
Apache License 2.0
5 stars 1 forks source link

Add DataContainer superclass #275

Closed psomhorst closed 4 months ago

psomhorst commented 4 months ago

Closes #272.

In this implementation, the superclass is not very functional. It only inherits from Equivalence, removing the requirement therefore from the subclasses.

Ideally, the superclass would take define some common fields (attributes). However, any field (attribute) defined in the superclass would be first in de init function. label and name have default values in EITData, which would not work with non-default fields like path. Also, path would no longer be the first argument, potentially breaking existing code. To move fields toDataContainer, they would have to be the first arguments, or we would have to switch to keyword-only arguments.

This PR also removes superfluous inheritance from Sequence.

psomhorst commented 4 months ago

@DaniBodor This is a super simple PR. I'd like not to complicate this PR, so we can use this superclass for #269.

DaniBodor commented 4 months ago

Ideally, the superclass would take define some common fields (attributes). However, any field (attribute) defined in the superclass would be first in de init function. label and name have default values in EITData, which would not work with non-default fields like path. Also, path would no longer be the first argument, potentially breaking existing code. To move fields toDataContainer, they would have to be the first arguments, or we would have to switch to keyword-only arguments.

I agree with this, and I think this should be implemented. I also think that the first fields should be identical for each container, and then the specific arguments per container should follow after that. Also, I think that nomenclature should be aligned more. For example, vendor and pixel_impedance in EITData in my mind is the equivalent of category and values (respectively) in the other containers.

Regarding breaking existing code, this is an argument that you have made a few times now (e.g. in issue #262). I don't think we have a large enough user base at this point to make that a prohibitive reason not to change things. We are very much still in the developmental phase of this project, and it's more important to "get things right", even if that means breaking changes, than ensuring backwards compatibility.

I think these kinds of changes can be implemented in a separate PR though, so that this one can stay light, as you suggest.

EDIT: I will copy these points to a new issue (#278), so that we can keep track more easily once this PR is closed. Let's try to discuss further there.

psomhorst commented 4 months ago

I agree with this, and I think this should be implemented. I also think that the first fields should be identical for each container, and then the specific arguments per container should follow after that.

I created an issue for this. We can discuss how to approach this in #279.

Also, I think that nomenclature should be aligned more. For example, vendor and pixel_impedance in EITData in my mind is the equivalent of category and values (respectively) in the other containers.

category is going to point to what type of data is stored in an object (e.g. impedance, pressure or breaths), so algorithms can check whether they support this type of data. I'm working on a proposal for a proper implementation of categories. A vendor is not a data type, and should not be a category.

We decided to explicitly make EITData.pixel_impedance different from values because it is not immediately apparent what the values are you get from EITData.values. Is it pixel impedance, or functional impedance, or...? I think one advantage of data-specific classes is that you can customize things like this.

Regarding breaking existing code, this is an argument that you have made a few times now (e.g. in issue #262). I don't think we have a large enough user base at this point to make that a prohibitive reason not to change things. We are very much still in the developmental phase of this project, and it's more important to "get things right", even if that means breaking changes, than ensuring backwards compatibility.

I agree the user base is not large, but it is annoying enough for a few people here working on multiple projects at the same time. I think we should at least be more clear about what breaks and how to fix it in release notes, and minimise the number of times things break.