Closed aaron-mcdaid-zalando closed 6 years ago
why we need that? how much memory you expect to save by this? (I mean kpi, features, variants are usually very small objects). Or did I misunderstood smth?
The data
object isn't small though, that is what I'm changing. I'm not making any change to kpi
, features
, variants
.
"We should not 'deepcopy' the pandas Dataframe. The dataframe is essentially immutable for the purposes of this. "We'll 'deepcopy' everything else though." - this is from your comment, why we need to deepcopy that?
we didn't have deepcopy of data before, or, we had that somewhere in expan service?
I think that the idea is quite good.
The only hypothetical objection would be that we're sticking to the idea of carrying around data frames, which I think breaks the separation of the pure logical description of a test and its actual execution.
@daryadedik, ExpanService.py
does a deepcopy of the StatisticalTestSuite
. Although, maybe we should simply remove that deepcopy and simplify things there, and therefore drop this PR. What do you think?
@aaron-mcdaid-zalando ah, we had a deepcopy in expan service, now I get your pull request :) I will think about it, but now we can keep it. Sorry for many comments.
I don't understand, why you even mention ExpanService here? You are developing self-sufficient, standalone library, focus on the API, which this lib should provide :)
@igusher , maybe this suggests that the boundary between expan and our internal Zalando tools is in the wrong place!
It is appropriate to define special methods, such as __repr__
and __len__
, in this repo. Therefore, __deepcopy__
is also suitable in this repo.
... but I'm not saying this in order to advocate again for this PR. I'm just observing that I'm confused about what design decisions were made when designing this API
@aaron-mcdaid-zalando it looks good to me. What is still in progress though?
In another product, we call
deepcopy
on lists ofStatisticalTest
. As a result, we might waste some memory copying dataframesMy understanding is that, within
StatisticalTest
, the dataframe is not changed. Therefore, it is not necessary to copy it. Is this correct? Is.data
"immutable" withinStatisticalTest