pyxem / orix

Analysing crystal orientations and symmetry in Python
https://orix.readthedocs.io
GNU General Public License v3.0
78 stars 45 forks source link

Performance: Speed up of deepcopy for Miller #461

Open CSSFrancis opened 8 months ago

CSSFrancis commented 8 months ago

Description of the change

Speeds up deep copy for Miller class. Current deep copy is slowed down by using the deepcopy function.

This causes issues in diffsims when we have to repeatedly copy and rotate some ReciprocalLatticeVector

For reviewers

hakonanes commented 8 months ago

Thank you for looking into making deepcopying faster.

Your solution only deepcopies the vectors and not the phase, which is not in line with the current behavior. The phase should not be shared. If deepcopying the phase is a bottleneck we should look into improving the speed of this method instead.

hakonanes commented 8 months ago

I realize now that copying only the data is what you want in the case you describe... Are you happy with a parameter deepcopy(data_only=False)? If true, your current solution is used. It is false by default, retaining the current behavior.

CSSFrancis commented 8 months ago

@hakonanes let me look into this more. I think deepcopying only the data is a good solution but I'll look into the phase deepcopy as well.

hakonanes commented 8 months ago

A sidenote: I've considered replacing deepcopy() with just copy() (and deprecating deepcopy() one minor release). The behavior is the same. But the naming is in line with NumPy's interpretation of a "copy" (not shared memory) and a "view" (shared memory). What do you think?

harripj commented 7 months ago

@CSSFrancis @hakonanes thanks for looking into this!

I think the deepcopy API should return a deep copy of all relevant data in this case. Whilst this PR may lead to speedups, I think this could be a footgun for the majority of users who would not expect any memory to be shared when using the API.

Are there other examples in other codebases where deepcopy returns partial copies?

If the speed ups are useful in diffsims then I think this PR should be an internal optimisation rather than in orix, such that the copying scope is well-defined and controlled, ie. it is known that phase is shared memory.

hakonanes commented 7 months ago

Thank you for your input, @harripj. I agree with you on all points. The deepcopy method is meant to be used by end users or in the setup of more demanding methods, not repeatedly inside a loop or similar.

But if we can get any speed up that would be nice.