Open eric-wieser opened 4 years ago
what are the performance benefits?
20% less time spent on multiplication, 30% less time on addition. Possibly a little less, since the benchmark in that PR folds in another change too.
But remember, the question isn't whether we want these benefits- it's whether we want user (or clifford.tools) code to automatically get the same benefit, or if we want to force that code to opt in by making this change everywhere:
-layout.MultiVector(my_optimized_function(...))
+layout.MultiVector(my_optimized_function(...), copy=False)
right. but the motivation behind this change is performance, correct?
That's one way of looking at it, We can get the performance boost either by:
copy=False
on 75 lines of our code, and telling users to do the same to their code. If they don't listen, all that happens is that user code is slower.My hunch is that the number of users using layout.MultiVector(some_array)
is approximately zero anyway, so the only point that matters is that choosing 1 over 2 saves us some code.
right. users who are constructing MV's manually like this probably care about performance, whould you agree? in this case i think breaking things is ok, and putting a warning in docs is good.
Yeah I agree, by far the main reason to be manually constructing multivectors in this way is for performance and in that case making the copy = false
default makes sense
Here are some reasons why I (and hence maybe users) build multivectors manually:
I don't know if any of those scenarios are likely to cause trouble
Having thought about this, I think the copy=True default makes more sense. I think it is important to remember our position in the GA software ecosystem and goals of the project. In most of the places I have seen Clifford in use our users are students (eg. in Brno) or mathematicians/applied scientists who are getting started with GA and want everything to just work and want to play around with a few algebras etc, not write super performant code (at first anyway). If we change it to not copy by default we introduce a higher probability of writing bugs, especially for inexperienced users.
Of course we would in general love to have fast code. Users who want higher performance for some bits of their code can change to using copy=False for those sections. In doing this they make a conscious decision and, hopefully, will be aware of the safety tradeoff inherent in their choice.
So overall on reflection I think we should just add the sprinklings of copy=False into the tools, its not a big pain for us at all, but a new user getting behaviour they don't expect is definitely something that would sit badly with them.
Another possibility to consider: remove the default; that is, force all callers to make a conscious decision by saying copy=True or copy=False.
Benefits:
Almost all of our operations create an extra copy (see #280). The fix proposed there is to opt-in for copy-free construction (spelt
layout.MultiVector(value, copy=False)
). However, it seems that every single construction withinclifford
would want this optin, which would make it seem better as a default.If we made not copying the default, there are two types of code that would start doing the wrong thing:
Code which modifies arrays to create multiple multivectors
Code which modifies multivectors sharing the same array
Both of these are easy to fix by using
buffer.copy()
instead ofbuffer
, but:Does the benefit of not having
copy=False
sprinked throughout tools justify the cost of the potential breakage here?