Closed MilesCranmer closed 1 year ago
Benchmark | main | c1bdf62e6aa449... | t[main]/t[c1bdf62e6aa449...]
---|---|---|---
Quantity/creation/Quantity(x) | 3.2 ± 0.1 ns | 3.2 ± 0.1 ns | 1 |
Quantity/creation/Quantity(x, length=y) | 3.3 ± 0.1 ns | 3.7 ± 0.1 ns | 0.892 |
Quantity/with_numbers/*real | 2.9 ± 0.1 ns | 3.2 ± 0.1 ns | 0.906 |
Quantity/with_numbers/^int | 21 ± 1.7 ns | 10.1 ± 2.5 ns | 2.08 |
Quantity/with_numbers/^int * real | 21.1 ± 1.5 ns | 10.5 ± 2.4 ns | 2.01 |
Quantity/with_quantity/+y | 6.7 ± 0.3 ns | 6 ± 0.3 ns | 1.12 |
Quantity/with_quantity//y | 3.3 ± 0.1 ns | 3.7 ± 0.1 ns | 0.892 |
Quantity/with_self/dimension | 1.6 ± 0.1 ns | 1.6 ± 0.1 ns | 1 |
Quantity/with_self/inv | 3.3 ± 0.1 ns | 3.7 ± 0.1 ns | 0.892 |
Quantity/with_self/ustrip | 1.6 ± 0.1 ns | 1.6 ± 0.1 ns | 1 |
time_to_load | 0.136 ± 0.0012 s | 0.165 ± 0.0014 s | 0.822 |
A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR. Go to "Actions" -> "Benchmark a pull request" -> [the most recent run] -> "Artifacts" (at the bottom).
It seems like the remaining performance issue was a type inference problem. The change:

```diff
- return QuantityArray(output_array, dimension(first_output), ElType)
+ return QuantityArray(output_array, dimension(first_output)::dim_type(ElType), ElType)
```

seems to have fixed it.
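For illustration, the kind of inference problem a return-site type assertion fixes can be reproduced in a few lines. This is a hypothetical, minimal stand-in (`PhysDims` and `Q` are illustrative types, not DynamicQuantities' actual code):

```julia
# Hypothetical minimal reproduction of the inference issue; PhysDims and Q
# are illustrative stand-ins, not DynamicQuantities types.
struct PhysDims
    length::Int
end

struct Q
    value::Float64
    dims::PhysDims
end

# If the container is abstractly typed, the compiler can only infer `Any` here:
dims_unstable(xs) = first(xs).dims

# A type assertion at the use site pins the result to a concrete type,
# restoring type stability for downstream code:
dims_stable(xs) = first(xs).dims::PhysDims

xs = Any[Q(1.0, PhysDims(1)), Q(2.0, PhysDims(2))]

# Inference sees `Any` without the assertion, `PhysDims` with it:
Base.return_types(dims_unstable, (Vector{Any},))  # -> [Any]
Base.return_types(dims_stable, (Vector{Any},))    # -> [PhysDims]
```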
So it seems to be doing better than an array of quantities now, though still not as good as a naked array.
@gaurav-arya I remember we discussed this at some point. Curious to hear if you have any suggestions.
```julia
julia> f(x) = @. (x^2 * 0.9 + 0.9 * x / 0.1 * x)
f (generic function with 1 method)
```

Naked array:

```julia
julia> @btime f(x) setup=(x = randn(100000));
  39.416 μs (4 allocations: 781.31 KiB)
```

Array of quantities:

```julia
julia> @btime f(x) setup=(x = randn(100000) .* u"km/s");
  328.583 μs (4 allocations: 3.81 MiB)
```

Quantity array:

```julia
julia> @btime f(x) setup=(x = QuantityArray(randn(100000), u"km/s"));
  134.000 μs (38 allocations: 783.48 KiB)
```
It's strange that there are so many allocations, even though the total memory is about the same as for the naked array. But it is great that the compiler seems to be skipping some of the dimensionality checks!
Still no speedup on this broadcasting test:

```julia
f(x) = x^4 * 0.9 - x * x / 0.3 * x * 0.9 * x
```

However, there is definitely a speedup on `f(x) = x^4` relative to an array of quantities. So maybe we just leave it at that? I guess some broadcasted functions might be too complex for the compiler inlining to eliminate the dimension check.
Edit: Fixed the issue discussed in this comment by propagating `ndims`, `size`, `axes`, `keys`, `getindex`, and `broadcastable` for `AbstractQuantity` to the value. Seems like anything that wants to be involved in broadcasting needs to define those.
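To illustrate the idea, here is a hedged sketch of making a scalar wrapper broadcast-compatible by defining those methods. `MyQ` is an illustrative stand-in, not the package's actual `AbstractQuantity` code, and the `keys`/`BroadcastStyle` definitions are simplifying assumptions:

```julia
# Hypothetical sketch: a scalar wrapper that participates in broadcasting
# by forwarding array-interface queries to its wrapped value (like `Ref`).
import Base.Broadcast: BroadcastStyle, DefaultArrayStyle, broadcastable

struct MyQ
    value::Float64
end

Base.ndims(::Type{MyQ}) = 0
Base.ndims(q::MyQ) = ndims(q.value)               # 0: scalar-like
Base.size(q::MyQ) = size(q.value)                 # ()
Base.axes(q::MyQ) = axes(q.value)                 # ()
Base.keys(::MyQ) = Base.OneTo(1)                  # a single scalar "index"
Base.eltype(::Type{MyQ}) = Float64
Base.getindex(q::MyQ) = q.value                   # zero-dim indexing unwraps
Base.getindex(q::MyQ, ::CartesianIndex{0}) = q.value
broadcastable(q::MyQ) = q                         # participate as-is
BroadcastStyle(::Type{MyQ}) = DefaultArrayStyle{0}()  # behave as a 0-dim container

# Broadcasting now unwraps the value automatically:
(v -> 2v).(MyQ(3.0))        # == 6.0
[1.0, 2.0] .+ MyQ(1.0)      # == [2.0, 3.0]
```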
> So maybe we just leave it at that? I guess some broadcasted functions might be too complex for the compiler inlining to eliminate the dimension check.

Agree about leaving further performance hacking to the future :)
Cool. So once #37 merges, I think all that's left is writing the unittests to get coverage back to 100%.
Seems like a bunch of the array tests pass on 1.10 but don't on 1.9... Maybe something in the broadcasting interface changed.
Woo! 100% testing coverage again!
I'll do another code review and then we can merge.
Thanks again for your help @gaurav-arya!!
This creates a `QuantityArray` type that stores a single set of physical dimensions for the entire array. It implements a custom broadcasting interface so as to create a new `QuantityArray` with the correct output dimensions. This part is a bit involved, as it needs to materialize the first element (thanks to tictaccat on Discourse for the help). Everything is type stable.
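As a rough sketch of the storage idea (names and fields here are illustrative, not the package's actual definition):

```julia
# Hypothetical sketch, not DynamicQuantities' actual implementation:
# one set of dimensions for the whole array, rather than one per element.
struct SimpleQuantityArray{T,N,D} <: AbstractArray{T,N}
    value::Array{T,N}   # raw numeric values, stored contiguously
    dimensions::D       # a single set of physical dimensions shared by all elements
end

Base.size(a::SimpleQuantityArray) = size(a.value)
# For illustration only, index into the raw values; the real type would
# return a quantity carrying `a.dimensions`.
Base.getindex(a::SimpleQuantityArray, i::Int) = a.value[i]

a = SimpleQuantityArray([1.0, 2.0, 3.0], (length = 1, time = -1))
```

Because the dimensions live outside the element storage, the memory layout matches a plain `Array`, which is what makes it possible for the compiler to skip per-element dimension bookkeeping.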
For safety purposes, the compiler is in charge of figuring out whether the dimension calculation can be skipped for each element. It should be able to skip it if it inlines things correctly, which happens for simple broadcasting; for more complex broadcasting it might not be able to inline things.
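As a hedged illustration of that point (simplified; symbols stand in for real dimension objects):

```julia
# Hypothetical sketch of why inlining matters: the dimension check is a
# plain runtime comparison, so it can only disappear if the compiler
# inlines it and proves both sides equal (e.g. via constant propagation).
@inline function check_dims(da, db)
    da == db || throw(DimensionMismatch("dimensions do not match"))
    nothing
end

# A toy "add two quantities" kernel; `:m` etc. stand in for dimension objects.
function add_values(x, dx, y, dy)
    check_dims(dx, dy)   # with inlining + constant folding, this branch can vanish
    x + y
end

add_values(1.0, :m, 2.0, :m)   # == 3.0
```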
@ChrisRackauckas do you think you or someone else in @SciML could review this to see if it fits your requirements?