Closed gdalle closed 1 year ago
Implemented the suggestion. Current code:
resid = similar(Y) # changes
mul!(resid, X, beta_d)
resid .= resid .- Y # short form is resid .-= Y
This is some high level thing...
julia> @btime genSigma($Y,$X,$beta_d,$nu0,$d0);
607.955 ns (2 allocations: 1.28 KiB)
julia> @btime genSigma_old($Y,$X,$beta_d,$nu0,$d0);
609.249 ns (3 allocations: 2.50 KiB)
This beats the dot product suggestion by halving the allocations. For the whole program it brings the allocations down by 10 percent. On Discourse I had the program using 51MB in allocations versus now 43MB. I am using @time here because I used @time there as well.
julia> @time include("MainNewB.jl")
0.042049 seconds (143.69 k allocations: 43.318 MiB, 4.79% gc time)
Edit: For the life of me, I don't understand the decision of the devs to add the += syntax. I really don't see how one would consider x = x + 3 so much longer than x += 3, and the += operator just makes the code harder to read. And you need to know the specific syntax.... 🥲
This is some high level thing...
As I said, here it isn't a bottleneck on the CPU time but it's good practice anyway :)
Edit: For the life of me, I don't understand the decision of the devs to add the += syntax. I really don't see how one would consider x = x + 3 so much longer than x += 3, and the += operator just makes the code harder to read. And you need to know the specific syntax.... 🥲
It exists in Python so I'm not very surprised. The only real problem I have with this command is that it's easy to forget it allocates by default, by creating x + y before pointing x to it.
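To make the point about += concrete, here is a minimal sketch. The arrays and the name alias are made up for illustration and are not from the repository. It shows that for arrays, x += y builds a new array and rebinds x, while x .+= y writes into the existing one:

```julia
x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
alias = x                      # second name bound to the same array as x

x += y                         # same as x = x + y: allocates a new array, rebinds x
rebound_same = (x === alias)   # false: x now points to a different array
alias_after_plus = copy(alias) # the original array is untouched: [1.0, 2.0, 3.0]

x = alias                      # bind x back to the original array
x .+= y                        # same as x .= x .+ y: updates the array in place
inplace_same = (x === alias)   # true: still the same array, now [11.0, 22.0, 33.0]
```

For scalars such as x += 3 there is no array to allocate, so the distinction only matters for containers.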
https://github.com/borisblagov/Julia_AR4_Bayesian_Regression/blob/9e5aa1749f4b0afb59a645c1f0696e85935e1d04/src/NewB.jl#L32-L32
You already got some suggestions about this on Discourse. I'm not sure this is a bottleneck because I haven't run your code, but if you wanted to optimize this to the brink, you could do something like this with an in-place matrix-vector product:
Note how I used dot syntax to avoid the allocations that would happen if instead I had done resid = resid - Y, resid .= resid - Y or resid = resid .- Y. The only way to not allocate is to do resid .= resid .- Y, which is the long form for resid .-= Y.
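As a rough sketch of the in-place pattern described above: the helper name residuals! and the small test data are assumptions for illustration, not code from the repository. The idea is to fill a preallocated buffer with mul! and then subtract Y with a fused in-place broadcast:

```julia
using LinearAlgebra  # provides mul!

# Hypothetical helper: overwrites resid with X * beta - Y without
# creating any temporary arrays.
function residuals!(resid, X, beta, Y)
    mul!(resid, X, beta)   # resid = X * beta, written into the existing buffer
    resid .-= Y            # long form: resid .= resid .- Y
    return resid
end

# Made-up data, only to exercise the function
X = [1.0 2.0; 3.0 4.0; 5.0 6.0]
beta = [0.5, -1.0]
Y = [1.0, 2.0, 3.0]

resid = similar(Y)         # buffer allocated once, reusable across calls
residuals!(resid, X, beta, Y)
```

If the function is called many times, as in a sampler loop, the buffer can be allocated once outside the loop and passed in on each iteration.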