borisblagov / Julia_AR4_Bayesian_Regression


Memory efficiency (2) #2

Closed gdalle closed 1 year ago

gdalle commented 1 year ago

https://github.com/borisblagov/Julia_AR4_Bayesian_Regression/blob/9e5aa1749f4b0afb59a645c1f0696e85935e1d04/src/NewB.jl#L32-L32

You already got some suggestions about this on Discourse. I haven't run your code, so I'm not sure this line is a bottleneck, but if you want to optimize it to the limit, you could use an in-place matrix-vector product like this:

using LinearAlgebra  # provides mul!
resid = similar(Y)  # the only allocation
mul!(resid, X, beta_d)  # resid = X * beta_d, computed in place
resid .-= Y

Note how I used dot syntax to avoid the allocations that would happen if I had instead written resid = resid - Y, resid .= resid - Y or resid = resid .- Y. The only form that does not allocate is resid .= resid .- Y, which is the long form of resid .-= Y.
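To make the difference concrete, here is a minimal sketch comparing the allocating and in-place forms with `@allocated`. The array sizes and the helper names `resid_alloc`, `resid_inplace!` and `measure` are made up for illustration and are not from the repository. The measurement is wrapped in a function so that global-variable overhead does not distort the numbers.

```julia
using LinearAlgebra  # provides mul!

# Hypothetical sizes, chosen only for illustration.
X = rand(100, 4)
beta_d = rand(4)
Y = rand(100)

# Allocating version: X * beta_d creates one temporary, the subtraction another.
resid_alloc(Y, X, beta_d) = X * beta_d - Y

# In-place version: writes into a preallocated buffer and returns it.
function resid_inplace!(resid, Y, X, beta_d)
    mul!(resid, X, beta_d)   # resid = X * beta_d, no temporary
    resid .-= Y              # fused broadcast, updates resid in place
    return resid
end

# Measure bytes allocated by each version.
function measure(resid, Y, X, beta_d)
    a = @allocated resid_alloc(Y, X, beta_d)
    b = @allocated resid_inplace!(resid, Y, X, beta_d)
    return a, b
end

resid = similar(Y)               # the only allocation, done once up front
measure(resid, Y, X, beta_d)     # warm-up call so compilation is not counted
bytes_alloc, bytes_inplace = measure(resid, Y, X, beta_d)
```

On a typical setup `bytes_inplace` should come out far smaller than `bytes_alloc`, usually zero, but the exact figures depend on the Julia version and the array sizes.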

borisblagov commented 1 year ago

I implemented the suggestion. The current code is:

    resid = similar(Y)          # changed
    mul!(resid, X, beta_d)
    resid .= resid .- Y         # short form is resid .-= Y

This is some high level thing...

julia> @btime genSigma($Y,$X,$beta_d,$nu0,$d0);
  607.955 ns (2 allocations: 1.28 KiB)

julia> @btime genSigma_old($Y,$X,$beta_d,$nu0,$d0);
  609.249 ns (3 allocations: 2.50 KiB)

This beats the dot product suggestion by roughly halving the allocated memory (2.50 KiB down to 1.28 KiB, with 2 allocations instead of 3). For the whole program it brings allocations down by over 10 percent: on Discourse the program allocated 51 MB, versus 43 MB now. I am using @time here because I used @time there as well.

julia> @time include("MainNewB.jl")
  0.042049 seconds (143.69 k allocations: 43.318 MiB, 4.79% gc time)

Edit: For the life of me, I don't understand why the devs decided to add the += syntax. I really don't see how x = x + 3 is so much longer than x += 3, and the += operator just makes the code harder to read. And you need to know the specific syntax... 🥲

gdalle commented 1 year ago

> This is some high level thing...

As I said, it isn't a CPU-time bottleneck here, but it's good practice anyway :)

gdalle commented 1 year ago

> Edit: For the life of me, I don't understand the decision of the devs to add the += syntax. I really don't see how one would consider x = x + 3 so much longer than x += 3 and the += operator just makes the code harder to read. And you need to know the specific syntax.... 🥲

It exists in Python, so I'm not very surprised. The only real problem I have with this operator is that it's easy to forget that it allocates by default: x += y first creates x + y as a new array and then points x to it.
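A small sketch of that rebinding behaviour, using made-up vectors: after `x += y` the name `x` refers to a freshly allocated array, while the dotted form `x .+= y` writes into the existing one. The variable names here are illustrative only.

```julia
y = [10.0, 20.0, 30.0]

# Plain +=: same as x = x + y. A new array is built, then x is rebound to it.
x = [1.0, 2.0, 3.0]
alias = x                                  # second name for the original array
x += y
rebinding_made_new_array = x !== alias     # alias still holds the old values

# Dotted .+=: same as x2 .= x2 .+ y. The existing array is overwritten.
x2 = [1.0, 2.0, 3.0]
alias2 = x2
x2 .+= y
broadcast_kept_same_array = x2 === alias2  # both names see the updated values
```

Checking identity with `===` is a quick way to tell whether an update reused the original array or silently created a new one.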