slimgroup / InvertibleNetworks.jl

A Julia framework for invertible neural networks

Type stability and performance #85

Open itsdfish opened 1 year ago

itsdfish commented 1 year ago

Hi all,

While trying to familiarize myself with your code, I noticed a lot of type-unstable code throughout the package, which made me wonder whether there are easy performance gains to be had. I started fixing some of it, but decided to stop because I am not familiar with the codebase and its design. For example, making Parameter type stable broke some tests, so it's unclear to me whether it can be made type stable, or what the best design approach might be. My suspicion is that there may be an opportunity for significant performance improvements. Is this an issue that has been considered?
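
To illustrate what I mean, here is a generic sketch (hypothetical structs, not the package's actual Parameter type): a field declared with an abstract type forces dynamic dispatch on every access, while a concrete parametric field lets everything be inferred.

using BenchmarkTools

# Hypothetical structs, purely to illustrate the general issue.
struct LooseParam
    data::AbstractArray        # abstract field type: element type unknown at compile time
end

struct TightParam{T<:AbstractArray}
    data::T                    # concrete parametric field type
end

sumdata(p) = sum(p.data)

x = randn(Float32, 10)
@btime sumdata($(LooseParam(x)))   # dynamic dispatch and boxing of the result
@btime sumdata($(TightParam(x)))   # fully inferred, no dynamic dispatch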

mloubout commented 1 year ago

This is one of the low-hanging fruits, yes, but we haven't had much time to look into it. The current performance has also been acceptable, at least for us, and I'm not sure how much would be gained to justify spending time on it.

Would you have a partial minimal example showing, for one case, that it's worth the effort? We will definitely try to work on it, as we want to make sure the community gets what it needs out of this package; we just want to check whether we should make it a priority.
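
Even something small would be enough, for example @code_warntype or Test.@inferred on a single layer's forward pass, plus a @btime before and after your change. A rough sketch of what I mean (the layer and sizes are arbitrary, purely for illustration):

using InvertibleNetworks, Test, Random, InteractiveUtils
Random.seed!(11)

# Small arbitrary layer, just to show the kind of check.
X = randn(Float32, 16, 16, 4, 2)
C = Conv1x1(4)

# Prints the inferred types of the forward pass; abstract (red) entries
# mark the spots where inference gives up.
@code_warntype C.forward(X)

# Throws an error if the return type cannot be inferred concretely,
# which is a quick way to flag a type-unstable call.
@inferred C.forward(X)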

itsdfish commented 1 year ago

Your suggestion to put together a minimal example is a good idea. I agree that we should determine the performance gain before making major changes. Given that I am not very familiar with your package and only have basic knowledge of neural networks, could you advise me on two points to help me get started?

First, what would you consider a good test case to benchmark? Perhaps there is code in the tests that would make a good basis, for example test_conditional_glow_network.jl.

Second, can you tell me whether the types in Parameter must change once the fields are initialized? In the code there are several nested structs, so one approach would be to fix the lower levels and work up through the hierarchy.
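
For reference, the bottom-up pattern I have in mind looks roughly like this (hypothetical struct names, not the package's actual types):

# Parameterize the innermost struct first; outer structs then inherit
# concrete types from it without the caller doing anything extra.
struct InnerParam{T<:AbstractArray}
    data::T
    grad::T
end

struct OuterLayer{PW<:InnerParam,PB<:InnerParam}
    W::PW
    b::PB
end

W = InnerParam(randn(Float32, 3, 3), zeros(Float32, 3, 3))
b = InnerParam(randn(Float32, 3), zeros(Float32, 3))
layer = OuterLayer(W, b)
all(isconcretetype, fieldtypes(typeof(layer)))   # true: every field type is concrete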

itsdfish commented 1 year ago

My first pass with ConditionalLayerGlow was unsuccessful. In my experience, fixing these kinds of issues gives a 1.5X to 3X speedup, but I may have missed something.

To the best of my knowledge, I made all structs associated with ConditionalLayerGlow type stable except Parameter. One problem with Parameter was a MethodError related to iteration. I'm not sure how to fix that, but I also noticed that there is a conversion function that can change the type of the data and grad fields, which may be why concrete field types don't fit cleanly. I'm not sure whether this is the reason for the lack of improvement. The changes I made can be found here.
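
To illustrate the tension (again with hypothetical structs, not the package's Parameter): with an abstract field, a conversion can swap in a differently typed array in place, whereas with a concrete parametric field the assignment silently converts back, so changing the type would mean rebuilding the struct.

mutable struct DynamicParam
    data::AbstractArray
end

mutable struct StaticParam{T<:AbstractArray}
    data::T
end

p1 = DynamicParam(randn(Float32, 4))
p1.data = randn(Float64, 4)        # fine: the stored type really changes

p2 = StaticParam(randn(Float32, 4))
p2.data = randn(Float64, 4)        # does NOT change the stored type:
typeof(p2.data)                    # setfield! converts back to Vector{Float32}

p3 = StaticParam(randn(Float64, 4))  # changing the type means rebuilding the struct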

My benchmark was based on code I found in the tests:

using InvertibleNetworks, LinearAlgebra, Test, Random
using BenchmarkTools
# Random seed
Random.seed!(11)

###################################################################################################
# Benchmark setup (adapted from the invertibility test)

# Input
nx = 24
ny = 24
k = 4
n_cond = k
n_hidden = 4
batchsize = 2

# Input images
X = randn(Float32, nx, ny, k, batchsize)
Cond = randn(Float32, nx, ny, k, batchsize)
X0 = randn(Float32, nx, ny, k, batchsize)
dX = X - X0

# 1x1 convolution and residual blocks
C = Conv1x1(k)
RB = ResidualBlock(Int(k/2)+n_cond, n_hidden; n_out=k, k1=3, k2=3, p1=1, p2=1, fan=true)
L = ConditionalLayerGlow(C, RB; logdet=true)
@btime L.forward($X, $Cond);
# original 337.165 μs (426 allocations: 1.46 MiB)
# with changes  338.217 μs (427 allocations: 1.46 MiB)

Y = L.forward(X, Cond)[1]
Y_, logdet = L.forward(X, Cond)
ΔY = ∇mse(Y_, Y)
@btime L.backward($ΔY, $Y_, $Cond)[1]
# original 1.499 ms (1418 allocations: 4.83 MiB)
# with changes 1.472 ms (1420 allocations: 4.83 MiB)
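
One possible explanation (just a guess at this point) is that the runtime here is dominated by the convolutions and matrix multiplies inside the residual block rather than by dynamic dispatch, in which case type-stability fixes alone would not move these numbers much. Something along these lines could help confirm where the time actually goes:

using Profile, InteractiveUtils

# Check whether inference actually fails on the hot path
# (same L, X and Cond as in the benchmark above).
@code_warntype L.forward(X, Cond)

# Profile the forward pass; if the flat profile is dominated by the
# convolution kernels and BLAS calls, fixing type instability will mostly
# shave off dispatch overhead rather than the bulk of the runtime.
Profile.clear()
@profile for _ in 1:1000
    L.forward(X, Cond)
end
Profile.print(format=:flat, sortedby=:count)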