Closed: deoxyribose closed this issue 2 years ago
Hi @deoxyribose, thanks for sharing!
As you're noticing, there is still work to do on improving performance in the static DSL. I remember that the `Map` combinator is currently somewhat inefficient: it usually allocates more memory than it needs to and relies on dynamic type inference. Fixing this is a TODO on my mind; however, I would still have expected it to beat the dynamic DSL here.
One thought that may help: do things improve if you make `pixel_kernel` static as well?
If not, could you share profiling output? I can take a look at what's taking the time and see if anything occurs to me that would improve this.
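Concretely, making the kernel static is just a matter of adding the `(static)` annotation. A hypothetical sketch, since the body of `pixel_kernel` isn't shown here (I'm assuming it samples one pixel from a normal centered on the tree's value at that location):

```julia
using Gen

# Hypothetical body: the actual pixel_kernel isn't shown in this thread.
@gen (static) function pixel_kernel(mean::Float64, noise::Float64)
    pixel = @trace(normal(mean, noise), :pixel)
    return pixel
end

# Older Gen versions require compiling static functions before first use.
@load_generated_functions()
```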
This sounds like a really neat application of Gen! If you're not planning on constraining individual pixel values, but only the entire image in one go, another way of writing `screen_model_dynamic` would be to sample all pixel values simultaneously from a multivariate normal distribution, and then reshape the resulting vector. This might reduce the overhead associated with having a large trace structure with thousands of individual random choices: instead, you'll have a trace with just two random choices, one for the tree and one vector-valued random choice for the image.
Thanks for the answers @georgematheos and @ztangent.
Making `pixel_kernel` static helps quite a bit:

```julia
@time trace = simulate(screen_model, (size(tmp),))
  2.841029 seconds (6.95 M allocations: 1.809 GiB, 62.85% gc time)
```
Output from `Profile.print(format=:tree)` here, and from `Profile.print(format=:flat, sortedby=:count)` here.
Nice idea with sampling the whole image from a multivariate normal! I tried it like this:
```julia
using Gen, SparseArrays

@gen function screen_model_dynamic2(size::Tuple{Int64,Int64})
    nrows, ncols = size
    screenshot = Array{Float64}(undef, nrows, ncols)
    tree = @trace(generate_segments(Rectangle(Pixel(1, 1), Pixel(nrows, ncols))), :tree)
    noise = @trace(gamma(1, 1), :noise)
    is, js = get_index_matrices(screenshot)
    mean_img = map((i, j) -> get_value_at(i, j, tree), is, js)
    # length, not sizeof: sizeof counts bytes, which would make the matrix 8x too large
    n = length(mean_img)
    cov_mat = sparse(1:n, 1:n, fill(noise, n))
    screenshot = @trace(mvnormal(vec(mean_img), cov_mat), :img)
    return reshape(screenshot, size)
end
```
I was hoping to avoid memory issues by making the covariance matrix sparse, but I guess `mvnormal` makes it dense, so I get an `OutOfMemoryError`. Maybe writing an `mvnormal` variant for scaled identity covariance matrices could salvage the approach? Thanks again :)
Oops! Maybe try `broadcasted_normal`, which does exactly what you asked! It's not well documented, unfortunately:
https://github.com/probcomp/Gen.jl/blob/master/src/modeling_library/distributions/normal.jl
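In your model that should let you drop the covariance matrix entirely. An untested sketch: note that `broadcasted_normal` is parameterized by the standard deviation, so use `sqrt(noise)` if `noise` is meant to be a variance:

```julia
# Elementwise normal over the whole image as a single array-valued choice;
# the std array broadcasts against the mean array, so no covariance matrix
# is ever built. Base.size avoids the `size` tuple argument shadowing the
# built-in inside your model body.
std_img = fill(sqrt(noise), Base.size(mean_img))
screenshot = @trace(broadcasted_normal(mean_img, std_img), :img)
```

Since the result already has the shape of `mean_img`, the final `reshape` may also be unnecessary.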
Fantastic, that works great!
```julia
@time trace = simulate(screen_model_dynamic2, (size(tmp),))
  0.282431 seconds (4.89 M allocations: 222.771 MiB, 11.58% gc time)
```
Thanks a lot!
Hello,
Thank you for writing this amazing PPL! I'm trying to implement a 2D version of the non-parametric change-point model from the Introduction to Modeling tutorial. The idea is to sample a tree structure that recursively divides an image into rectangles, and then sample pixel values for the whole image given the structure.
The tree-generating function is fast enough:
However, the image-sampling one is slow and uses a lot of memory:
Hoping to speed it up and also make inference more efficient down the line, I tried using the Map combinator to express the independence of the pixels given the structure, together with the static modeling language:
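The structure looks roughly like this (a simplified sketch with a hypothetical kernel; my real kernel looks up the per-pixel mean from the sampled tree):

```julia
using Gen

# Simplified stand-in for my actual per-pixel kernel.
@gen (static) function pixel_kernel(mean::Float64, noise::Float64)
    pixel = @trace(normal(mean, noise), :pixel)
    return pixel
end

# Map applies the kernel independently across vectors of arguments.
pixel_map = Map(pixel_kernel)

@gen (static) function screen_model_static(means::Vector{Float64}, noise::Float64)
    n = length(means)
    stds = fill(noise, n)
    img = @trace(pixel_map(means, stds), :img)
    return img
end

@load_generated_functions()
```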
This turns out to require more time and memory, not less:
Profiling shows that a lot of time is spent on garbage collection and dynamic dispatch. I'm new to both Gen and Julia and I don't really know how to improve those. Perhaps it's related to issue #173? Hopefully I'm just doing something wrong :) Here's the full code.
Any help or comments are much appreciated!