epiverse-trace / finalsize

R package to calculate the final size of an SIR epidemic in populations with heterogeneity in social contacts and disease susceptibility
https://epiverse-trace.github.io/finalsize/
Other
11 stars 7 forks source link

Simplify and improve performance of iterative solver #63

Closed Bisaloo closed 2 years ago

Bisaloo commented 2 years ago

This PR gathers some minor simplifications and performance improvements to the iterative solver.

@pratikunterwegs, as the most knowledgeable person about this code, I'm letting you choose who should be the 2nd reviewer besides yourself :slightly_smiling_face:.

I hope I didn't break anything :crossed_fingers:

pratikunterwegs commented 2 years ago

Thanks for looking into this @Bisaloo - I'll ask @TimTaylor as he's added comments in the past. You're right to identify that the code can probably be optimised, as it's really heavily mirroring the C++ version. The tests aren't quite passing but I'll take a look in a bit and check whether I can figure it out

TimTaylor commented 2 years ago

give me a poke when the code is working again and I'll take a pass over

Bisaloo commented 2 years ago

I fixed the obvious error. There is another test failing because of matrix sizes:

Error in contact_matrix * demography_vector %o% susceptibility: non-conformable arrays

But I believe it would be more robust to address this in epi_spread. I'm not entirely sure Kronecker and recast to matrix was behaving as expect here. What do you think?

pratikunterwegs commented 2 years ago

Error in contact_matrix * demography_vector %o% susceptibility: non-conformable arrays

It's the %o% product which produces an array and causes this issue.

I'm not entirely sure Kronecker and recast to matrix was behaving as expect here. What do you think?

if you could essentially cbind these arrays together, and then multiply element for element with the expanded contact_matrix, that's what we want, if I understand correctly.

I'm fine with moving this to epi_spread.R as well.

Bisaloo commented 2 years ago

While looking into it more precisely, the size issue comes from the fact that in this specific test, susceptibility is a column vector instead of a standard vector.

https://github.com/epiverse-trace/finalsize/blob/9020b5affbbf256fb02649936b40e3ea0e8c7c44/tests/testthat/test-iterative_solver_vary_r0.R#L194

Is that really something we want to allow?

TimTaylor commented 2 years ago

EDIT: comments overlapped with Hugo's above whilst posting. No view on what's correct or not but the issue can be seen in the dummy data by setting s to a matrix (as in the code)

x <- matrix(runif(100), nrow = 10)
s <- runif(10)   # change this to match code
s <- matrix(s, ncol = length(s)
d <- runif(10)

I guess that the need for 1d matrix could actually be removed and would speedup the code a little too. Can take a further look when I review.

pratikunterwegs commented 2 years ago

While looking into it more precisely, the size issue comes from the fact that in this specific test, susceptibility is a column vector instead of a standard vector.

https://github.com/epiverse-trace/finalsize/blob/9020b5affbbf256fb02649936b40e3ea0e8c7c44/tests/testthat/test-iterative_solver_vary_r0.R#L194

Is that really something we want to allow?

susceptibility needs to be a matrix when there are multiple risk groups, so it should be possible to pass a one-column matrix.

The error here was actually that epi_spread was not being called before passing the spread data to the solver in this one test (r0 = 12). So susceptibility was not being converted into a vector (which epi_spread does). That's my mistake, I'll suggest that as an edit, should fix the problem.

Bisaloo commented 2 years ago

Maybe one last comment, but I don't feel super strongly:

The initial test failures stem from the fact that I find the zeros var somewhat counterintuitive. I think it would make more sense for zeros to really be 0 when you want the finalsize of this group to be 0. This would prevent us from having to use (1 - zeros) each time. I can understand the current approach if you think of 0 and 1 as FALSE and TRUE though but it would not be my first intuition :shrug:

pratikunterwegs commented 2 years ago

Maybe one last comment, but I don't feel super strongly:

The initial test failures stem from the fact that I find the zeros var somewhat counterintuitive. I think it would make more sense for zeros to really be 0 when you want the finalsize of this group to be 0. This would prevent us from having to use (1 - zeros) each time. I can understand the current approach if you think of 0 and 1 as FALSE and TRUE though but it would not be my first intuition 🤷

I'm not tied to this, and honestly not clear what the original intent was - happy for it to be changed.

TimTaylor commented 2 years ago

I've not looked at the outer v kronecker code (https://github.com/epiverse-trace/finalsize/commit/21afdf47b25389f8cccd8113362fc2fa5ee25a71) but the rest looks ok and the tests pass for me locally so will approve (may be worth a separate issue to assure yourselves on this element of the code?). A couple of stylistic points but can raise them separately as well.

pratikunterwegs commented 2 years ago

Merged this after checks passed, thanks again @Bisaloo