Closed Bisaloo closed 2 years ago
Thanks for looking into this @Bisaloo - I'll ask @TimTaylor as he's added comments in the past. You're right to identify that the code can probably be optimised, as it's really heavily mirroring the C++ version. The tests aren't quite passing but I'll take a look in a bit and check whether I can figure it out
give me a poke when the code is working again and I'll take a pass over
I fixed the obvious error. There is another test failing because of matrix sizes:
Error in
contact_matrix * demography_vector %o% susceptibility
: non-conformable arrays
But I believe it would be more robust to address this in epi_spread
. I'm not entirely sure Kronecker and recast to matrix was behaving as expect here. What do you think?
Error in contact_matrix * demography_vector %o% susceptibility: non-conformable arrays
It's the %o%
product which produces an array and causes this issue.
I'm not entirely sure Kronecker and recast to matrix was behaving as expect here. What do you think?
if you could essentially cbind
these arrays together, and then multiply element for element with the expanded contact_matrix
, that's what we want, if I understand correctly.
I'm fine with moving this to epi_spread.R
as well.
While looking into it more precisely, the size issue comes from the fact that in this specific test, susceptibility
is a column vector instead of a standard vector.
Is that really something we want to allow?
EDIT: comments overlapped with Hugo's above whilst posting. No view on what's correct or not but the issue can be seen in the dummy data by setting s to a matrix (as in the code)
x <- matrix(runif(100), nrow = 10)
s <- runif(10) # change this to match code
s <- matrix(s, ncol = length(s)
d <- runif(10)
I guess that the need for 1d matrix could actually be removed and would speedup the code a little too. Can take a further look when I review.
While looking into it more precisely, the size issue comes from the fact that in this specific test,
susceptibility
is a column vector instead of a standard vector.Is that really something we want to allow?
susceptibility
needs to be a matrix when there are multiple risk groups, so it should be possible to pass a one-column matrix.
The error here was actually that epi_spread
was not being called before passing the spread data to the solver in this one test (r0 = 12
). So susceptibility
was not being converted into a vector (which epi_spread
does). That's my mistake, I'll suggest that as an edit, should fix the problem.
Maybe one last comment, but I don't feel super strongly:
The initial test failures stem from the fact that I find the zeros
var somewhat counterintuitive. I think it would make more sense for zeros
to really be 0 when you want the finalsize of this group to be 0. This would prevent us from having to use (1 - zeros)
each time.
I can understand the current approach if you think of 0
and 1
as FALSE
and TRUE
though but it would not be my first intuition :shrug:
Maybe one last comment, but I don't feel super strongly:
The initial test failures stem from the fact that I find the
zeros
var somewhat counterintuitive. I think it would make more sense forzeros
to really be 0 when you want the finalsize of this group to be 0. This would prevent us from having to use(1 - zeros)
each time. I can understand the current approach if you think of0
and1
asFALSE
andTRUE
though but it would not be my first intuition 🤷
I'm not tied to this, and honestly not clear what the original intent was - happy for it to be changed.
I've not looked at the outer v kronecker code (https://github.com/epiverse-trace/finalsize/commit/21afdf47b25389f8cccd8113362fc2fa5ee25a71) but the rest looks ok and the tests pass for me locally so will approve (may be worth a separate issue to assure yourselves on this element of the code?). A couple of stylistic points but can raise them separately as well.
Merged this after checks passed, thanks again @Bisaloo
This PR gathers some minor simplifications and performance improvements to the iterative solver.
21afdf47b25389f8cccd8113362fc2fa5ee25a71: I use the outer product instead of the Kronecker product. This removes the need to cast as a vector and then recast a matrix, with the associated risks of filling by rows instead of cols, etc. It is also much more efficient:
Created on 2022-10-05 by the reprex package (v2.0.1.9000)
60786a152a62fbb6319e4a5fd4971f8fc288b4a1: I think this a breadcrumb for a previous version of the code but a copy doesn't seem necessary, neither for technical or readability reasons. Correct?
6113d981b1e547bf4c1b3463e3ef475ae83ef3de: Since arithmetic operations on matrices and vectors are usually highly optimized at the hardware level, performance is usually much better than with
ifelse()
:Created on 2022-10-05 by the reprex package (v2.0.1.9000)
f86ec30f9365703b83f59e494e5879438491aa95: I am kind of unsure of what is the correct thing here. It seems strange that there was an unused argument. I simply removed it but it might be the sign that there is actually a bug somewhere in the code.
77c516f7819cadd4d196f12f8b12c9d71fcbebb8: Mostly for readability, to make it clear that we're setting to 0 the same elements on rows and cols.
@pratikunterwegs, as the most knowledgeable person about this code, I'm letting you choose who should be the 2nd reviewer besides yourself :slightly_smiling_face:.
I hope I didn't break anything :crossed_fingers: