wehlutyk closed 6 years ago
The previous tensordot implementation works with n_ξ_samples = 1 and OOMs with n_ξ_samples = 2. It turns out the same happens for the new whileloop implementation, which might not be that much better then. Closing this as done still.
Correction about the time performance. On my laptop (no GPU), with a super simple 50-node network, we get:
whileloop → 113.64 it/s
tensordot → 90.91 it/s
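To put the two rates above in perspective, here is a quick back-of-the-envelope conversion to time per iteration and relative speedup. It only uses the two figures quoted in this comment; the variable names are made up for illustration.

```python
# Rates quoted above, in iterations per second (laptop, no GPU, 50-node network).
whileloop_its = 113.64
tensordot_its = 90.91

# Convert to milliseconds per iteration.
ms_whileloop = 1000 / whileloop_its
ms_tensordot = 1000 / tensordot_its

# Relative throughput of whileloop over tensordot.
speedup = whileloop_its / tensordot_its

print(f"{ms_whileloop:.2f} ms vs {ms_tensordot:.2f} ms per iteration, {speedup:.2f}x")
# prints: 8.80 ms vs 11.00 ms per iteration, 1.25x
```

So on this small network the whileloop version is roughly 25% faster per iteration.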
Second correction. Still on my laptop, with a 20 x 20 planted partition network:
On grunch:
So it is indeed a great optimisation gain when using larger n_ξ_samples :)
Implementation done in 48b378fcc552b98b8c943238ced352986a7e2484 and 3591b279b7fac8f93cdab563a416412bfc60f4f4.
In small tests there doesn't seem to be any speedup, but the memory consumed should be less: we just need to check that we can now run BlogCatalog on grunch with n_ξ_samples = 5 without blowing up the memory. (TBD when I recover my home folder on grunch...)
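For readers wondering why a loop could use less memory than a single tensordot, here is a minimal NumPy sketch of the general trade-off. It is not the code from the commits above: the shapes, the averaging over samples, and the function names `batched_mean` and `looped_mean` are all assumptions made for illustration. The batched version materialises one output per ξ sample at once, so its intermediate grows with the number of samples; the looped version keeps a single output and a running sum.

```python
import numpy as np

def batched_mean(W, xi):
    """Contract all samples at once (hypothetical stand-in for the tensordot approach)."""
    # xi: (n_samples, d), W: (n, d) -> intermediate of shape (n_samples, n)
    outputs = np.tensordot(xi, W, axes=([1], [1]))
    return outputs.mean(axis=0), outputs.size

def looped_mean(W, xi):
    """Process one sample at a time (hypothetical stand-in for the loop approach)."""
    total = np.zeros(W.shape[0])
    peak = 0
    for sample in xi:
        out = W @ sample            # only one (n,) intermediate is live at a time
        peak = max(peak, out.size)
        total += out
    return total / len(xi), peak

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))         # toy sizes, chosen arbitrarily
xi = rng.normal(size=(5, 3))        # 5 samples, like n_ξ_samples = 5

batched, batched_live = batched_mean(W, xi)
looped, looped_live = looped_mean(W, xi)

assert np.allclose(batched, looped)  # same result either way
print(batched_live, looped_live)     # prints: 20 4
```

In this toy setting the batched intermediate holds n_samples times as many elements as the looped one, which is the kind of growth that would explain running out of memory as n_ξ_samples increases. Whether the real implementations behave this way depends on details not shown in this thread.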