Closed Martin-Rey closed 4 years ago
@Martin-Rey I think I have fixed this in PR #128. Please confirm.
For information, there is also a bug in your test. You cannot take for granted that latest_ids[20]
refers in fact to halo 21, because if there is no result for latest()
then calculate_all
will eliminate the row.
For a robust test you should use
halo_nums, latest_nums = timestep.calculate_all("halo_number()", "latest().halo_number()")
then find halo 21 in halo_nums
and look up its corresponding value in latest_nums
. However in this particular case the bug in your test is theoretical -- it should have behaved as you expected, and in PR #128 I believe it will do.
Hi @apontzen. I can confirm that #128 fixes the problems I was seeing
Thanks for pointing out the bug in the test. Since all the halos in this database are linked to z=0, they all have a latest()
and I didn't think about the case when they wouldn't.
I'll close this issue, thanks for the quick work!
I am re-opening this issue, because latest()
calculated at the timestep level is now correct as far as I can tell, but with extremely degraded performance. It takes ~10min to compute latest()
for all halos in the example database I provided, compared to the few seconds it used to take.
I wanted to check whether this a necessary downside of having the new, more robust implementation?
Martin
I don't see this performance issue (would have noticed if it was taking 10 minutes!) - could you provide details?
Ok, so the main performance issue is probably that I am not trimming my merger trees at all in my local configuration:
mergertree_min_fractional_NDM = 0.0 # as a fraction of the most massive halo at each timestep - set to zero for no thinning
which probably amplifies any difference in performance.
However, here is a timing of latest()
from the latest commit on the current master branch, with the same (non)-trimming setup
In [1]: import timeit
In [2]: timestep = tangos.get_timestep("Halo599_DMO/output_00050")
In [3]: start_time = time.time()
...: timestep.calculate_all("latest()")
...: print("--- %s seconds ---" % (time.time() - start_time))
--- 604.0164349079132 seconds ---
And at commit b2d9b602164e49346856f34cdfa588e4114d42d1, i.e. before PR #128 but after the beginning of tinkering with earlier()
and later()
:
In [1]: import time
In [2]: timestep = tangos.get_timestep("Halo599_DMO/output_00050")
In [3]: start_time = time.time()
...: timestep.calculate_all("latest()")
...: print("--- %s seconds ---" % (time.time() - start_time))
--- 244.97739696502686 seconds ---
And finally at commit a8672ecb66d8f07bc0b50bea7eb1e3b0d26f9f7f, before PR #128 and PR #126
In [1]: import time
In [2]: timestep = tangos.get_timestep("Halo599_DMO/output_00050")
In [3]: start_time = time.time()
...: timestep.calculate_all("latest()")
...: print("--- %s seconds ---" % (time.time() - start_time))
--- 7.447889804840088 seconds ---
Thanks. #129 should significantly improve the performance again, through some quick wins. It probably will not return to the performance of https://github.com/pynbody/tangos/commit/a8672ecb66d8f07bc0b50bea7eb1e3b0d26f9f7f since the algorithm is in fact better and searches more of the tree. But it should be much better than the other two point checks you gave.
If you confirm, I will merge and push this, and also upload to pip since it's a very significant enhancement.
Hi Andrew, thanks for the quick work. I can confirm that the performance on #129 is much better for the same snippet of code.
In [1]: import timeit
In [2]: timestep = tangos.get_timestep("Halo599_DMO/output_00050")
In [3]: start_time = time.time()
...: timestep.calculate_all("latest()")
...: print("--- %s seconds ---" % (time.time() - start_time))
--- 26.69350028038025 seconds ---
As you mention, the performance is slightly worse than before #128 and #126, but it is back to very acceptable levels. I am very happy for you to merge #129, close this issue and push to pip whenever practical!
Hi all,
This is potentially a follow-up to #123, or might be an unrelated issue that I hadn't noticed before.
I am obtaining different results when using
latest()
for all halos in a timestep withcalculate_all
, compared to using it for each individual halo in turn. Most of the results are consistent between these two use cases, but I am attaching an example where they are not:Calculating
latest()
from Halo 21 at snapshot 95 returns a link to Halo 1 at z=0 (Out[5], which is correct), while it returns a link to Halo 26 when calculating it at the timestep level (Out[6]). I noticed this problem first while browsing the web interface, when clicking onto a halo from the timestep page would change the result fromlatest()
.The issue is not specific to this halo, I can reproduce it with other examples. Most examples that I have found failing are halos that merge with something else before z=0 and "disappear" into another halo. I can't spot anything obviously wrong with the database, and as an example, I checked that there are no links existing at any point of their history between my example Halo 21 and Halo 26 (the result from
latest()
at the timestep level).My guess is a difference in behaviour between the
MultiHop
strategy used for a single start point, and theMultiSource
strategy used for multiples halos at once, but I have found it very hard to pinpoint it more precisely.Any ideas or inputs? I can provide the database for diagnosis if needed.
Martin