Open alchzh opened 1 year ago
All of the function names here probably need some adjusting.
Thanks. This is interesting and your implementation is very nicely done!
When I run into a similar issue, I just do something like:
while sim.t < end:
sim.integrate(sim.t + Delta_t)
output()
or
while sim.t < end:
sim.steps(N_steps)
output()
Your solution with callback functions is definitely more elegant. However, I think one downside is that it requires the user to understand more API calls, how python decorators work, etc. For example, I'd need to read the documentation to find out if the heartbeat gets called at the precise intervals that I'm giving it or wether it could be before or after, depending on where the timesteps fall. Similarly, what is the expected behaviour if the interval is shorter than a timestep? Also, it's not clear to me your implementation works for negative timesteps (integrating back in time). Finally, it's not clear to me what the expected behaviour is when using is_dt_multiple
together with adaptive timesteps, or if that even should be allowed.
In short, there are many complication and there is a question of whether the increased complexity outweighs any potential benefits.
Let me think about it for a while!
I didn't realize negative timestepping was officially supported! I was just using -1.0f as a magic number meaning "don't reb_output_check
". Maybe an infinity value or NaN is a suitable magic number instead?
I agree there are plenty of complications arising from exact intervals, interval less than timestep, etc. However, these are already the current practice of
void heartbeat(struct reb_simulation* r) {
if (reb_output_check(r, <interval>)) {
do_something();
}
}
The new changes exactly copies that behavior and let's you do it from Python performantly (whether that's ideal is another question). The old r->heartbeat = heartbeat
still works as before for C functions. My main focus is on Python.
I think heartbeat callbacks have an advantage over the manual integration loop when you have different things you want to do at different intervals, which comes up a lot examples/
but is harder to do manually. Also, the manual integration loop changes results when exact_finish_time
is on, which is another gotcha that users might not notice. I think it's a good convention that integration defaults to exact time, but callbacks for plotting, saving the simulation, outputting performance, etc. don't affect things.
I agree that the current interface with the CFUNCTYPE wrapping decorator is not the best. I will modify add_heartbeat
so it automatically converts Python functions when it gets them. I wasn't sure what the best way to detect python callable vs C function pointer was, but I'm sure it's not hard. Then sim.add_heartbeat(function, interval)
will work without any sugar.
I've changed add_heartbeat
so it takes either a C function pointer or Python function without any fuss. Negative timesteps and intervals should now work (I'm using NaNs as magic values, not sure if this is good practice). I also changed the setter for additional_forces
to do the pointer unwrapping since requiring sim_pointer.contents
is unintuitive.
Finally, it's not clear to me what the expected behaviour is when using is_dt_multiple together with adaptive timesteps, or if that even should be allowed.
It probably doesn't make much sense, but I was trying to replicate the existing example behavior (i.e. examples/drag_force
with IAS15).
I think what we should actually do is have a separate output checking function that looks at reb_simulation.steps_done
so we can actually inspect the simulation every N timesteps (which is valuable for adaptive timesteps!)
Wow. That was fast. Please slow down a little! I understand that it doesn't break any existing code or functionality, but I'm still not sure whether the increased complexity is worth it!
Well, going over it again, I like this new version much better! A few questions:
AFF()
. If we just keep the additional forces routine the way is was, what feature would we be missing?If the above is all resolved, I'm happy to merge it in. Let's not add it to the documentation / examples. I like to keep those as simple as possible and I think the very basic syntax I mentioned earlier will be sufficient for most users.
Wow. That was fast. Please slow down a little!
Sorry! The semester just started for me so I have nothing better to do...
Can we just getting rid of the decorator way of adding a heartbeat function. I get that decorators are a nice python feature, but no other part of REBOUND's user interface makes use of it. Having two different ways of doing the same thing makes the maintenance twice as cumbersome in the long term.
Yeah, I will remove it.
Because I don't understand all the details of ctypes function pointers, I am a little worried about changing anything. I remember it took us ages to fix an annoying bug where the problem was that we didn't keep the reference to what is now called AFF(). If we just keep the additional forces routine the way is was, what feature would we be missing?
The reference issue should be ok since I'm still keeping a reference in self._afp
as before. The new routine with deref_arg
(that's also used in add_heartbeat
) allows passing functions like
def fn(sim: Simulation):
do_something_with(sim);
rather than
def fn(sim_pointer: POINTER(Simulation)):
do_something_with(sim.contents);
as before. I think requiring users to use .contents
themselves is unintuitive, especially since there's nowhere else they have to interface with ctypes themselves. I think it's better for the python API to not expose any ctypes interaction publicly (i.e. Python functions should work on Python objects). There's nothing you can do with the POINTER in python except immediately dereference it anyway (or pass it to a C function, but that is more clear with an explicit byref
call).
I see. I agree, not having the .contents
syntax would be nice. But I still don't understand the details. If you dereference it, can you still do things like change dt
in the sim
object? Will it change it in the actual simulation or in a copy?
I believe part of the issue with the reference was that we needed to keep the reference to the function AFF = CFUNTYPE(...
itself (not to the function pointer AFF
returns). @dtamayo do you remember the details here?
In other words, what does this output:
import rebound
sim = rebound.Simulation()
sim.integrator = "leapfrog"
sim.add()
sim.dt = 1
def hb(simp):
simp.contents.dt = 2
sim.heartbeat = hb
sim.integrate(3,exact_finish_time=0)
print(sim.dt)
I see. I agree, not having the
.contents
syntax would be nice. But I still don't understand the details. If you dereference it, can you still do things like changedt
in thesim
object? Will it change it in the actual simulation or in a copy?
The dereferencing with .contents
in Python doesn't create a copy anywhere, it's access-by-reference. Python doesn't differentiate between sim_pointer.contents.dt = ...
and fn(sim_pointer.contents)
where fn modifies it, since the setter can't see up the dot chain it's called from.
I believe part of the issue with the reference was that we needed to keep the reference to the function
AFF = CFUNTYPE(...
itself (not to the function pointerAFF
returns). @dtamayo do you remember the details here?
Hmmm... I'd be interested if that was the case. ctypes
internally holds a reference to every type you create so I'm not sure it can ever be garbage collected (and I don't think a type can be GCd while an instance of it exists? AFF
is just a dynamically created type). https://github.com/python/cpython/blob/3.11/Lib/ctypes/__init__.py#L71
In other words, what does this output: ...
2.0
Great. That is a much nice syntax to have (which we need to update in the documentation and examples).
I agree that it looks like ctypes is keeping a reference. For some reason we decided 7 years ago that we need to move AFF (and other variables) outside the class definition (see 4ced0c257). Let's see if @dtamayo has a better memory.
Oof I really don't remember the details. But I could swear I remember this causing a problem, and trying to keep track of references to make a minimal working example that showed the issue. I was going to put it into that 2015 PyCon talk but I just looked at it and I think it was too technical and didn't make it into the talk. Sorry!
Well, it seems to work and I think it should work. So maybe we were just a bit naive and didn't really understand where the problem was coming from back then. I'd be willing to take the risk and use the nice syntax without the pointers that @alchzh has suggested.
We've dropped the ball on this one. I'm happy to think about it a bit more - just wary of breaking things if I don't fully understand the memory management stuff in python...
We've dropped the ball on this one. I'm happy to think about it a bit more - just wary of breaking things if I don't fully understand the memory management stuff in python...
I can rewrite this PR to not change the pointer access semantics and just implement the heartbeat stuff- I can't remember exactly from a year ago but it should work with the older style anyway...
Hm - I'd actually be more keen to merge in the pointer access semantics than the multiple heartbeat changes!
Current, rebound only accepts a single function pointer as a heartbeat callback. The current interface makes simulations using Python heartbeats very slow since they incur significant overhead on every timestep.
Most heartbeat functions are called either on every timestep, after a certain interval (using
reb_output_check
), or after a certain multiple of the simulation dt. In the latter two, I think Python heartbeat callbacks should be accepted.The solution is allowing multiple heartbeat listeners functions to be added and calling them automatically inside a
reb_output_check
on a specified interval. This avoids incurring Python call overhead on every timestep, and also creates what I think to be a nicer modular interface for heartbeats.TODO: create examples and docs