FrederikSchaff opened this issue 6 years ago
I understand your point. It can be useful in certain situations to be able to randomize the order of evaluation of variables, when more than one sequence is available. But there is a big question to answer first: are the possible alternative computation sequences equivalent (in terms of representing the same model)? The general answer is no. Quite frequently, even a deterministic model (no stochastic components) produces different results if computed using different valid sequences. Currently, LSD picks the first "route" it finds and keeps it. Is this the "best" one? We don't know, but there is a simple strategy in place (see below) and one consistent result. Which are really the alternative routes? It depends heavily on the model structure itself. How could you reproduce your results or perform a Monte Carlo experiment in this situation? Imagine tracking a bug when every execution can take a different route, or trying to understand a complex emergent result.

Last, but pragmatically most important, this would require a complete redesign of the LSD scheduler. Currently, we use a recursive scheduler: we pick the first variable of the topmost object and compute it entirely, including any dependencies it may have. Next, the second topmost variable is computed, if it was not already evaluated when the first one was calculated. And so on, until the last variable of the last object at the lowest level is updated, either directly (in the top-down order) or not (previously computed by an upstream update). In a typical model, more than 80% of the variables are updated by recursion, on the fly, and that order cannot be controlled programmatically. The other 20% are more likely statistics-gathering variables, for which the computation order is not relevant anyway (but which cannot be computed before their dependencies).

For all those reasons, I guess such a change may be of limited value despite incurring significant coding and testing demands. On the other hand, designing the relevant (as per the model logic) multi-instance variables to be computed in random order is relatively simple to implement. If you have a particular situation in mind, let us know so we can suggest an approach. Most of the time, randomly sorting the desired object chain every time step would do (see the sketch below).

As a side note, maybe just allowing the parallel computation of the desired multi-instance variable will produce the desired results in some situations. When computing these variables in parallel, there is no strictly identical updating sequence every time step.
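To illustrate that last suggestion, here is a minimal sketch of randomly reordering an object chain before cycling over it each time step. It is plain C++, not the actual LSD macros; the `object` class, its `update()` method, and `random_order_cycle()` are hypothetical stand-ins for how a model would visit its instances:

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Hypothetical stand-in for an LSD object instance.
struct object
{
    void update() {}   // stand-in for computing the instance's variable
};

// Shuffle the instances once per time step, then cycle in the shuffled order.
// In an actual LSD model this would be done over the chain of object instances
// normally visited by a CYCLE, using the random generator already seeded by
// the simulation so that runs stay reproducible.
void random_order_cycle(std::vector<object*> &chain, std::mt19937 &rng)
{
    std::shuffle(chain.begin(), chain.end(), rng);
    for (object *obj : chain)
        obj->update();
}
```

Because only a vector of pointers is permuted, the original object chain is untouched, and drawing the permutation from the simulation's own seeded generator keeps runs reproducible.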
I will try to respond to all issues raised.
The biggest points are the pragmatic issues. I think it is possible to find a solution. A first, simple help would be a random-order (and cycle-safe) cycle. I have implemented something like this in the GIS enhancement I am working on and could implement it right away. This would make manual control easier. But if you agree, I could also think about a complete randomisation of the default updating scheme (with a switch to turn it on/off).
I am sure that if we want to get more people using LSD, this is a crucial point.
Hi Frederik. Thanks for the detailed analysis.
I'm still not convinced and, anyway, I don't have the time to invest in this direction for now, as it is a significant effort. However, feel free to adapt LSD to your needs. If you come up with a solution that is end-user-ready, we'd be very happy to incorporate it into mainline LSD.
Just to give more basis to my impressions, here are some comments on your points (no answer is necessary).
In cases where current information is relevant for the process, the fixed-order updating in LSD is problematic. Currently this is circumvented by providing an explicit cycle over the objects and iterating over them in random order. It would be better if one could instead flag objects as "randomised" (or if this were the default). Also, a random-cycle macro would be useful.
One way to implement this could be to change the current updating scheme as follows: first, create a queue of all the variables that shall be updated; next, randomise it; then process it as usual (see the sketch below). Problems like coping with deleted objects would have to be taken care of. This could be done by combining it with the delete flag suggested in issue #9 ("Delete Self").
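A minimal sketch of that scheme, in plain C++ with hypothetical names (the real LSD scheduler stores variables and objects differently), assuming the delete flag from issue #9 is available on the owning object:

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Hypothetical stand-ins for LSD's internal structures.
struct object
{
    bool to_delete = false;            // delete flag, as proposed in issue #9
};

struct variable
{
    object *owner = nullptr;
    bool    done  = false;
    void compute() { done = true; }    // stub; the real scheduler would recurse into dependencies here
};

// One time step of a shuffled updating scheme: queue every variable that must
// be updated, randomise the queue, then process it as usual.  Variables whose
// owner is marked for deletion are skipped, and variables already computed
// earlier in the step are not recomputed.
void shuffled_update_step(std::vector<variable*> queue, std::mt19937 &rng)
{
    std::shuffle(queue.begin(), queue.end(), rng);
    for (variable *v : queue)
    {
        if (v->owner != nullptr && v->owner->to_delete)
            continue;                  // object scheduled for deletion: skip it
        if (!v->done)
            v->compute();
    }
}
```

If the permutation is drawn from the simulation's own seeded generator, runs with the same seed remain reproducible even though the updating order differs from the current fixed scheme.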