I just did a quick test - it's actually not that much more efficient, but probably still worth doing it.
My guess is that this is because adding agents to the population (an Array used as a list) is not that costly, at least at the current simulation size (even though it is obviously not the best solution).
So far, the initial use of iterators did not (significantly) improve performance compared with using an array of references, e.g.
`selectedpeople = [person for person in people if criteria(person)]`
On the other hand, using iterators hinders the use of basic routines like `length`, for which an imitation (e.g. using counters) or a re-implementation does not look good, and I guess this may even slow performance down a bit by hindering common low-level for-loop optimisation techniques.
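For illustration, a minimal sketch of the two approaches being compared, using the `people`/`criteria` names from the example above (the comments are just illustrative):

```julia
# Eager: copy references of the selected agents into a temporary vector.
selectedpeople = [person for person in people if criteria(person)]
n = length(selectedpeople)              # works, O(1)

# Lazy: no temporary array is allocated ...
selectedpeople = Iterators.filter(criteria, people)
# ... but length(selectedpeople) now throws a MethodError,
# since the number of elements is only known after iterating.
```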
The `iterate` function of `Iterators.filter` is tiny (9 lines) and type-stable, so the chances are good that the compiler will just inline it. So, it's quite possible that the code that the compiler produces is essentially equivalent to:
```julia
for el in list
    if !pred(el)
        continue
    end
    # ...
    # user code
    # ...
end
```
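For reference, that expanded loop corresponds to a lazy-filter loop over the same `pred` and `list`:

```julia
for el in Iterators.filter(pred, list)
    # ...
    # user code
    # ...
end
```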
Counting the elements is as easy as `count(x->true, list)`
(or you do it while iterating anyway).
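For example (with `pred` and `list` as above), counting the elements of a lazy filter in a separate pass looks like this:

```julia
itr = Iterators.filter(pred, list)
n = count(x -> true, itr)   # length(itr) would throw; count iterates once
```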
So, at least for one-shot iterations, a lazy filter should always be at least as efficient as copying the required elements.
Hmm, I might actually be wrong (link). Anyway, for the simulation functions not much changes, so I guess we can do some systematic tests once we have the full simulation.
In conclusion, the use of lazy iterators does not pay off, given that it means giving up efficient O(1) high-level functions like `length`.
> In conclusion, the use of lazy iterators does not pay off, given that it means giving up efficient O(1) high-level functions like `length`.
Well, I think it's not that obvious either. I also just realised that I didn't read the code examples in the link above properly. They don't show what I thought they showed (the slowdown the OP experienced had a different reason, and one of the examples actually shows that the lazy filter is faster).
With a temporary copy you first have one iteration to copy the relevant elements, so you have N comparisons + log2(n) allocations (from growing the array) + n copies (which are cheap, though, as only pointers need to be copied), plus another n iterations (through a contiguous array, though) to process the selected elements, where N is the size of the original array and n the number of selected elements.
With a lazy iterator you likewise iterate once with N comparisons, plus do n processing steps. If you count separately, you have another N comparisons, which might well be more expensive than the allocations plus n copies of the other approach. If you count inline (during processing), however, or if your selected elements make up a large proportion of the original array (and your comparison is not too expensive), or even just if your arrays are large, it might be the other way around.
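A minimal sketch of the "count inline" variant, which avoids the second pass of N comparisons (the function name `process_and_count` is hypothetical):

```julia
function process_and_count(pred, list)
    n = 0
    for el in Iterators.filter(pred, list)   # single pass: N comparisons
        n += 1
        # ... process el ...
    end
    return n    # number of selected elements, without an extra pass
end
```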
If I understand both concepts correctly, techniques such as pipelining and low-level for-loop optimisations should be more applicable when iterating over a copied array of references, where there is no `if` and no branching in the loop.
The other thing is, I am suspicious of hidden Julia optimisations that either somehow reuse the memory already allocated in the previous iteration, or apply special optimisation techniques to generators, aggregations, etc. This doubt comes from seeing that memory allocation / memory usage did not change in most cases. As I remember, there was one case where I noticed slightly higher memory usage for the array of references.
I think we just have to do some proper benchmarking/profiling at some point.
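As a starting point, a benchmarking sketch along these lines (using BenchmarkTools; the agent type, criterion and sizes below are placeholders, not the actual simulation code):

```julia
using BenchmarkTools

# Placeholders standing in for the real agent type and selection criterion:
mutable struct Person
    age::Int
end
criteria(p::Person) = p.age >= 18

people = [Person(rand(1:100)) for _ in 1:100_000]

# Eager: copy references of the selected agents, then iterate the copy.
function eager_run(people)
    selected = [p for p in people if criteria(p)]
    s = 0
    for p in selected
        s += p.age          # stand-in for the real per-agent work
    end
    return s
end

# Lazy: iterate the filter directly, no temporary array.
function lazy_run(people)
    s = 0
    for p in Iterators.filter(criteria, people)
        s += p.age          # stand-in for the real per-agent work
    end
    return s
end

@btime eager_run($people)   # reports time plus the allocations of the copy
@btime lazy_run($people)    # typically reports (close to) zero allocations
```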
When selecting a subset of agents for subsequent iteration, it is much more efficient to use lazy iterators (see `Main.jl:run!`) than to create a new vector containing the selected agents.