MRC-CSO-SPHSU / LoneParentsModel.jl

An initial implementation of an ABM for social and child care

use (lazy) iterators instead of new vectors for subsetting #65

Closed mhinsch closed 1 year ago

mhinsch commented 2 years ago

When selecting a subset of agents for subsequent iteration it is much more efficient to use lazy iterators (see Main.jl:run!) than to create a new vector containing the selected agents.
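Schematically, the two approaches compare as below (a minimal sketch; Person, isalive and step! are hypothetical stand-ins for illustration, not the model's actual types):

# Hypothetical stand-ins for an agent type, a selection criterion and an update.
mutable struct Person
    alive::Bool
    age::Int
end

isalive(p::Person) = p.alive
step!(p::Person) = (p.age += 1)

people = [Person(rand(Bool), 0) for _ in 1:1000]

# Eager: allocate a new vector of the selected agents, then iterate it.
selected = [p for p in people if isalive(p)]
foreach(step!, selected)

# Lazy: no intermediate vector; agents are filtered on the fly.
foreach(step!, Iterators.filter(isalive, people))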

mhinsch commented 2 years ago

I just did a quick test - it's actually not that much more efficient, but probably still worth doing.

AtiyahElsheikh commented 2 years ago

I can guess why: adding agents to the population (an Array used as a list) is not that bad, at least at the current simulation size (even if it is obviously not the best solution).

AtiyahElsheikh commented 2 years ago

Initial usage of iterators has (so far) not significantly improved performance compared with using an array of references, e.g.

selectedpeople = [person for person in people if criteria(person)]

On the other hand, using iterators hinders the use of basic routines like length etc., for which an imitation (e.g. using counters) or a reimplementation does not look good, and I guess this may even slow performance down a bit by hindering common low-level for-loop optimisation techniques.
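For illustration, Iterators.filter indeed returns an iterator of unknown size, so length is not available and counting requires a pass over the data:

itr = Iterators.filter(iseven, 1:10)

# length(itr) would throw a MethodError here: the size of a filtered
# iterator is unknown until it has been consumed.
count(_ -> true, itr)   # 5, at the cost of a full pass over 1:10
count(iseven, 1:10)     # 5, counting directly on the source instead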

mhinsch commented 2 years ago

The iterate function of Iterators.filter is tiny (9 lines) and type stable, so the chances are good that the compiler will just inline it. It is therefore quite possible that the generated code is essentially equivalent to:

for el in list
   if !pred(el)
      continue
   end
   # ...
   # user code
   # ...
end

Counting the elements is as easy as count(x->true, list) (or you do it while iterating anyway).

So, at least for one-shot iterations lazy filter should always be at least as efficient as copying the required elements.
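For the one-shot case, counting inline might look like this (a sketch reusing the hypothetical isalive and step! from above; step_alive! is likewise made up):

function step_alive!(people)
    n = 0
    for person in Iterators.filter(isalive, people)
        step!(person)   # process the selected agent
        n += 1          # count during iteration, no second pass needed
    end
    return n            # number of agents processed
end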

mhinsch commented 2 years ago

Hmm, I might actually be wrong (link). Anyway, not much changes for the simulation functions, so I guess we can do some systematic tests once we have the full simulation.

AtiyahElsheikh commented 2 years ago

In conclusion, using lazy iterators does not pay off, as it means giving up efficient O(1) high-level functions like length.

mhinsch commented 2 years ago

In conclusion, using lazy iterators does not pay off, as it means giving up efficient O(1) high-level functions like length.

Well, I think it's not that obvious either. I also just realised that I didn't read the code examples in the link above properly. They don't show what I thought they showed (the slowdown the OP experienced had a different cause, and one of the examples actually shows that the lazy filter is faster).

With a temp copy you first have one iteration to copy the relevant elements. With N the size of the original array and n the number of selected elements, that is N comparisons + log2(n) allocations (from the geometric growth of the new vector) + n copies (which are cheap, though, as only pointers need to be copied), plus another n iterations (through a contiguous array, though) to process the selected elements.

With a lazy iterator you likewise iterate once with N comparisons, plus n processing steps. If you count separately you incur another N comparisons, which may well be more expensive than the allocations plus n copies of the other approach. If you count inline (during processing), however, or if the selected elements make up a large proportion of the original array (and the comparison is not too expensive), or even just if the arrays are large, it might be the other way around.
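To put rough numbers on it: with N = 100,000 agents of which n = 10,000 are selected, the temp copy costs 100,000 comparisons, about 14 reallocations, 10,000 pointer copies and 10,000 contiguous iterations, while the lazy iterator costs 100,000 comparisons plus 10,000 processing steps; a separate count then adds another 100,000 comparisons, which can easily outweigh what was saved by not copying.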

AtiyahElsheikh commented 2 years ago

If I understand both concepts correctly, then techniques like pipelining and low-level for-loop optimisations should apply better when iterating over a copied array of references, where no if / branching is involved.

The other thing is, I am suspicious that there are hidden Julia optimisations that either somehow reuse the memory already allocated in the previous iteration... This doubt comes from seeing that memory allocation / memory usage did not change in most cases. As I remember, there was one case where I noticed slightly higher memory usage for the array-of-references approach.

AtiyahElsheikh commented 2 years ago

... either that, or special optimisation techniques with generators, aggregations etc.

mhinsch commented 2 years ago

I think we just have to do some proper benchmarking/profiling at some point.
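A starting point for that comparison could look like the sketch below, using BenchmarkTools.jl (run_eager! and run_lazy! are made-up names, and Person, isalive and step! are the hypothetical stand-ins from the first sketch):

using BenchmarkTools

# Eager: temporary vector of references, then a pass over it.
function run_eager!(people)
    selected = [p for p in people if isalive(p)]
    foreach(step!, selected)
end

# Lazy: filter on the fly, single pass.
run_lazy!(people) = foreach(step!, Iterators.filter(isalive, people))

people = [Person(rand(Bool), 0) for _ in 1:100_000]

@btime run_eager!($people)   # reports time and allocations
@btime run_lazy!($people)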