Open baggepinnen opened 4 years ago
Is it just the presence/absence of `@simd` that explains this?
In the first case it very well can be. In the second case, I have a theory that the difference comes from `findall(f, ...)` having to reallocate a larger array every time the allocated space fills up, whereas `findall(x .!= ...)` always allocates a BitArray, but once this BitArray is known, the size of the final, larger output array can be calculated and allocated once. Interestingly, in the extreme where the `findall(f, ...)` approach comes out ahead on allocations, it saves very few of them, whereas in the other extreme it allocates quite a bit more. The BitArray approach is thus faster in both extreme cases and allocates less memory overall.
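The two allocation strategies described in this theory can be sketched by hand. This is only an illustration of the idea, not Base's actual implementation; `findall_grow` and `findall_mask` are hypothetical names, and the input is assumed to be a plain 1-based `Vector`.

```julia
# Strategy 1: grow the output as matches are found.
# push! may have to reallocate whenever the reserved space fills up.
function findall_grow(f, x::Vector)
    out = Int[]
    for i in eachindex(x)
        f(x[i]) && push!(out, i)
    end
    return out
end

# Strategy 2: build a Bool mask first, count the matches,
# then allocate the output exactly once at its final size.
function findall_mask(f, x::Vector)
    mask = f.(x)                    # a BitVector when f returns Bool
    out = Vector{Int}(undef, count(mask))
    k = 0
    for i in eachindex(mask)
        if mask[i]
            k += 1
            out[k] = i
        end
    end
    return out
end

x = [0, 3, 0, 5, 7, 0]
@assert findall_grow(!iszero, x) == findall_mask(!iszero, x) == [2, 4, 5]
```

The second strategy always pays for the mask (one bit per element), but it never has to guess the output size.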
I could not find an issue tracking this. I often find that higher-order functions fail to vectorize where the equivalent broadcast statement vectorizes well and is much faster. An example
another
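A minimal sketch of the two kinds of comparison discussed here, with hypothetical choices of function, element type and array size (these are not the benchmarks from the original post). Each pair computes the same result in a higher-order form and in a broadcast form.

```julia
# Pair 1: map versus broadcast for an elementwise function.
x = rand(Float32, 10_000)
y_map   = map(abs2, x)     # higher-order form
y_bcast = abs2.(x)         # broadcast form
@assert y_map == y_bcast

# Pair 2: findall with a predicate versus findall on a broadcast mask.
v = rand(0:1, 10_000)
idx_pred = findall(!iszero, v)   # findall(f, ...) form
idx_mask = findall(v .!= 0)      # builds a BitVector, then collects indices
@assert idx_pred == idx_mask
```

To compare timings and allocations, each call can be wrapped in a benchmarking macro, for example `@btime` from the BenchmarkTools package (assuming it is installed). Results will depend on the Julia version and the hardware.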
The `map` example above could potentially be mitigated by implementing `map` in terms of broadcast for some set of mapped functions known to vectorize. The second example would be harder to specialize, and would require `findall` to be able to specialize on its own. (All timings above are on julia `v"1.5.0-DEV.130"`.)
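A minimal sketch of the `map`-in-terms-of-broadcast idea, assuming a hand-picked set of functions. `mymap` and `VECTORIZABLE` are hypothetical names and the whitelist is arbitrary; this is not how Base defines `map`. Because `map` and `broadcast` treat arguments of mismatched shape differently, the sketch only takes the broadcast path when all arguments have identical axes.

```julia
# Hypothetical whitelist of functions assumed to vectorize well under broadcast.
const VECTORIZABLE = Union{typeof(abs2), typeof(sqrt), typeof(+), typeof(*)}

# Fallback: any other function goes through the ordinary map.
mymap(f, x::AbstractArray, ys::AbstractArray...) = map(f, x, ys...)

# Specialization: whitelisted functions are routed through broadcast,
# but only when every argument has the same axes as the first one.
function mymap(f::VECTORIZABLE, x::AbstractArray, ys::AbstractArray...)
    all(y -> axes(y) == axes(x), ys) || return map(f, x, ys...)
    return broadcast(f, x, ys...)
end

x = [1.0, 2.0, 3.0]
@assert mymap(abs2, x) == map(abs2, x)        # takes the broadcast path
@assert mymap(v -> v + 1, x) == x .+ 1        # takes the fallback path
```

Dispatching on the function's type keeps the decision at compile time, so the fallback costs nothing for functions outside the whitelist.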