Open ChrisRackauckas opened 4 years ago
This is largely solved by using the `basesize` kwarg.
https://github.com/invenia/Parallelism.jl/blob/159a138d562a1583e44129340a0dfa784a34c523/src/tmap.jl#L36
It won't be quite as fast as using `@threads`, but it should get relatively close.
In general, the current implementation of `tmap` assumes that each piece of work takes tens of seconds, or even minutes. So even when every piece of work theoretically takes the same time, various external factors are likely to put the tasks out of sync anyway, and so one wants to use a `basesize` that doesn't fully partition the work.
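To make the `basesize` trade-off concrete, here is a minimal sketch of a chunked, dynamically scheduled map. It is not Parallelism.jl's implementation: `chunked_tmap` is a hypothetical name, and the only assumption carried over from the discussion is that `basesize` is the number of elements handled per spawned task.

```julia
# Hypothetical sketch, NOT Parallelism.jl's actual `tmap`.
# `basesize` is assumed to mean "elements processed per spawned task".
function chunked_tmap(f, xs::AbstractVector; basesize::Integer=1)
    basesize >= 1 || throw(ArgumentError("basesize must be at least 1"))
    # Nothing to spawn for empty input; fall back to a plain map.
    isempty(xs) && return map(f, xs)
    # Spawn one task per chunk of up to `basesize` consecutive elements.
    tasks = map(Iterators.partition(xs, basesize)) do chunk
        Threads.@spawn map(f, chunk)
    end
    # Fetch in spawn order so the output order matches the input order.
    return reduce(vcat, map(fetch, tasks))
end
```

With `basesize=1` every element gets its own task, which balances load well when each call is slow but costs one spawn per element. A larger `basesize` amortizes that spawn cost over more elements, which matters when each call is fast.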
https://github.com/invenia/Parallelism.jl/blob/master/src/tmap.jl#L10
That would have quite a bit of overhead. It would be nice to have a static-schedule (`@threads`) version for fast-running programs, which would be especially useful once there's an adjoint definition for the map.
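A static-schedule version along these lines might look like the sketch below. `tmap_static` is a hypothetical name and not part of Parallelism.jl. The sketch assumes `f` is type-stable, so the output element type can be taken from the first result, and that it is called from the main task, since `:static` scheduling does not compose with nested threaded loops.

```julia
# Hypothetical sketch of a static-schedule threaded map; not an existing API.
function tmap_static(f, xs::Vector)
    isempty(xs) && return map(f, xs)
    # Evaluate the first element serially to learn the output type.
    # Assumption: `f` returns the same type for every element.
    y1 = f(xs[1])
    out = Vector{typeof(y1)}(undef, length(xs))
    out[1] = y1
    # `:static` splits the iteration range evenly across threads up front,
    # so there is no per-element task spawning.
    Threads.@threads :static for i in 2:length(xs)
        out[i] = f(xs[i])
    end
    return out
end
```

The even up-front split keeps scheduling overhead low for cheap, uniform work. The cost is that a thread that finishes early cannot pick up work from a slower one.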