python / cpython

The Python programming language
https://www.python.org

concurrent_futures Executor.map semantics better specified in docs #70562

Closed 10fc5b7d-b64e-4e0e-a113-28487d06da3c closed 5 years ago

10fc5b7d-b64e-4e0e-a113-28487d06da3c commented 8 years ago
BPO 26374
Nosy @brianquinlan, @mdickinson, @tirkarthi

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:
```python
assignee = None
closed_at =
created_at =
labels = ['type-feature', 'docs']
title = 'concurrent_futures Executor.map semantics better specified in docs'
updated_at =
user = 'https://bugs.python.org/FDSacerdoti'
```

bugs.python.org fields:
```python
activity =
actor = 'bquinlan'
assignee = 'docs@python'
closed = True
closed_date =
closer = 'bquinlan'
components = ['Documentation']
creation =
creator = 'F.D. Sacerdoti'
dependencies = []
files = []
hgrepos = []
issue_num = 26374
keywords = []
message_count = 7.0
messages = ['260396', '260477', '260478', '260508', '326161', '341619', '341623']
nosy_count = 5.0
nosy_names = ['bquinlan', 'mark.dickinson', 'docs@python', 'F.D. Sacerdoti', 'xtreak']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue26374'
versions = ['Python 3.6']
```

10fc5b7d-b64e-4e0e-a113-28487d06da3c commented 8 years ago

Hello,

My colleague and I have both written parallel executors for the concurrent_futures module, and are having an argument, as described in the dialog below. To resolve, I would like to add "order of results is undefined" to disambiguate the docs for "map(func, *iterables, timeout=None)".

DISCUSSION

Q: Correct Semantics to return results out of order?
JH: No, breaks API as stated
Rebut: order is undefined, concurrent_futures specifies map() returns an iterator, where builtin map returns a list.

Q: Does it break the spirit of the module?
A: No, I believe one of the best things about doing things async is the dataflow model: do the next thing as soon as its inputs are ready.

Q: Should we hold up the caller in all cases when there are stragglers, i.e. elements that compute slower?
A: No, the interface should allow both modes.
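
Editorial sketch, not part of the original report: the "both modes" mentioned above can already be expressed with the standard library, assuming a `ThreadPoolExecutor` and a hypothetical `slow_square` task invented here for illustration. `Executor.map` yields results in input order, while `submit()` combined with `as_completed()` yields each result as soon as it is ready.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_square(x, delay):
    # Hypothetical task: sleep for `delay` seconds, then return x squared.
    time.sleep(delay)
    return x * x


inputs = [1, 2, 3]
delays = [0.3, 0.2, 0.1]  # the first input is the straggler

with ThreadPoolExecutor(max_workers=3) as exe:
    # Mode 1: Executor.map yields results in input order, so the caller
    # waits for the straggler before seeing anything else.
    in_order = list(exe.map(slow_square, inputs, delays))

    # Mode 2: submit() plus as_completed() yields futures as they finish,
    # so the order depends on timing rather than on input position.
    futures = [exe.submit(slow_square, x, d) for x, d in zip(inputs, delays)]
    by_completion = [f.result() for f in as_completed(futures)]

print(in_order)
print(by_completion)
```

Only the contents of `by_completion` are guaranteed here, not their order, which is why the second mode is usually paired with a mapping from each future back to its input.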

def james_map(exe, fn, *args):
  return iter( sorted( list( exe.map( fn, *args ) ) ) )

mdickinson commented 8 years ago

The documentation says: "Equivalent to map(func, *iterables)". I believe that that equivalency implies that the ordering *is* defined, so it would be incorrect to add "order of results is undefined" to the documentation.

mdickinson commented 8 years ago

Note also this code snippet from PEP-3148:

for number, prime in zip(PRIMES, executor.map(is_prime,
                                                  PRIMES)):

The use of zip here suggests strongly that the intention is that the order of the map result is well-defined.

It's possible that the docs should be updated to make the ordering requirement clearer.
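
Editorial sketch, not part of the original comment: the ordering behaviour described above can be checked with a small experiment. The task `delayed_double` and its delays are hypothetical, chosen so that later inputs tend to finish first. If `Executor.map` returned results in completion order, the `zip` pairing would come out scrambled.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def delayed_double(x):
    # Hypothetical task: larger inputs sleep less, so completion order
    # is roughly the reverse of submission order.
    time.sleep(0.05 * (4 - x))
    return 2 * x


values = [1, 2, 3, 4]

with ThreadPoolExecutor(max_workers=4) as executor:
    # Pair each input with its result, as in the PEP 3148 snippet.
    pairs = list(zip(values, executor.map(delayed_double, values)))

# The same pairing using the built-in map, for comparison.
builtin_pairs = list(zip(values, map(delayed_double, values)))

print(pairs)
print(pairs == builtin_pairs)
```

Each input stays paired with its own result even though the tasks finish in a different order, which matches the "Equivalent to map(func, *iterables)" wording quoted earlier in the thread.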

mdickinson commented 8 years ago

I just noticed this point, which may be confusing things:

> Rebut: order is undefined, concurrent_futures specifies map() returns an iterator, where builtin map returns a list.

In Python 3, the built-in map function returns an iterator, not a list.
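
Editorial sketch, not part of the original comment: a quick way to confirm this on a Python 3 interpreter. The variable names are illustrative only.

```python
from collections.abc import Iterator

result = map(str.upper, ["a", "b"])

is_iterator = isinstance(result, Iterator)  # map objects implement the iterator protocol
is_list = isinstance(result, list)          # a map object is not a list in Python 3
first_pass = list(result)                   # consuming the iterator yields the results
second_pass = list(result)                  # a second pass finds it exhausted

print(is_iterator, is_list, first_pass, second_pass)
```

So returning an iterator does not, by itself, say anything about whether the order of results is defined.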

tirkarthi commented 6 years ago

Some improvements were made that clarify how Executor.map differs from the builtin map; see https://bugs.python.org/issue32306 and https://github.com/python/cpython/commit/a7a751dd7b08a5bb6cb399c1b2a6ca7b24aba51d

Thanks

brianquinlan commented 5 years ago

Can we close this bug then?

tirkarthi commented 5 years ago

I would propose closing, since the original doc issue regarding ordering and map in Python 3 is resolved. Just to add, there is a PR to make map less eager: https://github.com/python/cpython/pull/707/
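
Editorial sketch, not part of the original comment: the "eager" behaviour referred to here can be observed with a generator that records what has been pulled from it. This assumes `Executor.map` is called with its default arguments; the generator `tracked_inputs` is hypothetical and exists only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor

consumed = []


def tracked_inputs():
    # Hypothetical generator that records each item as it is pulled.
    for i in range(5):
        consumed.append(i)
        yield i


with ThreadPoolExecutor(max_workers=2) as executor:
    results = executor.map(abs, tracked_inputs())
    # No result has been requested yet, but the whole input has already
    # been drained and submitted to the pool.
    consumed_before_iteration = list(consumed)
    collected = list(results)

print(consumed_before_iteration)
print(collected)
```

The built-in `map`, by contrast, pulls one item from its input each time a result is requested, which is the main practical difference to keep in mind with large or unbounded inputs.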