Skyscanner / aiotask-context

Contextual information for asyncio tasks
MIT License
161 stars 25 forks source link

Added usage of ChainMap as the context object #15

Closed altvod closed 7 years ago

altvod commented 7 years ago

Added support for ChainMap as the context object.

This way there's no need to copy any data, which can get painful in at least two cases:

  1. if you have some complex objects in the base context - deepcopy becomes not a very good idea
  2. you have a LOT of tasks inheriting from the main one (or even going deeper) - still results in a lot of extra data in memory

all this while preserving correct context inheritance behavior

argaen commented 7 years ago

Hi @altvod, never used ChainMap before so I'll take my time to review this MR.

Regarding memory usage, I have my doubts it will improve. In latest versions of Python dict is very optimized and it saves lots of space when objects are repeated. Regarding deepcopy could be a valid point although I've never benchmarked it

Thanks for the MR!

argaen commented 7 years ago

ChainMap looks interesting and very ideal for the usage where you don't want a shared context between tasks.

Unless I'm missing something (haven't finished the review). For a user, it will be exactly the same to use the dict_factory or the chainmap_factory. If that's the case, I say we could remove the dict_factory.

pfreixes commented 7 years ago

LGTM as well, as @argaen I will deprecate the deepcoopy in favor of the chainMap. deepcopy can be a bootleneck.

If there are concerns about the backward compatibility we can release a news major release.

BTW nice MR!

argaen commented 7 years ago

@pfreixes @altvod I've been thinking about this and actually its not the same behavior as the dict one.

Dicts:

>>> from copy import deepcopy
>>> context = {'object': ['1', '2']} 
>>> copy_context = deepcopy(context)
>>> copy_context['object'].append('3')
>>> copy_context
{'object': ['1', '2', '3']}
>>> context
{'object': ['1', '2']}

ChainMap

>>> from collections import ChainMap
>>> context = ChainMap({'object': ['1', '2']})
>>> child_context = context.new_child()
>>> child_context['object'].append('3')
>>> child_context
ChainMap({}, {'object': ['1', '2', '3']})     # parent modified!
>>> context
ChainMap({'object': ['1', '2', '3']})

so unless there is an option to modify this kind of behaviors, we can't deprecate one in favor of the other

pfreixes commented 7 years ago

Yups, no deprecation then

altvod commented 7 years ago

@argaen , you're exactly right - the behavior is different. It would've been more similar if shallow copy had been used instead of deep. You'd also see some differences if you ever decide to add a context.remove (although, whether that would be wise is a whole other topic)

I guess, the doc needs a couple examples of different context types "side-by-side". I can update the doc if you want, but I'd rather do it in a separate MR (or I can just let you guys do it).

BTW, did you consider generating docs for readthedocs.com from docstrings via sphinx? It works great for neat and compact libraries such as this one. Would be practically a self-updating documentation. And, besides, you're already using sphinx markup.

argaen commented 7 years ago

@altvod yup we should update the docs. Can you please update them in the same MR? I like to have code, tests and docs in the same MRs :). I'm OK with having a very detailed docstring in the factory of the use case and then having a simple mention/example in the README for now

Regarding readthedocs nope, In the beginning it was simple enough I didn't think about adding it. Now maybe it is worth it. I've created #16 for that, feel free to provide MR :).

altvod commented 7 years ago

Updated the README. I hope you don't mind that I modified an existing example a bit

altvod commented 7 years ago

As for RTD, maybe I will :) But at a later time.

argaen commented 7 years ago

Merged. I'll do a new release today or this weekend.

Thanks for your contribution! :)