Knio / dominate

Dominate is a Python library for creating and manipulating HTML documents using an elegant DOM API. It allows you to write HTML pages in pure Python very concisely, which eliminate the need to learn another template language, and to take advantage of the more powerful features of Python.
GNU Lesser General Public License v3.0
1.72k stars 108 forks source link

Dominate doesn't work with async code (+ potential fix) #185

Closed tlonny closed 11 months ago

tlonny commented 11 months ago

Hi guys, huge dominate fan - thanks for all the work you guys have put in developing it!

I had assumed that dominate wouldn't work in an async context due to how tags are globally bound in the dom_tag._with_contexts dictionary. I confirmed my hypothesis with this pathological piece of code:

from dominate import tags
from asyncio import sleep, gather, run

async def tag_routine():
    root = tags.div(id = 1)
    with root:
        await sleep(5)
        tags.div(id = 2)
    return str(root)

async def tag_routine2():
    root = tags.div(id = 3)
    await sleep(2)
    with root:
        tags.div(id = 4)
    return str(root)

async def both():
    tag_1, tag_2 = await gather(tag_routine(), tag_routine2())
    print("Tag #1")
    print(tag_1)

    print("Tag #2")
    print(tag_2)

run(both())

You would hope to see:

Tag #1
<div id="1">
  <div id="2"></div>
</div>
Tag #2
<div id="3">
  <div id="4"></div>
</div>

But instead we see:

Tag #1
<div id="1">
  <div id="3">
    <div id="4"></div>
  </div>
  <div id="2"></div>
</div>
Tag #2
<div id="3">
  <div id="4"></div>
</div>

I believe the fix for such an issue is to tweak the _get_thread_context() function. What I propose is something that looks like this:

context_var = contextvars.ContextVar("var", default = None)
context_id = 0

def _get_context_id():
  global context_id
  if context_var.get() is None:
    context_var.set(context_id)
    context_id += 1
  return context_var.get()

def _get_thread_context():
  context = [threading.current_thread()]
  if greenlet:
    context.append("greenlet")
    context.append(greenlet.getcurrent())

  if asyncio.get_running_loop() is not None:
    context.append("async")
    context.append(_get_context_id())

  return hash(tuple(context))

My idea is that _get_context_id will return a new ID for each "execution chain" (idk the correct terminology), which will distinguish between two async functions run concurrently. We add this to the context only if we're currently running in an event loop. I monkeypatched your code with the above and the pathological example above worked as expected.

I've never contributed to OSS so LMK if I should turn this into a PR etc.

Thanks!

Tim

P.S. I attach a monkey patch solution for anyone else running into this issue: https://gist.github.com/tlonny/de54ed6ba2481df5f841513d9a9c7ea3

Knio commented 11 months ago

Your approach looks good to me - please turn this into a PR :) (with your example as a unit test too!)

tlonny commented 11 months ago

@Knio - Thanks for your message. I have a branch with commented code + unit tests but I'm unable to push it:

ERROR: Permission to Knio/dominate.git denied to tlonny.

What do I need to do?

tlonny commented 11 months ago

Okay, never mind - I rtfm'd and I've created the PR. Please let me know your thoughts and hope you like it! =D