HyperionGray / trio-chrome-devtools-protocol

Trio driver for Chrome DevTools Protocol (CDP)
MIT License
60 stars 17 forks source link

Redesign API? #1

Closed mehaase closed 4 years ago

mehaase commented 4 years ago

The API feels cumbersome. Each call to the server has to be wrapped in await conn.execute(...) or await session.execute(...) which is quite a bit of boilerplate to type, it makes lines of code longer (and more likely to need wrapping), and it seems to prevent type inference of the result[1].

It might be better to adopt a code generation approach as in PyCDP. But PyCDP generates Python code, we don't need to parse the CDP spec in this project. Instead, we can use introspection on PyCDP to search for command methods and generate new wrappers for them.

One possible design is to add a connection/session argument to each command, i.e. page.capture_screenshot(format='png') would become await page.capture_screenshot(session, format='png'). The body of the generated function would be really simple:

async def capture_screenshot(session: CdpBase, url: str) -> str:
    ret = await session.execute(cdp.page.navigate(url))
    return typing.cast(str, ret)

(The cast is there to help type inference, but a good type checker might not need it?)

This design would solve all 3 problems described in this issue, but it complicates the documentation a bit. Right now it's easy to explain: you can execute() any command in PyCDP. The new design either requires a separate sets of docs for Trio CDP, or else explain that all PyCDP commands are valid in Trio CDP, but they are now async and have an additional argument. I don't know which one is better...

[1] I have only tested with JEDI, which doesn't seem to understand that execute() always returns the same type as its argument. Kite may have better results. I'll test with it later.

auphofBSF commented 4 years ago

Please see PR#6 as a possible candidate for exploring implementation Thanks for the great effort done across many modules

mehaase commented 4 years ago

PR #6 is a really nice jumping off point. It is different from what I had originally been leaning towards, but after trying it out in a Jupyter notebook I like the direction that it's going.

The biggest drawback to that PR is that it uses a singleton session manager and would prevent usage of the simplified API in any code that needs to support multiple, concurrent sessions. For the sake of argument, let's say we modified the examples/get_title.py to use multiple tabs in order to grab the titles of multiple websites concurrently. Here's an example using the PR6 style:

# Start concurrent tasks to get titles
nursery.start_soon(title_task, conn, target1)
nursery.start_soon(title_task, conn, target2)
nursery.start_soon(title_task, conn, target3)
...

async def title_task(conn, target):
    # open session to target and get its title
    session = await conn.open_session(target.target_id)
    CDP_Active_Session_Manager.session = session
    simplify_cdp_domain_calls(dom)
    print (await get_title())

async def get_title():
    doc = dom.get_document()
    node = await dom.query_selector(doc, 'title')
    return await dom.get_outer_html(node)

The use of the simplified API in get_title() means that each session will race to set itself as the singleton session used by the API, and concurrent calls to title_task() could return the title for the wrong page.

But I think there's a small twist on this that would work nicely:

# Start concurrent tasks to get titles
nursery.start_soon(title_task, conn, target1)
nursery.start_soon(title_task, conn, target2)
nursery.start_soon(title_task, conn, target3)
...

async def title_task(conn, target):
    # open session to target and get its title
    async with conn.open_session(target.target_id):
        print (await get_title())

async def get_title():
    doc = dom.get_document()
    node = await dom.query_selector(doc, 'title')
    return await dom.get_outer_html(node)

In this scenario, we replace the singleton session with a context manager. All of the code called from that context manager has an implicit session variable that is used to execute commands. How does this work under the hood? The context manager uses a context variable to hold the current session, and the decorator uses the context var to find the session. This approach keeps the clean calling syntax of PR6 plus adds the ability to support concurrent sessions.

auphofBSF commented 4 years ago

@mehaase , Brilliant, I think I agree with your approach, As I have been reading a bit more I like wise have come to the conclusion that contexts would be the way to go, looks elegant.