Pytest hot reloader issues

JamesHutchison commented 11 months ago

First off, thanks for this great library. I've used it to implement a pytest hot reloading daemon. I had some issues that I worked around using monkey patching, and I think it would make sense to update the library rather than use monkey patches.

Here's the list of issues I needed to work around:

glob - needed more control over directories. You may have so many files, attempting to watch them all would exhaust the open file handle limit, so the ability to specify multiple root directories was needed.
assertion rewrites - pytest does assertion rewrites. When jurigged would hot reload a test function, the test function would no longer have the assertions rewritten. I think the best solution to this would be to add hook capability on reevaluate
line numbers off by one - I encountered this in two places. The first was rewritten functions having their line numbers shifted down by one line. The second issue is where there is an ensure_separation parameter in GroupDefinition.append that inserts a blank line. This creates line breaks where they don't exist, resulting in line numbers diverging after certain things in the file. It's not clear to me the purpose of this. When line numbers are off, the debugger and introspection libraries like varname quit working as expected, on top of errors giving the wrong line number.
jurigged logger - the default behavior was resulting in duplicated output
watchman - this isn't really on jurigged specifically but watchman seems to have compatibility problems so I use polling. I've been wanting to create a library that does file watching better by doing a hybrid of signals and polling but haven't gotten around to it.

Here's the full code of the monkey patches:

def monkey_patch_jurigged_function_definition():
    import jurigged.codetools as jurigged_codetools  # type: ignore
    import jurigged.utils as jurigged_utils  # type: ignore

    OrigFunctionDefinition = jurigged_codetools.FunctionDefinition

    import ast

    class NewFunctionDefinition(OrigFunctionDefinition):
        def reevaluate(self, new_node, glb):
            # monkeypatch: The assertion rewrite is from pytest. Jurigged doesn't
            #              seem to have a way to add rewrite hooks
            new_node = self.apply_assertion_rewrite(new_node, glb)
            obj = super().reevaluate(new_node, glb)
            return obj

        def apply_assertion_rewrite(self, ast_func, glb):
            from _pytest.assertion.rewrite import AssertionRewriter

            nodes: list[ast.AST] = [ast_func]  # type: ignore
            while nodes:
                node = nodes.pop()
                for name, field in ast.iter_fields(node):
                    if isinstance(field, list):
                        new: list[ast.AST] = []  # type: ignore
                        for i, child in enumerate(field):
                            if isinstance(child, ast.Assert):
                                # Transform assert.
                                new.extend(
                                    AssertionRewriter(glb["__file__"], None, None).visit(child)
                                )
                            else:
                                new.append(child)
                                if isinstance(child, ast.AST):
                                    nodes.append(child)
                        setattr(node, name, new)
                    elif (
                        isinstance(field, ast.AST)
                        # Don't recurse into expressions as they can't contain
                        # asserts.
                        and not isinstance(field, ast.expr)
                    ):
                        nodes.append(field)
            return ast_func

        def stash(self, lineno=1, col_offset=0):
            # monkeypatch: There's an off-by-one bug coming from somewhere in jurigged.
            #              This affects replaced functions. When line numbers are wrong
            #              the debugger and inspection logic doesn't work as expected.
            if not isinstance(self.parent, OrigFunctionDefinition):
                co = self.get_object()
                if co and (delta := lineno - co.co_firstlineno):
                    delta -= 1  # fix off-by-one
                    if delta != 0:
                        self.recode(jurigged_utils.shift_lineno(co, delta), use_cache=False)

            return super(OrigFunctionDefinition, self).stash(lineno, col_offset)

    # monkey patch in new definition
    jurigged_codetools.FunctionDefinition = NewFunctionDefinition

def monkeypatch_group_definition():
    import jurigged.codetools as jurigged_codetools  # type: ignore

    def append(self, *children, ensure_separation=False):
        for child in children:
            # ensure_separation creates line number diff
            # an example where this was a problem:
            #
            # 15 class MyClass:
            # 77     do_something()  # type: ignore  <--- blank line inserted between do_something() and comment
            # 78
            # 79     def my_func(...)  <--- becomes line 80
            #
            # the monkey patch removes it
            #
            # removed code:
            # if (
            #     ensure_separation
            #     and self.children
            #     and not self.children[-1].well_separated(child)
            # ):
            #     ws = LineDefinition(
            #         node=None, text="\n", filename=self.filename
            #     )
            #     self.children.append(ws)
            #     ws.set_parent(self)
            self.children.append(child)
            child.set_parent(self)

    jurigged_codetools.GroupDefinition.append = append

def setup_jurigged(config: Config):
    def _jurigged_logger(x: str) -> None:
        """
        Jurigged behavior is to both print and log.

        By default this creates duplicated output.

        Pass in a no-op logger to prevent this.
        """

    import jurigged

    monkey_patch_jurigged_function_definition()
    monkeypatch_group_definition()

    pattern = _get_pattern_filters(config)
    # TODO: intelligently use poll versus watchman (https://github.com/JamesHutchison/pytest-hot-reloading/issues/16)
    jurigged.watch(pattern=pattern, logger=_jurigged_logger, poll=True)

JamesHutchison commented 11 months ago

I also have an issue about mitigating / improving the start-up time impact:

https://github.com/JamesHutchison/pytest-hot-reloading/issues/17

JamesHutchison commented 11 months ago

Again, thanks again for the library! At Carta I wrote a hot reloader as a hackathon project (not in plugin form, but something much cruder) and also showed people how to run the big monolith using jurigged and they loved it!

breuleux commented 7 months ago

Hi James!

First, I'm so sorry for the very late reply to this, I'm evidently not very good at responsive maintenance 😅 I'll try to be better.

Second, I can't reproduce the off-by-one problem with line numbers. When I test with a toy program (using varname), the line numbers are tracked perfectly... and when I apply your patch, then it starts messing up. There must be some specific edge case here that you're hitting. Could you provide a self-contained script and a patch to that script which, when applied, triggers the issue? Thank you!

JamesHutchison commented 7 months ago

I am able to reproduce the issue pretty easily using the Heavy Stack.

https://github.com/heavy-resume/heavy-stack/

If you have github codespaces, simply create a new codespace, run Hot Reload Heavy Stack. Note that you may need to wait a 5 - 10 minutes for the codespace to finish getting set up. You can check the .dev_container_logs/postStartBackground.out to see if the postgres image finished building.

Set a break point on the first line of the LandingPage component at heavy_stack/frontend/landing.py:32. Open up port 8000 on the dialog that presents itself to view the server. This will trigger that component. Note that user is not one of the locals. Change something on the LandingPage component. It'll hot reload and rerender the component. The breakpoint should trigger again. Note that the user is now in the locals, indicating that it's actually setting the breakpoint one line later than it should be.

This screenshot illustrates the issue.

jurigged-issue

It's worth calling out that I didn't understand this issue well and didn't feel like the "off-by-one" fix was actually a good fix. It probably fixed something and broke something. I think I still encountered line number issues. Looking at it right now, I wouldn't be surprised if the issue here is that the line number for the function starts at the decorator rather than the def.

JamesHutchison commented 7 months ago

Actually... yeah thats probably whats going on. If I do this nutso thing the offset gets bigger.

@(
    component
)
def LandingPage() -> Component:
    user, set_user = use_state(cast(User | None, None))
    right_now, set_right_now = use_state(utc_now())

JamesHutchison commented 7 months ago

It's worth calling out the functions where I encountered this, pytest functions, there wasn't a decorator, so I'm wondering if there's a bit more to this. Perhaps I had something like this:

class MyTest
  @pytest.fixture(autouse=True)  # maybe this was breaking everything later in the file
  def setup(self):
    ...

  def test_where_doing_hot_reload_was_off_by_one(...)
     ...

breuleux commented 7 months ago

There is indeed a problem when there are decorators. More precisely it's this code which explicitly adds them to the node's extent: https://github.com/breuleux/jurigged/blob/master/jurigged/codetools.py#L863

The thing is, I'm pretty sure I did this for a reason, but if I remove it, all tests still pass. It was possibly related to jurigged's capability to apply patches that don't come from file modifications and write back the patches to the files (I had an editor working a few years back where you could edit function definitions piecewise in an interactive session), but until I figure out why this exists, I'll just remove it and hopefully that'll resolve the problem.

breuleux commented 7 months ago

Upon investigation, I think it's a lot more complicated, and the above fix appears to work only sometimes, and accidentally. In other cases it messes up the line updates for the other functions. The first line of a function definition does normally include the decorator, except, for some maddening reason, the initial @, so there was some code to force its inclusion.

My current hypothesis is that the issue happens because when a function is reevaluated, we purposefully do not want to reevaluate the decorators, and to ensure this, they are stripped from the AST node that we then recompile. This might introduce some kind of inconsistency that messes up the line numbers down the line. I'll have to take a deeper look another time.

JamesHutchison commented 7 months ago

This seems promising, although I vaguely feel like months ago I had tried this exact code and somehow it wasn't working somewhere

@dataclass
class FunctionDefinition(GroupDefinition):
    _codeobj: object = None

    ##############
    # Management #
    ##############

    def stash(self, lineno=1, col_offset=0):
        if not isinstance(self.parent, FunctionDefinition):
            co = self.get_object()
            if co and (delta := lineno - self.node.extent.lineno):
                self.recode(shift_lineno(co, delta), use_cache=False)

        return super().stash(lineno, col_offset)

I'll update the pytest hot reloader's monkeypatch and also monkeypatch the heavy stack and see if anything breaks

JamesHutchison commented 7 months ago

Here's the results of testing with that change:

✔️ Changing no decorator
✔️ Changing with decorator
✔️ Tests
✔️ Multiline decorator, changing function itself
❌ Multiline decorator, change length of function then put breakpoint on function later in the file

For that last one, making a change to the function later in the file seems to fix the issue.

JamesHutchison commented 7 months ago

Seeing this error which I think is unrelated, but posting it just in case:

Traceback (most recent call last):
  File "/workspaces/heavy-resume/.venv/lib/python3.11/site-packages/jurigged/codetools.py", line 480, in _process_child_correspondence
    self.evaluate_child(new)
  File "/workspaces/heavy-resume/.venv/lib/python3.11/site-packages/jurigged/codetools.py", line 646, in evaluate_child
    return child.evaluate(self.get_globals(), attrproxy(obj))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/heavy-resume/.venv/lib/python3.11/site-packages/jurigged/codetools.py", line 590, in evaluate
    obj.__qualname__ = ".".join(self.dotpath().split(".")[1:])
    ^^^^^^^^^^^^^^^^
AttributeError: 'method' object has no attribute '__qualname__'

Bizarre

    def evaluate(self, glb, lcl):
        super().evaluate(glb, lcl)
        obj = (lcl or glb).get(self.name, None)
        if hasattr(obj, "__qualname__"):
            obj.__qualname__ = ".".join(self.dotpath().split(".")[1:])

JamesHutchison commented 6 months ago

Since this is an improvement over the existing off-by-one fix in the hot reloader I'm going to go ahead and update that library's monkeypatch. IMO, it's pretty good

breuleux commented 6 months ago

@JamesHutchison I just tried your change. It seems to work fine for the first source code modification, but things seem to get mixed up on subsequent modifications (the tests in test_generated perform a bunch of random code edits in sequence and most end up crashing as they try to set the code to a negative line number -- these tests are brutal, but I'm glad I made them, they've uncovered so many bugs). Not sure why. I'm going to have to add a battery of tests for line number tracking that cover all the cases I'm currently checking manually, there's no way we're ever going to fix this otherwise.

JamesHutchison commented 6 months ago

That's awesome to hear! This library is so helpful, and fixing the line number issue and catching stability issues is a big win!

breuleux / jurigged

Pytest hot reloader issues #29