beartype / beartype

Unbearably fast near-real-time hybrid runtime-static type-checking in pure Python.
https://beartype.readthedocs.io
MIT License

[Feature Request] Preserve `inspect.isgeneratorfunction()`-ness (e.g., for Gradio integration) #423

Open pablovela5620 opened 3 weeks ago

pablovela5620 commented 3 weeks ago

I'm trying to use beartype with Gradio. It works great as long as the function is NOT a generator; as soon as beartype is combined with yield, things break down. I wrote a minimal repo that reproduces the issue: https://github.com/pablovela5620/beartype-gradio

Here is the failing Python file:

import gradio as gr
import time
from typing import Generator

def yield_message_succeeds(message):
    final_message = ""
    for char in message:
        time.sleep(0.2)
        final_message += char
        yield final_message

def yield_message_fails(message: str) -> Generator[str, None, None]:
    final_mesage = ""
    for char in message:
        time.sleep(1)
        final_mesage += char
        yield final_mesage

with gr.Blocks() as demo:
    message = gr.Textbox(label="Message", value="This is a yield test!")
    output = gr.Textbox(label="Output")
    output_failed = gr.Textbox(label="Output Failed")
    with gr.Row():
        btn_succeeds = gr.Button(value="Succeeds")
        btn_fails = gr.Button(value="Fails")
    btn_succeeds.click(fn=yield_message_succeeds, inputs=[message], outputs=[output])
    btn_fails.click(fn=yield_message_fails, inputs=[message], outputs=[output_failed])

So long as I don't annotate the function's inputs/outputs, everything works fine (I can add type hints inside the function itself and beartype/Gradio works as expected). But when I add type hints to the inputs/outputs, things break.

I narrowed it down to https://github.com/gradio-app/gradio/blob/4a8555921980a30d2621c5eb7d6700704f2561ab/gradio/utils.py#L800

where elif inspect.isgeneratorfunction(f): evaluates to False when package-wide type-checking with beartype (shown below) is enabled, but evaluates to True (as it should) when beartype type-checking is not used:

# At the very top of your "{your_package}.__init__" submodule:
from beartype.claw import beartype_this_package  # <-- boilerplate for victory
beartype_this_package()                          # <-- yay! your team just won

I'm not really sure why this is happening, or whether this is a beartype issue or a Gradio issue.
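The symptom can be reproduced without either library. The following is a minimal sketch using only the standard library; checked() is a hypothetical stand-in for any decorator that wraps a function and returns its result, not beartype's actual implementation. It shows that functools.wraps() copies metadata such as the name, but does not make the wrapper look like a generator function to inspect.

```python
import functools
import inspect


def count_up(n):
    """A genuine generator function: its body contains a yield."""
    for i in range(n):
        yield i


def checked(func):
    """Hypothetical decorator that wraps func with a return-based wrapper."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # The wrapper *returns* the generator object rather than yielding,
        # so the wrapper itself is an ordinary function.
        return func(*args, **kwargs)

    return wrapper


wrapped = checked(count_up)

is_gen_before = inspect.isgeneratorfunction(count_up)
is_gen_after = inspect.isgeneratorfunction(wrapped)

# Calling the wrapper still produces a working generator.
items = list(wrapped(3))
```

Here is_gen_before is True and is_gen_after is False, even though wrapped(3) still yields the same items.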

leycec commented 3 weeks ago

Ugh. Awesome MLops webdev framework hates @beartype, huh? Sadly, I can't actually reproduce this on my end. Your wonderful example runs fine for me:

import gradio as gr
import time
from beartype import beartype
from typing import Generator

@beartype
def yield_message_succeeds(message):
    final_message = ""
    for char in message:
        time.sleep(0.2)
        final_message += char
        yield final_message

@beartype
def yield_message_fails(message: str) -> Generator[str, None, None]:
    final_mesage = ""
    for char in message:
        time.sleep(1)
        final_mesage += char
        yield final_mesage

with gr.Blocks() as demo:
    message = gr.Textbox(label="Message", value="This is a yield test!")
    output = gr.Textbox(label="Output")
    output_failed = gr.Textbox(label="Output Failed")
    with gr.Row():
        btn_succeeds = gr.Button(value="Succeeds")
        btn_fails = gr.Button(value="Fails")
    btn_succeeds.click(fn=yield_message_succeeds, inputs=[message], outputs=[output])
    btn_fails.click(fn=yield_message_fails, inputs=[message], outputs=[output_failed])
    print('ok')

That prints "ok". Admittedly, that's all that prints. Gradio doesn't seem to be doing anything. Presumably, Gradio is doing something and I should just trust that Gradio behaves as expected.

Would you mind upgrading to the newest release candidate of @beartype on your end and trying again? I acknowledge that this is annoying and apologize for all the confusion here. In theory, this should do it:

pip install --upgrade --pre beartype

If that works for you, we rejoice. I'll be officially releasing @beartype 0.19.0 in a week or two, which should put these sorts of issues to rest... finally. :face_exhaling:

pablovela5620 commented 3 weeks ago

Thank you for the quick response! I apologize for the confusion; I should have been more descriptive about the issue. It's not that the example won't run, it's that the function passed to the Gradio interface behaves incorrectly when it is called. I can provide an example gif of what I mean.

Also, one needs to include demo.launch(). I didn't have it in my script, but it is a part of the full example I provided: https://github.com/pablovela5620/beartype-gradio

beartype-gradio

I have tried with both beartype==0.18.5 and beartype==0.19.0rc0 and get the same behavior. As you can see, when no type annotations are added to the function called via Gradio, things work just fine and the message is yielded with no problem. But when type annotations are added and the package's __init__.py includes

# At the very top of your "{your_package}.__init__" submodule:
from beartype.claw import beartype_this_package  # <-- boilerplate for victory

beartype_this_package()  # <-- yay! your team just won

then I get the generator object instead.

I believe it has to do with this https://github.com/gradio-app/gradio/blob/4a8555921980a30d2621c5eb7d6700704f2561ab/gradio/utils.py#L800

def function_wrapper(
    f: Callable,
    before_fn: Callable | None = None,
    before_args: Iterable | None = None,
    after_fn: Callable | None = None,
    after_args: Iterable | None = None,
):
    ...

    elif inspect.isgeneratorfunction(f):

        @functools.wraps(f)
        def gen_wrapper(*args, **kwargs):
            iterator = f(*args, **kwargs)
            while True:
                if before_fn:
                    before_fn(*before_args)
                try:
                    response = next(iterator)
                except StopIteration:
                    if after_fn:
                        after_fn(*after_args)
                    break
                if after_fn:
                    after_fn(*after_args)
                yield response

        return gen_wrapper

    else:

        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            if before_fn:
                before_fn(*before_args)
            response = f(*args, **kwargs)
            if after_fn:
                after_fn(*after_args)
            return response

        return wrapper

When calling with typehints and beartype, for some reason in the elif statement

inspect.isgeneratorfunction(f) == False

where it SHOULD be True, which is why I'm getting this incorrect behavior: I get the generator object back instead of the yielded values.

image

Debugging in vscode, the biggest difference I notice is that __beartype_wrapper==True when type annotations are included

image

while this is what I see when things function correctly, with no type annotations:

image

I don't fully understand what beartype_this_package() is doing to make

inspect.isgeneratorfunction(f)

incorrectly say the function is not a generator when it should be!
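To illustrate why a False result leads to the generator object showing up in the output, here is a much-simplified sketch of the dispatch pattern quoted above. run_handler() and hidden_spell() are hypothetical names invented for this example; this is not Gradio's actual code.

```python
import inspect
import types


def spell(message):
    """Generator function that yields the message one character longer each time."""
    text = ""
    for char in message:
        text += char
        yield text


def hidden_spell(message):
    # Ordinary function that returns a generator object, standing in for a
    # return-based wrapper around spell().
    return spell(message)


def run_handler(f, *args):
    """Hypothetical framework dispatch that branches on generator-ness."""
    if inspect.isgeneratorfunction(f):
        # Generator path: iterate and collect each yielded value.
        return [value for value in f(*args)]
    # Plain path: the return value is treated as the final result.
    return f(*args)


streamed = run_handler(spell, "abc")
opaque = run_handler(hidden_spell, "abc")
```

With spell() the handler streams the values; with hidden_spell() it takes the plain path and hands back the generator object itself, which matches the behavior described above.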

pablovela5620 commented 3 weeks ago

I also want to add that annotating variable assignments inside the function does work as expected with beartype and causes no issues when yielding. It sort of lets me hack around things, but it's not the best.

def yield_message_succeeds(message):
    message: bool = message
    final_message = ""
    for char in message:
        time.sleep(0.05)
        final_message += char
        yield final_message

correctly produces this

beartype.roar.BeartypeDoorHintViolation: Callable beartype_gradio.gradio_ui.yield_message_succeeds() local variable "message" value 'This is a yield test!' violates type hint <class 'bool'>, as str 'This is a yield test!' not instance of bool.
^CKeyboard interruption in main thread... closing server.

and this causes no errors and runs great!

def yield_message_succeeds(message):
    message: str = message
    final_message = ""
    for char in message:
        time.sleep(0.05)
        final_message += char
        yield final_message

leycec commented 3 weeks ago

Wowza! Thanks so much for both the detailed writeup and the beartype-gradio repository. Indeed, your dark suspicions are correct: @beartype does not currently preserve "generator function"-ness from the low-level perspective of inspect.isgeneratorfunction(f). @beartype does, of course, preserve "generator function"-ness from the high-level perspective of "This function still looks and feels like a generator."

That is to say, generator functions do work perfectly well with @beartype; they just don't preserve the "hacky special sauce" that inspect.isgeneratorfunction(f) and therefore Gradio are looking for. What is this "hacky special sauce" I speak of? Well, it turns out that there's a low-level inspect.CO_GENERATOR bit flag in the code objects associated with generator functions. The inspect.isgeneratorfunction(f) tester function (and therefore Gradio) wants this bit flag to be set. @beartype currently doesn't set this bit flag, even though generator functions type-checked by @beartype otherwise behave as expected. Then...

Why Doesn't @beartype Just Set This Bit Flag?

No deep reason.

You're simply the first @beartype user to hit this. Thankfully, there's probably an "easy" way for @beartype to do this, for certain definitions of "easy".
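The bit flag in question can be inspected directly. The sketch below assumes CPython, where a function's code object exposes co_flags and inspect.isgeneratorfunction() tests the inspect.CO_GENERATOR bit; has_generator_flag() is a hypothetical helper written for this example.

```python
import inspect


def gen():
    yield 1


def plain():
    # Returns a generator object, but contains no yield of its own.
    return gen()


def has_generator_flag(func):
    """Hypothetical helper: test the CO_GENERATOR bit on func's code object."""
    return bool(func.__code__.co_flags & inspect.CO_GENERATOR)


gen_flag = has_generator_flag(gen)
plain_flag = has_generator_flag(plain)
```

The compiler sets this bit when a function body contains yield, which is why a wrapper that only returns never carries it.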

To understand why this is happening in the first place, let's print the type-checking code that @beartype is dynamically generating for your yield_message_fails() function. We can trivially do this by passing the conf=BeartypeConf(is_debug=True) parameter to the @beartype decorator. So, let's do this!

# This...
from beartype import beartype, BeartypeConf
...

@beartype(conf=BeartypeConf(is_debug=True))
def yield_message_fails(message: str) -> Generator[str, None, None]:
    final_mesage = ""
    for char in message:
        time.sleep(1)
        final_mesage += char
        yield final_mesage

...prints out this:

(line 0001) def yield_message_fails(
(line 0002)     *args,
(line 0003)     __beartype_get_violation=__beartype_get_violation, # is <function get_func_pith_violation at 0x7f08a3cfea20>
(line 0004)     __beartype_conf=__beartype_conf, # is "BeartypeConf(is_debug=True)"
(line 0005)     __beartype_object_94081894789888=__beartype_object_94081894789888, # is <class 'collections.abc.Generator'>
(line 0006)     __beartype_check_meta=__beartype_check_meta, # is <beartype._check.metadata.metacheck.BeartypeCheckMeta object at 0x7f08a3eaebc0>
(line 0007)     __beartype_func=__beartype_func, # is <function yield_message_fails at 0x7f08a3c3fd80>
(line 0008)     **kwargs
(line 0009) ):
(line 0010)     # Localize the number of passed positional arguments for efficiency.
(line 0011)     __beartype_args_len = len(args)
(line 0012)     # Localize this positional or keyword parameter if passed *OR* to the
(line 0013)     # sentinel "__beartype_raise_exception" guaranteed to never be passed.
(line 0014)     __beartype_pith_0 = (
(line 0015)         args[0] if __beartype_args_len > 0 else
(line 0016)         kwargs.get('message', __beartype_get_violation)
(line 0017)     )
(line 0018) 
(line 0019)     # If this parameter was passed...
(line 0020)     if __beartype_pith_0 is not __beartype_get_violation:
(line 0021)         # Type-check this parameter or return against this type hint.
(line 0022)         if not isinstance(__beartype_pith_0, str):
(line 0023)             __beartype_violation = __beartype_get_violation(
(line 0024)                 check_meta=__beartype_check_meta,
(line 0025)                 pith_name='message',
(line 0026)                 pith_value=__beartype_pith_0,
(line 0027)             )
(line 0028) 
(line 0029)             raise __beartype_violation
(line 0030)     # Call this function with all passed parameters and localize the value
(line 0031)     # returned from this call.
(line 0032)     __beartype_pith_0 = __beartype_func(*args, **kwargs)
(line 0033) 
(line 0034)     # Noop required to artificially increase indentation level. Note that
(line 0035)     # CPython implicitly optimizes this conditional away. Isn't that nice?
(line 0036)     if True:
(line 0037)         # Type-check this parameter or return against this type hint.
(line 0038)         if not isinstance(__beartype_pith_0, __beartype_object_94081894789888):
(line 0039)             __beartype_violation = __beartype_get_violation(
(line 0040)                 check_meta=__beartype_check_meta,
(line 0041)                 pith_name='return',
(line 0042)                 pith_value=__beartype_pith_0,
(line 0043)             )
(line 0044) 
(line 0045)             raise __beartype_violation
(line 0046)     return __beartype_pith_0

The very last line is where things are breaking down for you and Gradio. @beartype currently prefers the standard return idiom of return __beartype_pith_0. For a function to be a generator function, however, that very last line instead needs to resemble:

(line 0046)     yield __beartype_pith_0

So, yield instead of return. And... we're all exhausted. :weary:
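As a rough illustration of the idea, here is a minimal standard-library sketch of a wrapper that stays a generator function. checked_generator() is a hypothetical decorator, not @beartype's implementation or its eventual fix. Note one assumption baked in: the sketch delegates with yield from rather than a bare yield, because a bare yield of the generator object would emit the generator itself as a single item instead of its values.

```python
import functools
import inspect
from collections.abc import Generator


def checked_generator(func):
    """Hypothetical decorator whose wrapper is itself a generator function."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        # Crude stand-in for a return type-check.
        if not isinstance(result, Generator):
            raise TypeError(f"{func.__name__}() did not return a generator")
        # Delegating with "yield from" puts a yield in this body, so the
        # compiler marks the wrapper as a generator function.
        return (yield from result)

    return wrapper


@checked_generator
def spell(message: str) -> Generator[str, None, None]:
    text = ""
    for char in message:
        text += char
        yield text


still_generator = inspect.isgeneratorfunction(spell)
values = list(spell("abc"))
```

One consequence of this design is that the wrapper body, including the check, only runs once the caller starts iterating, since calling a generator function merely creates the generator object.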

pablovela5620 commented 3 weeks ago

This was really helpful! Thank you for giving me a deeper understanding as to why things aren't working as expected. It sounds like for now I'll need to wait for beartype to use yield __beartype_pith_0 instead of return __beartype_pith_0 and continue just hacking the function input types if I want to use beartype with Gradio. Thank you again for the in-depth explanation, many wouldn't bother giving the full context

leycec commented 3 weeks ago

Thanks for being so understanding, @pablovela5620. I love chatting about the deep magical internals of typing, @beartype, and Python. I'd also like to add at this critical juncture that you have such a fascinating GitHub avatar. Computer vision looks so fun! :smile:

Oh – and I have yet another workaround for you. If you'd like, this is probably the least hacky approach. Lightly refactor your public yield_message_fails() generator function (which is not type-checked) to yield from a private _yield_message_fails() generator function (which is type-checked). I've verified on my end that this behaves as expected for you:

import gradio as gr
import time
from beartype import beartype
from typing import Generator

def yield_message_succeeds(message):
    final_message = ""
    for char in message:
        time.sleep(0.2)
        final_message += char
        yield final_message

def yield_message_fails(message):  # <-- intentionally unannotated, because gradio hates @beartype
    yield from _yield_message_fails(message)  # <-- magic happens

@beartype  # <-- this is the real deal. too bad gradio hates @beartype, huh? :o
def _yield_message_fails(message: str) -> Generator[str, None, None]:
    final_mesage = ""
    for char in message:
        time.sleep(1)
        final_mesage += char
        yield final_mesage

with gr.Blocks() as demo:
    message = gr.Textbox(label="Message", value="This is a yield test!")
    output = gr.Textbox(label="Output")
    output_failed = gr.Textbox(label="Output Failed")
    with gr.Row():
        btn_succeeds = gr.Button(value="Succeeds")
        btn_fails = gr.Button(value="Fails")
    btn_succeeds.click(fn=yield_message_succeeds, inputs=[message], outputs=[output])
    btn_fails.click(fn=yield_message_fails, inputs=[message], outputs=[output_failed])

demo.launch()

That... looks good to me! There's no hackiness there, really. Just a bit of boilerplate for which I apologize. I'm so sorry you had to endure this madness. I'm also committed to resolving this for you and all the other awesome Gradio users in early September.

@beartype personally thanks you for this fascinating issue. Go go, Computer Vision! :eyeglasses: :robot: :eyeglasses:

pablovela5620 commented 3 weeks ago

This worked perfectly! Much better than the hack I was using. A little bit of extra boilerplate never hurt anyone, thank you for the help. The combination of beartype + jaxtyping has helped me destroy so many silly bugs I used to have, your efforts on this library are very much appreciated!

leycec commented 2 weeks ago

Awww! Thanks so much for those super-kind words. You're wonderful. I'm delighted to be of service in your eternal quest to destroy all the bugs. I too share this impossible dream. :smile_cat:

ML and data science is sure rough stuff, huh? @beartype and jaxtyping are here to smooth out the bumps on your life's journey. You will give AI a pair of eyeballs. I know it! You got this.