gradio-app / gradio

Build and share delightful machine learning apps, all in Python.
http://www.gradio.app
Apache License 2.0

API calls do not stack with the website queue. #2452

Closed xiaol closed 2 years ago

xiaol commented 2 years ago

Describe the bug

I have a website with about 5k DAU, and API calls come in from another service through the API docs. I can see from the console that these API calls from the other service jump the queue and are processed immediately.

Is this the intended behavior or a bug? How can I fix this?

Is there an existing issue for this?

Reproduction

Call the API while there is a queue on the website: https://ai-creator.net/aiart

I am hosting the Stable Diffusion web UI; I am not sure how you can debug it online.

Screenshot

No response

Logs

...

(screenshot: console showing two jobs being processed)

You can see there are two jobs, even though I have set concurrency_count=1 in my code.

System Info

>>> import gradio as gr
>>> gr.__version__
'3.4b3'
>>>

Severity

serious, but I can work around it

abidlabs commented 2 years ago

Hi @xiaol, can you please provide a link to your website and/or the code for your demo so that we can try to reproduce this issue?

And one question -- have you enabled the queue for your demo? This should not happen if the queue is enabled.

xiaol commented 2 years ago

Sorry for the confusing information. The website is here: https://ai-creator.net/aiart

and this is the launch code:

    def run(self):
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        gradio_params = {
            'inbrowser': opt.inbrowser,
            'server_name': '0.0.0.0',
            'server_port': opt.port,
            'share': opt.share,
            'show_error': True,
            'debug': True
        }
        if not opt.share:
            # Enable the queue: at most one job at a time
            self.demo.queue(concurrency_count=1, client_position_to_load_data=30)
        if opt.share and opt.share_password:
            gradio_params['auth'] = ('webui', opt.share_password)
        # gradio_params['auth'] = same_auth

        # Retry until the port (opt.port, default 7860) is free
        port_status = 1
        while port_status != 0:
            try:
                self.demo.launch(**gradio_params)
            except OSError:
                print(f'Error: Port {opt.port} is not open yet. Please wait, this may take upwards of 60 seconds...')
                time.sleep(10)
            else:
                port_status = 0

    def stop(self):
        self.demo.close()  # this tends to hang

def launch_server():
    server_thread = ServerLauncher(demo)
    server_thread.start()

    try:
        while server_thread.is_alive():
            time.sleep(60)
    except (KeyboardInterrupt, OSError) as e:
        crash(e, 'Shutting down...')

This is the txt2img button code:

use_queue = True

txt2img_btn.click(
    txt2img_func,
    txt2img_inputs,
    txt2img_outputs,
    api_name='txt2img',
    queue=use_queue
)
abidlabs commented 2 years ago

Thanks @xiaol, one more clarification -- when you say "api call from another service through api doc", do you mean they make a POST request to the /api/predict endpoint? Would you like to disable this behavior, or would you like to preserve it but ensure those API users join the queue?

xiaol commented 2 years ago

Yes, they make a POST request to the /api/predict endpoint, and this slows down the online users' queue; it effectively gives API requests a privilege. For some APIs that would be fine, but others should queue together with the website users, since anyone can make an API request. Of course, it also makes the online service vulnerable, but I haven't figured out how to deal with that yet.
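For concreteness, such a direct call looks roughly like this. This is a sketch: /api/predict and the flat "data" payload follow the Gradio 3.x convention, but the URL and prompt value are illustrative, not taken from this thread.

```python
import json

# Sketch of the request body a direct API caller would POST to a Gradio 3.x
# /api/predict endpoint. Inputs go in a flat "data" list, in the same order as
# the event's input components. The prompt string here is illustrative.
payload = {"data": ["a painting of a sunset over the sea"]}
body = json.dumps(payload)

# A caller outside the browser UI would then send it with any HTTP client, e.g.:
# requests.post("https://ai-creator.net/aiart/api/predict",
#               data=body, headers={"Content-Type": "application/json"})
print(body)
```

Because this request goes straight to the HTTP route rather than through the browser's queue, it is processed immediately, which matches the behavior described above.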

Because I want to charge for and limit the API, maybe giving those requests a privilege is fine for now.

Ultimately, though, I want to control this behavior so that I can give certain APIs or users the privilege to jump the queue, if I monetize this product.

And from what I have observed, users always join the queue, except that using a phone browser and switching the browser to the background gets your queue stuck; maybe I should open another issue about that problem.

Thank you for your patience with my writing.

freddyaboulton commented 2 years ago

Hi @xiaol !

Yes, it's true that the backend API is open when the queue is enabled, allowing users who call the REST API directly to skip ahead of everyone in the queue.

One thing we could implement in Gradio is to block all requests to the /api/ endpoint by default if the queue for that particular route is enabled. We could add an 'open_routes' parameter to the queue method, so that 'queue(open_routes=True)' means the route is not blocked when the queue is enabled (the current behavior).

That should let you control whether or not API users can skip the queue.

I'm not sure the "charging for API usage" feature you mention should live in Gradio, but you can implement your own rate-limiting functionality as FastAPI middleware: https://fastapi.tiangolo.com/tutorial/middleware/
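As a sketch of that suggestion: a small token-bucket limiter in plain Python, which could be wired into a FastAPI middleware. The class, the rates, and the commented-out middleware wiring are illustrative assumptions, not an existing Gradio or FastAPI API.

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter (illustrative sketch, not part of Gradio)."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate              # tokens refilled per second
        self.capacity = capacity      # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, then spend one token if available.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Hypothetical FastAPI wiring (names assumed; see the middleware tutorial linked above):
#
# bucket_per_ip = {}
#
# @app.middleware("http")
# async def rate_limit(request, call_next):
#     if request.url.path.startswith("/api/"):
#         bucket = bucket_per_ip.setdefault(request.client.host,
#                                           TokenBucket(rate=0.5, capacity=5))
#         if not bucket.allow():
#             return JSONResponse({"error": "rate limited"}, status_code=429)
#     return await call_next(request)

bucket = TokenBucket(rate=0.1, capacity=2)
print([bucket.allow(), bucket.allow(), bucket.allow()])  # -> [True, True, False]
```

With a slow refill rate, the third back-to-back call is rejected once the two-token burst is spent; a middleware returning 429 on rejection keeps /api/ callers from monopolizing the queue.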

xiaol commented 2 years ago

I see, a switch for the API is fine. Middleware is a good practice; great to know, I'll work with that.