sillygod / cdp-cache

a caddy 2 proxy cache plugin
MIT License

simultaneous requests #7

Open Phillipip opened 4 years ago

Phillipip commented 4 years ago

Many thanks for the great development! We would like to use Caddy as a cache server for HLS streaming. We cache the playlist with a TTL of only one second. However, when many users open the stream at the same time (100 users and more), the requests are no longer reliably served from the cache and many of them end up on the origin server. Is it possible to prevent simultaneous parallel requests (for the same file) from reaching the origin server?

Kind regards, Paul

sillygod commented 4 years ago

Hi Phillipip, I am glad you like it. BTW, could you please provide an example to help me debug this issue? It would also help to have some more context, such as the request headers.

I've tested this project with normal requests (not streaming) and it works without the issue you describe. I guess this only happens with streaming requests. Maybe I need to add some extra logic to handle them. :)

schlypel commented 4 years ago

Hi, I'm a friend of @Phillipip and will try to document the HLS problem tomorrow. I have been able to gather a few more clues. It seems that too many near-simultaneous requests are being forwarded to the origin in parallel. It also occurs more often when the CPU of the host running Caddy is under heavier load. Thanks for this plugin :)

sillygod commented 4 years ago

hi @schlypel ,

Thanks for helping! Currently, I still don't know how to reproduce this one.

schlypel commented 4 years ago

Can I contact you via your email address (the one given on your GitHub profile)?

sillygod commented 4 years ago

> Can I contact you via your email address (the one given on your GitHub profile)?

Ok, sure.

tystuyfzand commented 4 years ago

The issue is likely that it's pretty much forcing a race condition. If you send enough requests at the same time before a response is cached, they won't hit the cache and will instead go straight to the origin. So, say it takes 2 seconds for the origin to respond: until then, every request will probably be forwarded to the origin, because there is no cached response yet (an in-progress request obviously doesn't count as cached).

The best, and maybe only, way to fix this would be to take a lock on the request URI when a direct origin request is made, and release it once that request finishes so the waiting requests can go through, but that doesn't really fit the use case of this caching module.
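To illustrate the idea, here is a minimal sketch of that kind of per-URI coalescing using golang.org/x/sync/singleflight. This is not this module's code; the origin URL and the in-memory cache are placeholders, just an outline of the pattern:

package main

import (
	"io"
	"net/http"
	"sync"

	"golang.org/x/sync/singleflight"
)

var (
	group singleflight.Group    // deduplicates in-flight origin requests per key
	mu    sync.RWMutex          // guards the naive in-memory cache below
	cache = map[string][]byte{} // placeholder cache, keyed by request URI
)

func handle(w http.ResponseWriter, r *http.Request) {
	key := r.RequestURI

	// Fast path: serve a previously cached response.
	mu.RLock()
	body, ok := cache[key]
	mu.RUnlock()

	if !ok {
		// Slow path: all concurrent requests for the same key wait here,
		// but only one of them actually hits the origin.
		v, err, _ := group.Do(key, func() (interface{}, error) {
			resp, err := http.Get("http://origin.example.com" + r.URL.RequestURI()) // placeholder origin
			if err != nil {
				return nil, err
			}
			defer resp.Body.Close()
			b, err := io.ReadAll(resp.Body)
			if err != nil {
				return nil, err
			}
			mu.Lock()
			cache[key] = b
			mu.Unlock()
			return b, nil
		})
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		body = v.([]byte)
	}

	w.Write(body)
}

func main() {
	http.HandleFunc("/", handle)
	http.ListenAndServe(":8080", nil)
}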

sillygod commented 4 years ago

Hi @tystuyfzand ,

I don't think the situation you mentioned would happen. You can see it here: https://github.com/sillygod/cdp-cache/blob/master/handler.go#L283. A lock is already acquired on the request URI there.

Furthermore, I also wrote a simple example server for testing:

from fastapi import FastAPI
import asyncio

app = FastAPI()

# Slow endpoint: sleeps for five seconds before responding, so that many
# concurrent requests overlap while the cache entry is still being filled.
@app.get("/name")
async def tt():
    await asyncio.sleep(5)
    return "OK"
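For reference, this can be started with uvicorn, e.g. uvicorn main:app (assuming the snippet above is saved as main.py), and the cache server then proxies to it.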

I made the caddy cache server proxy this endpoint and it works well.

So it would be helpful if someone could provide an example to reproduce this bug.
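For example, even a small load generator like the rough sketch below would help. It fires many simultaneous requests at the cached endpoint (the URL and concurrency are placeholders), so we can compare how many of them reach the origin versus the cache:

package main

import (
	"fmt"
	"net/http"
	"sync"
)

func main() {
	// Placeholders: point this at the cached endpoint behind Caddy and
	// pick a concurrency similar to the reported load (100+ clients).
	const target = "http://localhost:2015/name"
	const concurrency = 200

	var wg sync.WaitGroup
	for i := 0; i < concurrency; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := http.Get(target)
			if err != nil {
				fmt.Println("request failed:", err)
				return
			}
			resp.Body.Close()
		}()
	}
	wg.Wait()

	// Compare the number of requests the origin (e.g. the FastAPI server
	// above) actually received with the number of cache hits.
}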

darkweak commented 3 years ago

Hi everyone, first of all thanks for your work on this plugin. To handle simultaneous requests and to support request coalescing, you can check my cache system, which already supports that, in this directory: https://github.com/Darkweak/Souin/tree/master/cache/coalescing