gquintard / vmod-reqwest

BSD 3-Clause "New" or "Revised" License
9 stars 7 forks source link

'tokio-runtime-worker' panicked at src/lib.rs:804:9 #10

Closed gecon closed 4 months ago

gecon commented 4 months ago

Getting several errors in syslog as shown below:

 varnishd[3997540]: Child (3997553) said thread 'tokio-runtime-worker' panicked at src/lib.rs:804:9:
 varnishd[3997540]: Child (3997553) said called `Result::unwrap()` on an `Err` value: SendError { .. }

This is on Varnish 7.3.0 using vmod-reqwest 0.0.9.

Are those expected under any circumstances?

Thank you.

gquintard commented 4 months ago

Hi, than kyou for reporting this!

that's not expected, it looks like the channel got broken, which could be because the task timed out and disappeared, maybe?

A few questions:

cheers,

gecon commented 4 months ago

Thank you so much for your fast response.

Does it happen often?

For example, it happened about 24 times in the last 60 minutes.

# Last 60 minutes:
Jul  1 21:05:20 
Jul  1 21:06:54 
Jul  1 21:09:22 
Jul  1 21:11:54 
Jul  1 21:13:19 
Jul  1 21:14:53 
Jul  1 21:15:54 
Jul  1 21:22:24 
Jul  1 21:23:21 
Jul  1 21:28:33 
Jul  1 21:30:24 
Jul  1 21:30:44 
Jul  1 21:31:12 
Jul  1 21:32:34 
Jul  1 21:33:04 
Jul  1 21:38:14 
Jul  1 21:39:18 
Jul  1 21:39:40 
Jul  1 21:41:08 
Jul  1 21:42:06 
Jul  1 21:42:19 
Jul  1 21:43:03 
Jul  1 21:46:10 
Jul  1 21:56:58 

is there any chance the crashes are related to a VCL reload or other admin tasks?

No.

can you share the VCL you are using?

Unfortunately no, only parts of it.

what kind of backend is vmod_reqwest talking too?

Well, I am sending some PURGE requests to the same (self) Varnish instance.

Here is some VCL code.


sub vcl_init {
    new client = reqwest.client();
}

sub vcl_backend_response {
...
   if (beresp.http.SOMETHING) {
        client.init("async", "http://127.0.0.1:80", "PURGE"); # http://127.0.0.1:80 is self (the same Varnish instance)
        client.set_header(name="async",key="SOMETHING", value=beresp.http.SOMETHING);
        client.send("async");
   }
...

Thank you.
gquintard commented 4 months ago

thanks, I'll try to reproduce that. It could be that the backend task finishes faster than the purge, so the core processor doesn't have anyone to send the response.

could you add that line after the client.send() please?

std.log(client.status("async"));

essentially we'll force the VCL to way for the response. It'll make the processing a tiny bit slower (a few milliseconds, I expect), but if it solves the issue, that's a clear indicator of where the problem lies

gecon commented 4 months ago

Indeed adding the client status logging fixed the issue. No such errors given after applying this. Will keep watching this for more hours.

I guess that using a "sync" fetch instead of "async", would have also fixed that, but would be a bit slower, right?

Thank you for your help.

gquintard commented 4 months ago

thanks, that is very useful

I guess that using a "sync" fetch instead of "async", would have also fixed that, but would be a bit slower, right?

this is actually exactly what the change you made did. "async" is just a name.

I'll try to open a PR this week

gquintard commented 4 months ago

@gecon , I just pushed https://github.com/gquintard/vmod-reqwest/commit/c0e3d35a11d85361680ef67bd7c6e848b1c285ad which according to my testing fixes the issue, would you mind giving it a try?

gquintard commented 4 months ago

actually, grab the latest commit from main, I caught a issue if varnish sends us empty buffers and fixed it

gecon commented 4 months ago

I use Varnish 7.3 so I applied your commits on Jul 3, 2024 on your v0.0.9 code and build the .so again. I removed the std.log(client.status("async")); line from the VCL..

The fix seems to work. There are no such errors afterwards.

Thank you very much for your prompt response and actions!

gquintard commented 4 months ago

excellent, thank you @gecon for the confirmation and opening the issue too)!