Closed rustatian closed 9 months ago
I've tested your example code using Apache and could download a megabyte of dashes without any problems. I'll have to set up a test bed using Nginx; this may take a few days. Either it is a bug in Nginx or a compatibility issue between Nginx and tokio-fastcgi.
Could you please test if setting fastcgi_buffering to off, or setting fastcgi_buffers to 128 8k, makes any difference?
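For reference, the two suggested settings would look like this inside the http or location block that proxies to the FastCGI backend (a fragment only, not a complete configuration; try one at a time):

```nginx
location / {
    fastcgi_pass fastcgi_backend;

    # Option 1: disable response buffering entirely.
    fastcgi_buffering off;

    # Option 2: keep buffering, but enlarge the buffers
    # (128 buffers of 8 KiB each = 1 MiB, enough for the test payload).
    fastcgi_buffers 128 8k;
}
```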
I've tried this option previously, but there are still a bunch of error responses:
Also, as you may see on the screenshot, I can get around 10 incomplete responses, then 1-3 completed ones, and so on. And if you check the data with Wireshark/Tshark or similar tools, you'll see the broken (or not properly delivered) response:
I didn't test on Apache, but I tried PHP-FPM + Nginx (I'm not a PHP dev, just wanted to compare), and with PHP-FPM there are zero errors. I even tried to load test it to see if the errors come later, but no, everything is fine there.
I also wondered whether the error appears only on my Mac, so I switched to my Arch laptop, brought up the same environment from scratch, and, ehh, still saw the errors.
Hello @rustatian, I had some time to spare today and have set up an Nginx instance. I've used the example with a payload size of 1,048,576 bytes. I've run 4 curl instances in a loop, each fetching these 1 MiByte of dashes in parallel. Nothing. No errors, all output is delivered correctly.
I've used this - very bare bones - configuration for Nginx:
worker_processes 2;

events {
    worker_connections 1024;
}

http {
    include mime.types;
    default_type application/octet-stream;

    upstream fastcgi_backend {
        server localhost:8080;
        keepalive 8;
    }

    sendfile on;
    keepalive_timeout 10;

    server {
        listen 80;
        server_name localhost;

        location / {
            fastcgi_pass fastcgi_backend;
            fastcgi_keep_conn on;
        }
    }
}
My tests are run with nginx version 1.24.0.
I'm using this modified example.rs in case you want to try it out yourself:
use tokio::net::TcpListener;
use tokio_fastcgi::{RequestResult, Requests};

#[tokio::main]
async fn main() {
    let addr = "127.0.0.1:8080";
    let listener = TcpListener::bind(addr).await.unwrap();

    loop {
        let connection = listener.accept().await;
        // Accept new connections.
        match connection {
            Err(err) => {
                println!("Establishing connection failed: {}", err);
                break;
            }
            Ok((mut stream, address)) => {
                println!("Connection from {}", address);

                // If the socket connection was established successfully, spawn a new
                // task to handle the requests that the webserver will send us.
                tokio::spawn(async move {
                    // Create a new requests handler. It will collect the requests from
                    // the server and supply a streaming interface.
                    let mut requests = Requests::from_split_socket(stream.split(), 10, 10);

                    // Loop over the requests via the next method and process them.
                    while let Ok(Some(request)) = requests.next().await {
                        if let Err(err) = request
                            .process(|request| async move {
                                println!("Request");
                                let mut stdout = request.get_stdout();
                                stdout.write(b"Content-Type: text/html\r\n").await.unwrap();
                                stdout.write(b"\r\n").await.unwrap();
                                let buf = vec![b'-'; 1024 * 1024];
                                stdout.write(&buf).await.unwrap();
                                RequestResult::Complete(0)
                            })
                            .await
                        {
                            // This error handler is called if the process call
                            // returns an error.
                            println!("Processing request failed: {}", err);
                        }
                    }
                });
            }
        }
    }
}
Could you please post your Nginx version and configuration so I can more closely reproduce your setup?
Thanks for spending time on this 👍
I tried exactly your configuration and still no luck. I also double-checked curl with curl <url> -o file.txt, and on every request I get a file with a different size and a curl error.
To be sure, I have everything compiled with the latest Rust version (1.75.0) and the latest nginx version (1.25.3).
But after one restart, a strange thing happened: all my requests completed without errors. After an nginx restart, the errors returned 😢
OK, I suspect that I found the issue: it's nginx on Mac... I re-tested on Arch and Ubuntu with the latest nginx version and... no errors. One more reason to wait for Asahi Linux 😃
The only strange thing is that FPM works without problems with nginx on Mac...
Hey @FlashSystems 👋
During these few days, I did some benchmarks/tests (amd64 platform). It's difficult to trace or explain, but I definitely see complete hangs during the tests. I use k6 with a standard configuration and different numbers of VUs (connected users).
For test purposes, I tried to reproduce that behavior with the Go stdlib fcgi package, no luck, unfortunately. I also tried FPM with 1 static process, still no luck, 100% processed successfully.
Hello @rustatian, thank you for testing further. One question: could you add a .flush().unwrap(); call on the stdout stream and test again?
Hello @rustatian, I was now able to reproduce the problem. It seems to be a race condition somewhere, because it only happens at high throughput and when much data is returned. I'll investigate.
I think I found the problem: under high load, writes no longer complete atomically. My code did not handle that case correctly, and the FastCGI protocol got desynchronized because fewer bytes were written than stated in the message header. All subsequent writes on that connection then failed.
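For context on why a short write desynchronizes the stream: every FastCGI record starts with an 8-byte header that states the payload length, so if fewer bytes than announced actually reach the socket, the reader's framing slips into the next record. A minimal sketch of the header layout per the FastCGI 1.0 specification (illustrative only, not code from this library):

```rust
/// Encode an 8-byte FastCGI record header (FastCGI 1.0 wire format).
fn fcgi_header(record_type: u8, request_id: u16, content_len: u16, padding: u8) -> [u8; 8] {
    [
        1,                        // protocol version
        record_type,              // e.g. 6 = FCGI_STDOUT
        (request_id >> 8) as u8,  // request id, big-endian
        (request_id & 0xff) as u8,
        (content_len >> 8) as u8, // content length, big-endian
        (content_len & 0xff) as u8,
        padding,                  // padding length
        0,                        // reserved
    ]
}

fn main() {
    // A record body is at most 65535 bytes, so a 1 MiB payload is split
    // into many records. The reader trusts content_len: if the writer
    // put fewer bytes on the wire, the reader consumes part of the next
    // record's header as body and the stream is desynchronized.
    let h = fcgi_header(6, 1, 0xffff, 0);
    assert_eq!(h, [1, 6, 0, 1, 0xff, 0xff, 0, 0]);
    println!("{:?}", h);
}
```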
Commit 1b3d0d2b fixes this behaviour and retries the partial write with the unwritten half of the buffer. In my tests I could handle 500 parallel connections downloading 2 MiByte of dashes without errors. Above that limit, other problems (like not enough file handles) started to appear, but the FastCGI server was rock solid ;)
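The fix described above boils down to the standard partial-write retry loop, the same contract that std::io::Write::write_all implements. A minimal, self-contained sketch (not the actual library code; ShortWriter is a made-up writer that simulates the short writes seen under load):

```rust
use std::io::{self, Write};

/// Simulates a socket under load: accepts at most `max_chunk` bytes per call.
struct ShortWriter {
    inner: Vec<u8>,
    max_chunk: usize,
}

impl Write for ShortWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = buf.len().min(self.max_chunk);
        self.inner.extend_from_slice(&buf[..n]);
        Ok(n) // may be less than buf.len(): a partial write
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Retry until the whole buffer is written, advancing past the bytes
/// already accepted -- the "unwritten half of the buffer" retry.
fn write_fully<W: Write>(w: &mut W, mut buf: &[u8]) -> io::Result<()> {
    while !buf.is_empty() {
        let n = w.write(buf)?;
        if n == 0 {
            return Err(io::ErrorKind::WriteZero.into());
        }
        buf = &buf[n..]; // drop the written prefix, retry the rest
    }
    Ok(())
}

fn main() {
    let mut w = ShortWriter { inner: Vec::new(), max_chunk: 7 };
    let payload = vec![b'-'; 1000];
    write_fully(&mut w, &payload).unwrap();
    assert_eq!(w.inner.len(), 1000); // every byte arrived despite short writes
    println!("wrote {} bytes", w.inner.len());
}
```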
@rustatian: Could you please retest with the HEAD of the master branch? I hope this solves your problem (even on Mac).
Cool, thanks @FlashSystems 👍 I'll retry my tests today and will let you know the results ⚡
@FlashSystems I did a quick test, no hangs or errors, everything is pretty good.
I still need to test our SAPI with the master branch, but I'm in the middle of a big refactoring; I need a couple of days to finish it and do a final test.
Hey @FlashSystems 👋 Can't reproduce this behavior anymore. Loaded my server like never before, everything looks good now 👍
Fixed with release 1.2.0.
Hey @FlashSystems 👋 Thanks for the library, good job 👍
I'm working on a SAPI PHP integration with FastCGI transport and found some rather strange behavior when writing payloads with a body larger than 512 KiB.
I'll show a simple example on how to reproduce this issue:
This needs to be pasted into the process callback (as in your simple example). But if you change 800000 to 524288 (512 KiB) or less, everything works without any problem.
In the nginx debug log, I see the following error:
I don't know, but I guess this might be because finish sometimes gets written before the actual payload chunks are written. What is super strange, though, is that everything works fine with payloads up to 512 KiB.