php / php-src

The PHP Interpreter
https://www.php.net
Other
38.22k stars 7.75k forks source link

curl send request twice #12450

Closed dmitryvakulenko closed 9 months ago

dmitryvakulenko commented 1 year ago

Description

When curl send two requests and second one closing by timeout, curl reconnect and send it one more times. This happend only if CURLOPT_CUSTOMREQUEST was used.

Client code:

<?php

declare(strict_types=1);

$ch = curl_init();

curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");

curl_setopt($ch, CURLOPT_URL, "http://localhost");
curl_exec($ch);

curl_setopt($ch, CURLOPT_URL, "http://localhost");
curl_exec($ch);

curl_close($ch);

Server code written on Go 1.21:

package main

import (
    "log"
    "net/http"
    "time"
)

type Handler struct {
    amount int
}

func (p *Handler) ServeHTTP(respw http.ResponseWriter, req *http.Request) {
    log.Printf("Got %s request %v from %s\n", req.Method, req.URL.String(), req.RemoteAddr)

    p.amount += 1

    if p.amount == 2 {
        time.Sleep(time.Second * 5)
    }

    return
}

func main() {
    h := Handler{}
    srv := http.Server{
        Addr:         ":80",
        Handler:      &h,
        ReadTimeout:  2 * time.Second,
        WriteTimeout: 2 * time.Second,
    }

    srv.ListenAndServe()
}

Resulted in this output:

2023/10/16 14:04:19 Got POST request / from 172.18.0.12:35374
2023/10/16 14:04:19 Got POST request / from 172.18.0.12:35374
2023/10/16 14:04:24 Got POST request / from 172.18.0.12:47144

But I expected this output instead:

2023/10/16 14:04:19 Got POST request / from 172.18.0.12:35374
2023/10/16 14:04:19 Got POST request / from 172.18.0.12:35374

PHP Version

php 7.4-8.1

Operating System

Ubuntu 20.04, Alpine 3.11.3

SakiTakamachi commented 1 year ago

@dmitryvakulenko I tried it with php8.1.22, but it does not reproduce but reproduced in 7.4. Is it reproduced even using the latest 8.1?

dmitryvakulenko commented 1 year ago

@SakiTakamachi, yes. I reproduced it in 8.1.19.

SakiTakamachi commented 1 year ago

What's your version of curl?

curl -V
SakiTakamachi commented 1 year ago

The php side does not do detailed things such as retry control, so this may be a libcurl issue.

I wrote that the problem was reproduced in 7.4, but it was possible that the curl version was simply different.

dmitryvakulenko commented 1 year ago

In php 7.4 container: curl 7.67.0 (x86_64-alpine-linux-musl) libcurl/7.79.1 OpenSSL/1.1.1l zlib/1.2.11 nghttp2/1.40.0

In php 8.1: curl 8.1.1 (x86_64-alpine-linux-musl) libcurl/8.1.1 OpenSSL/3.1.0 zlib/1.2.13 brotli/1.0.9 libidn2/2.3.4 nghttp2/1.53.0

SakiTakamachi commented 1 year ago

Hmm. Considering the version of my environment, it may not simply be a curl version issue.

curl 7.81.0 (x86_64-pc-linux-gnu) libcurl/7.81.0

I'd like to take a closer look

dmitryvakulenko commented 1 year ago

Thank you!

SakiTakamachi commented 1 year ago

No matter what I try, I can't get it to work in the local environment.

The only thing I was able to reproduce was a test on a cheap rental server. The curl it has is a very old version.

If possible, can you update alpine's curl and curl-dev versions to the latest? Updating may resolve the issue. No matter how much I read the php source code, it doesn't seem like php is involved in the connection retry behavior.

dmitryvakulenko commented 1 year ago

@SakiTakamachi, sorry for delay.

I updated my container's curl to version 7.79.1, but the bug still reproduce. Also I try to repeat this in my main OS Linux Mint 21.2 Victoria that contain libcurl version 7.81.0-1ubuntu1.14. The bug are still reproduced.

SakiTakamachi commented 11 months ago

Sorry, I am late.

No matter what I do, I can't reproduce it in my environment.... Is it possible to write and test code similar to php in Go?

dmitryvakulenko commented 11 months ago

Ok, I'll try.

SakiTakamachi commented 11 months ago

Thank you! If you get the same result in Go, you might think that this is not a PHP-specific problem. (Personally, I don't think this problem is caused by PHP, since retries are not controlled on the PHP side in the first place.)

dmitryvakulenko commented 10 months ago

Finally tried to reproduce the bug in Go with this code

package main

import (
    "fmt"
    "github.com/andelf/go-curl"
)

func main() {
    easy := curl.EasyInit()
    defer easy.Cleanup()

    easy.Setopt(curl.OPT_HEADER, false)
    easy.Setopt(curl.OPT_CUSTOMREQUEST, "POST")
    easy.Setopt(curl.OPT_PORT, 8080)
    easy.Setopt(curl.OPT_URL, "http://localhost")
    if err := easy.Perform(); err != nil {
        fmt.Printf("ERROR: %v\n", err)
    }
}

using bingings github.com/andelf/go-curl.

And - nothing. I got a single request on server side as expected.

SakiTakamachi commented 10 months ago

@dmitryvakulenko Thank you for verifying! Okay, I'll check that again.

divinity76 commented 9 months ago

wow! i can actually reproduce this! the following dockerfile reproduce the issue, doing

sudo docker build .

in a folder with the Dockerfile

FROM ubuntu:20.04
ENV DEBIAN_FRONTEND=noninteractive 
RUN apt-get update -y
RUN apt-get -y full-upgrade;
RUN apt-get -y install apt-utils
RUN apt-get -y install php-cli php-curl golang curl
RUN echo '<?php\n\
\n\
declare(strict_types=1);\n\
\n\
$ch = curl_init();\n\
\n\
curl_setopt($ch, CURLOPT_HEADER, false);\n\
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");\n\
\n\
curl_setopt($ch, CURLOPT_URL, "http://localhost");\n\
curl_exec($ch);\n\
\n\
curl_setopt($ch, CURLOPT_URL, "http://localhost");\n\
curl_exec($ch);\n\
\n\
curl_close($ch);\n\
' > /tmp/reproduce.php;
RUN echo 'package main\n\
\n\
import (\n\
    "log"\n\
    "net/http"\n\
    "time"\n\
)\n\
\n\
type Handler struct {\n\
    amount int\n\
}\n\
\n\
func (p *Handler) ServeHTTP(respw http.ResponseWriter, req *http.Request) {\n\
    log.Printf("Got %s request %v from %s\\n", req.Method, req.URL.String(), req.RemoteAddr)\n\
\n\
    p.amount += 1\n\
\n\
    if p.amount == 2 {\n\
        time.Sleep(time.Second * 5)\n\
    }\n\
\n\
    return\n\
}\n\
\n\
func main() {\n\
    h := Handler{}\n\
    srv := http.Server{\n\
        Addr:         ":80",\n\
        Handler:      &h,\n\
        ReadTimeout:  2 * time.Second,\n\
        WriteTimeout: 2 * time.Second,\n\
    }\n\
\n\
    srv.ListenAndServe()\n\
}\n\
' > /tmp/main.go;
RUN bash -c '\
php -v; \
php -i | grep -i curl ; \
go run /tmp/main.go & sleep 1; \
php /tmp/reproduce.php;\
false; \
';
# the last false; prevents caching of the last RUN.

ends with:

Step 9/9 : RUN bash -c 'php -v; php -i | grep -i curl ; go run /tmp/main.go & sleep 1; php /tmp/reproduce.php;false; ';
 ---> Running in 7f3500777ad6
PHP 7.4.3-4ubuntu2.19 (cli) (built: Jun 27 2023 15:49:59) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
    with Zend OPcache v7.4.3-4ubuntu2.19, Copyright (c), by Zend Technologies
/etc/php/7.4/cli/conf.d/20-curl.ini,
curl
cURL support => enabled
cURL Information => 7.68.0
curl.cainfo => no value => no value
2024/02/08 12:37:02 Got POST request / from 127.0.0.1:33400
2024/02/08 12:37:02 Got POST request / from 127.0.0.1:33400
2024/02/08 12:37:07 Got POST request / from 127.0.0.1:50024
The command '/bin/sh -c bash -c 'php -v; php -i | grep -i curl ; go run /tmp/main.go & sleep 1; php /tmp/reproduce.php;false; ';' returned a non-zero code: 1

@SakiTakamachi try reproducing it with the dockerfile?

divinity76 commented 9 months ago

holy shit! i can even reproduce it on current git master! (php 8.4.0-dev commit cd66fcc68be96408423e49ae01de5719c02c86d1 ) changing the Dockerfile to

FROM ubuntu:20.04
ENV DEBIAN_FRONTEND=noninteractive 
RUN apt-get update -y
RUN apt-get -y full-upgrade;
RUN apt-get -y install apt-utils
RUN apt-get -y install php-cli php-curl golang curl build-essential git libcurl4-openssl-dev autoconf bison re2c
RUN bash -c '\
git clone '\''https://github.com/php/php-src.git'\'' /tmp/php-src --depth 1; \
cd /tmp/php-src; \
./buildconf --force; \
./configure --prefix=/usr --disable-all --disable-cgi --enable-cli --with-curl; \
make -j$(nproc); \
make install; \
php -v; \
'
RUN echo '<?php\n\
\n\
declare(strict_types=1);\n\
\n\
$ch = curl_init();\n\
\n\
curl_setopt($ch, CURLOPT_HEADER, false);\n\
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");\n\
\n\
curl_setopt($ch, CURLOPT_URL, "http://localhost");\n\
curl_exec($ch);\n\
\n\
curl_setopt($ch, CURLOPT_URL, "http://localhost");\n\
curl_exec($ch);\n\
\n\
curl_close($ch);\n\
' > /tmp/reproduce.php;
RUN echo 'package main\n\
\n\
import (\n\
    "log"\n\
    "net/http"\n\
    "time"\n\
)\n\
\n\
type Handler struct {\n\
    amount int\n\
}\n\
\n\
func (p *Handler) ServeHTTP(respw http.ResponseWriter, req *http.Request) {\n\
    log.Printf("Got %s request %v from %s\\n", req.Method, req.URL.String(), req.RemoteAddr)\n\
\n\
    p.amount += 1\n\
\n\
    if p.amount == 2 {\n\
        time.Sleep(time.Second * 5)\n\
    }\n\
\n\
    return\n\
}\n\
\n\
func main() {\n\
    h := Handler{}\n\
    srv := http.Server{\n\
        Addr:         ":80",\n\
        Handler:      &h,\n\
        ReadTimeout:  2 * time.Second,\n\
        WriteTimeout: 2 * time.Second,\n\
    }\n\
\n\
    srv.ListenAndServe()\n\
}\n\
' > /tmp/main.go;
RUN bash -c '\
php -v; \
php -i | grep -i curl ; \
go run /tmp/main.go & sleep 1; \
php /tmp/reproduce.php;\
false; \
';
# the last false; prevents caching of the last RUN.

changes the last output to

PHP 8.4.0-dev (cli) (built: Feb  8 2024 12:48:47) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.4.0-dev, Copyright (c) Zend Technologies
Configure Command =>  './configure'  '--prefix=/usr' '--disable-all' '--disable-cgi' '--enable-cli' '--with-curl'
curl
cURL support => enabled
cURL Information => 7.68.0
curl.cainfo => no value => no value
2024/02/08 12:49:07 Got POST request / from 127.0.0.1:38060
2024/02/08 12:49:07 Got POST request / from 127.0.0.1:38060
2024/02/08 12:49:12 Got POST request / from 127.0.0.1:38066
The command '/bin/sh -c bash -c 'php -v; php -i | grep -i curl ; go run /tmp/main.go & sleep 1; php /tmp/reproduce.php;false; ';' returned a non-zero code: 1

this issue has existed since at least PHP7.4.3 to current master!

divinity76 commented 9 months ago

it's not a libcurl-version issue either, still happens when building against the newest libcurl version 8.6.0, released 8 days ago (2024-01-31):

FROM ubuntu:20.04
ENV DEBIAN_FRONTEND=noninteractive 
RUN apt-get update -y
RUN apt-get -y full-upgrade;
RUN apt-get -y install apt-utils
RUN apt-get -y install golang build-essential git autoconf bison re2c make cmake automake libtool libpsl-dev libpsl5
RUN bash -c 'set -e;\
git clone -b curl-8_6_0 --single-branch --depth 1 https://github.com/curl/curl.git; \
cd curl; \
autoreconf -fi; \
echo ./buildconf; \
./configure --without-ssl --without-libpsl ; \
make -j$(nproc); \
make install\
'
RUN bash -c 'set -e;\
git clone '\''https://github.com/php/php-src.git'\'' /tmp/php-src --single-branch --depth 1; \
cd /tmp/php-src; \
./buildconf --force; \
./configure --disable-all --disable-cgi --enable-cli --with-curl;\
make -j$(nproc); \
make install; \
php -v; \
'
RUN echo '<?php\n\
\n\
declare(strict_types=1);\n\
\n\
$ch = curl_init();\n\
\n\
curl_setopt($ch, CURLOPT_HEADER, false);\n\
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");\n\
\n\
curl_setopt($ch, CURLOPT_URL, "http://localhost");\n\
curl_exec($ch);\n\
\n\
curl_setopt($ch, CURLOPT_URL, "http://localhost");\n\
curl_exec($ch);\n\
\n\
curl_close($ch);\n\
' > /tmp/reproduce.php;
RUN echo 'package main\n\
\n\
import (\n\
    "log"\n\
    "net/http"\n\
    "time"\n\
)\n\
\n\
type Handler struct {\n\
    amount int\n\
}\n\
\n\
func (p *Handler) ServeHTTP(respw http.ResponseWriter, req *http.Request) {\n\
    log.Printf("Got %s request %v from %s\\n", req.Method, req.URL.String(), req.RemoteAddr)\n\
\n\
    p.amount += 1\n\
\n\
    if p.amount == 2 {\n\
        time.Sleep(time.Second * 5)\n\
    }\n\
\n\
    return\n\
}\n\
\n\
func main() {\n\
    h := Handler{}\n\
    srv := http.Server{\n\
        Addr:         ":80",\n\
        Handler:      &h,\n\
        ReadTimeout:  2 * time.Second,\n\
        WriteTimeout: 2 * time.Second,\n\
    }\n\
\n\
    srv.ListenAndServe()\n\
}\n\
' > /tmp/main.go;
RUN bash -c 'set -e;\
php -v; \
php -i | grep -i curl ; \
go run /tmp/main.go & sleep 1; \
php /tmp/reproduce.php;\
false; \
';
# the last false; prevents caching of the last RUN.

results in:

PHP 8.4.0-dev (cli) (built: Feb  8 2024 14:19:44) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.4.0-dev, Copyright (c) Zend Technologies
Configure Command =>  './configure'  '--disable-all' '--disable-cgi' '--enable-cli' '--with-curl'
curl
cURL support => enabled
cURL Information => 8.6.0-DEV
curl.cainfo => no value => no value
2024/02/08 14:20:04 Got POST request / from 127.0.0.1:60390
2024/02/08 14:20:04 Got POST request / from 127.0.0.1:60390
2024/02/08 14:20:09 Got POST request / from 127.0.0.1:60392
ranvis commented 9 months ago

@dmitryvakulenko The client code in Go you have provided in https://github.com/php/php-src/issues/12450#issuecomment-1880794687 is not equivalent to PHP one. It only tries the request once unlike the PHP client code doing twice. Would you try with the code that performs the same operation?

And, I think the third request in question is rather expected. The second request is sent through the connection curl had opened on the first request, then the server closes the connection without responding anything but FIN because of the "too short" WriteTimeout.

I observed the third request even with the proper POST request which is obviously an unsafe method, on an old curl. But this should be a matter of curl's implementation, and I'm not sure if the behavior actually violates RFC 7230 6.3.1 considering the server side behavior.

This happend only if CURLOPT_CUSTOMREQUEST was used.

FWIW, I cannot reproduce (again, I'm using an old version, but it is not the matter of this comment). It also happens without the option on my env.

SakiTakamachi commented 9 months ago

@divinity76 Great! I was able to reproduce it too. Then I investigated further. To clarify whether it's a php issue, I created a minimal repro code in C.

curl.c:

#include <curl/curl.h>
#include <curl/easy.h>

int main(void) {
    CURL *curl = curl_easy_init();
    curl_easy_setopt(curl, CURLOPT_HEADER, 0);
    curl_easy_setopt(curl, CURLOPT_CUSTOMREQUEST, "POST");
    curl_easy_setopt(curl, CURLOPT_URL, "http://php.curl"); // my env
    curl_easy_perform(curl);

    curl_easy_setopt(curl, CURLOPT_URL, "http://php.curl"); // my env
    curl_easy_perform(curl);

    curl_easy_cleanup(curl);

    return 0;
}

compile:

gcc -Wall -o curl curl.c -lcurl

exec:

./curl

result:

# go run /mount/main.go
2024/02/12 02:03:23 Got POST request / from 172.19.0.2:53666
2024/02/12 02:03:23 Got POST request / from 172.19.0.2:53666
2024/02/12 02:03:28 Got POST request / from 172.19.0.2:56386

From these things, I found out that the phenomenon itself can be reproduced even if we are not using php. But I'm not sure if the cleanup is done properly.

SakiTakamachi commented 9 months ago

From reading the following description, it looks like curl_easy_perform() can be executed multiple times without any cleanup in between. Therefore, I think this is a problem on the curl side.

You can do any amount of calls to curl_easy_perform while using the same easy_handle. If you intend to transfer more than one file, you are even encouraged to do so. libcurl attempts to reuse existing connections for the following transfers, thus making the operations faster, less CPU intense and using less network resources. You probably want to use curl_easy_setopt between the invokes to set options for the following curl_easy_perform call.

https://curl.se/libcurl/c/curl_easy_perform.html

divinity76 commented 9 months ago

since it's a curl thing and not a php-thing, i suggest we close this issue. and @dmitryvakulenko can perhaps raise an issue at https://github.com/curl/curl/issues

SakiTakamachi commented 9 months ago

I've closed this because it's not a php issue.