manticoresoftware / manticoresearch-php

Official PHP client for Manticore Search
MIT License
160 stars 32 forks source link

Invalid HTTP request in some cases on replaceDocuments #207

Closed donhardman closed 2 months ago

donhardman commented 3 months ago

As we were updating the docs with recalculated embeddings in the Manticore GitHub search demo, we encountered an issue where replaceDocuments was producing an "Invalid HTTP method" error. We should investigate this and try to understand the reason behind it.

donhardman commented 3 months ago

The issue stems from the logic when we accumulate headers continuously and reuse curl connections set up previously. This means that all headers from the old connection merge into the new one. If we made an application/json request before and then a bulk request with application/x-ndjson, it results in two header lines being sent. However, the daemon parses only the first line, leading to the "Invalid HTTP method" error.

        $headers = $connection->getHeaders();
        $headers[] = sprintf('Content-Type: %s', $request->getContentType());

To fix this, we need to ensure that on long-running scripts, we don't infinitely extend headers with the same type. We should consider using a key-value map instead.

donhardman commented 3 months ago

I have discovered that while dumping curl info with a curl raw request by adding

curl_setopt($conn, CURLINFO_HEADER_OUT, true);
...
var_dump(curl_getopt($conn));

I discovered that we send proper headers, while still getting a 400 error from the daemon, which is weird:

{
  "url": "http://manticore:9308/bulk",
  "content_type": "application/json; charset=UTF-8",
  "http_code": 400,
  "header_size": 241,
  "request_size": 150,
  "filetime": -1,
  "ssl_verify_result": 0,
  "redirect_count": 0,
  "total_time": 0.063445,
  "namelookup_time": 0.000052,
  "connect_time": 0.000052,
  "pretransfer_time": 0.000067,
  "size_upload": 729799,
  "size_download": 31,
  "speed_download": 492,
  "speed_upload": 11584111,
  "download_content_length": 31,
  "upload_content_length": 729799,
  "starttransfer_time": 0.000068,
  "redirect_time": 0,
  "redirect_url": "",
  "primary_ip": "192.168.128.2",
  "certinfo": [],
  "primary_port": 9308,
  "local_ip": "",
  "local_port": -1,
  "http_version": 2,
  "protocol": 1,
  "ssl_verifyresult": 0,
  "scheme": "HTTP",
  "appconnect_time_us": 0,
  "connect_time_us": 52,
  "namelookup_time_us": 52,
  "pretransfer_time_us": 67,
  "redirect_time_us": 0,
  "starttransfer_time_us": 68,
  "total_time_us": 63445,
  "request_header": "POST /bulk HTTP/1.1\r\nHost: manticore:9308\r\nAccept: */*\r\nAccept-Encoding: deflate, gzip\r\nContent-Type: application/x-ndjson\r\nContent-Length: 729799\r\n\r\n",
  "effective_method": "POST"
}

I tried to write a script that reproduces it with dumped queries into a file:

#!/usr/bin/env bash

endpoint="http://manticore:9308"
requests_prefix="/tmp/data.request."

# Function to send a request and print the result
send_request() {
    local url=${endpoint}/bulk
    curl --silent --show-error --write-out "%{http_code}" \
        --data-binary @"$1" -H 'Content-type: application/x-ndjson' \
        --output /dev/null "${url}"
}

concurrency=$(ls "$requests_prefix"* | wc -l | xargs)
# Loop indefinitely
while true; do
    # Array to store PIDs of curl processes
    pids=()

    # Check if we've reached the maximum number of parallel requests
    while [ ${#pids[@]} -ge ${concurrency} ]; do
        # Wait for any child process to exit
        wait -n
        # Remove exited child process from the array
        pids=("${pids[@]}")
    done

    # Send the request in detached mode and store the PID
    for f in "$requests_prefix"*; do
        send_request "${f}" &
        pids+=($!)
    done

    # Wait for all child processes to exit
    for pid in "${pids[@]}"; do
        wait "${pid}"
    done
done

But unfortunately, it does not fail at all.

donhardman commented 3 months ago

After multiple test runs, it seems like the following line helps stabilize the issue:

curl_setopt($conn, CURLOPT_FRESH_CONNECT, true);
donhardman commented 2 months ago

As we discussed, we should implement PHP code that will attempt to replicate it.

donhardman commented 2 months ago

When running the following script, in some cases, EVEN when the request data is valid, the daemon returns an invalid HTTP request error.

It happens during concurrency.

Sometimes, there is one more error that appears – CURL error occurred: Received HTTP/0.9 when not allowed.

No clue about the reason, but this is what we should investigate.

200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
200 - OK
400 - ERR: {"url":"http:\/\/manticore:9308\/bulk","content_type":"application\/json; charset=UTF-8","http_code":400,"header_size":241,"request_size":180,"filetime":-1,"ssl_verify_result":0,"redirect_count":0,"total_time":0.011459,"namelookup_time":9.6e-5,"connect_time":9.6e-5,"pretransfer_time":0.000195,"size_upload":66,"size_download":25,"speed_download":2272,"speed_upload":6000,"download_content_length":25,"upload_content_length":66,"starttransfer_time":0.01143,"redirect_time":0,"redirect_url":"","primary_ip":"192.168.128.3","certinfo":[],"primary_port":9308,"local_ip":"","local_port":-1,"http_version":2,"protocol":1,"ssl_verifyresult":0,"scheme":"HTTP","appconnect_time_us":0,"connect_time_us":96,"namelookup_time_us":96,"pretransfer_time_us":195,"redirect_time_us":0,"starttransfer_time_us":11430,"total_time_us":11459,"request_header":"POST \/bulk HTTP\/1.1\r\nHost: manticore:9308\r\nAccept: *\/*\r\nContent-Type: application\/x-ndjson\r\nContent-Length: 66\r\n\r\n","effective_method":"POST"}

The table is simple:

create table test(title string);
And create a `run` script with `chmod +x run` after and put the contents with the changed host to manticore: ```php #!/usr/bin/env php 0) { $errorText = curl_error($ch); echo "CURL error occured: $errorText\n"; curl_close($ch); $ch = null; } return $responseCode === 200 ? 0 : 1; } ```

Once we run it with ./run, we can see that sometimes errors happen.

Example of usage and results:

$ ./run | grep '^400'
400 - ERR: {"url":"http:\/\/manticore:9308\/bulk","content_type":"application\/json; charset=UTF-8","http_code":400,"header_size":241,"request_size":180,"filetime":-1,"ssl_verify_result":0,"redirect_count":0,"total_time":0.00083,"namelookup_time":8.0e-5,"connect_time":8.0e-5,"pretransfer_time":0.000121,"size_upload":66,"size_download":25,"speed_download":25000,"speed_upload":66000,"download_content_length":25,"upload_content_length":66,"starttransfer_time":0.000804,"redirect_time":0,"redirect_url":"","primary_ip":"192.168.128.3","certinfo":[],"primary_port":9308,"local_ip":"","local_port":-1,"http_version":2,"protocol":1,"ssl_verifyresult":0,"scheme":"HTTP","appconnect_time_us":0,"connect_time_us":80,"namelookup_time_us":80,"pretransfer_time_us":121,"redirect_time_us":0,"starttransfer_time_us":804,"total_time_us":830,"request_header":"POST \/bulk HTTP\/1.1\r\nHost: manticore:9308\r\nAccept: *\/*\r\nContent-Type: application\/x-ndjson\r\nContent-Length: 66\r\n\r\n","effective_method":"POST"}
tomatolog commented 2 months ago

seems error in the script if I remove static $ch; and use local variable scope - all works fine

$ch = curl_init();

I not sure how static $ch; works with fork but from my point it has issue there as with local ch issue gone

tomatolog commented 2 months ago

found the issue report about the curl and php pcntl_fork https://github.com/php/php-src/issues/10633 that seems similar to this case. The discussion users think that curl shares state during the fork and this way affects the requests and responses

donhardman commented 2 months ago

I think it's a good idea to get rid of the static cURL instance and instead keep it per object instance:

https://github.com/manticoresoftware/manticoresearch-php/blob/master/src/Manticoresearch/Transport/Http.php#L27

Nick-S-2018 commented 2 months ago

@donhardman Do we really need to get rid of that? I mean those persistent connections were declared as one of the client's 'features'. If we don't want them to be used for a specific application, we can just disable them through the client's config.

donhardman commented 2 months ago

We should keep it, but in a proper way – use a non-static variable that sticks to the class. That way, when we don't create multiple instances of the Client, it will still work as expected. However, when we do create instances, we're probably doing it on purpose, so we expect not to keep the original state from some random instance used before and not in the current namespace (like a fork).

donhardman commented 2 months ago

The case here – the code with forking is implemented correctly.

I did not use a client created from the parent process, and instead created a new Client instance inside the forked process. However, since we have a static variable, we are using it from the parent process, which causes side effects.

To resolve this issue properly, we should keep the same approach but use an instance variable instead of a static variable.

sanikolaev commented 2 months ago

The related PR is https://github.com/manticoresoftware/manticoresearch-php/pull/209/files

@donhardman pls review

donhardman commented 2 months ago

Looks good! I've double-checked everything, and it all looks great. We can close this issue once the pull request is merged. 👍

Nick-S-2018 commented 2 months ago

Done in https://github.com/manticoresoftware/manticoresearch-php/pull/209