phar-io / phive

The Phar Installation and Verification Environment (PHIVE)
https://phar.io
BSD 3-Clause "New" or "Revised" License
579 stars 43 forks source link

"No RateLimit present in response" on Windows? #374

Open mvriel opened 2 years ago

mvriel commented 2 years ago

I was trying to fix the build in phpDocumentor when I discovered that the problem was that we were repeatedly getting the "No Ratelimit present in response" error when using Phive in our Windows machines (see https://github.com/phpDocumentor/phpDocumentor/runs/7249978421 for an example).

At first I was like, it must be a ratelimit issue; but I cache the phive folders now and on Windows it gives this error. (Sidenote: for some reason the hash of the phive.xml is different on windows vs mac and linux; hence it doesn't use the cache created on linux and windows).

What could this issue be? We are using Phive 0.15.1

theseer commented 2 years ago

I'm not sure how this can be a windows only problem ;)

In general, the "No RateLimit present in response" is coming from an exception thrown when phive wants to ensure that, mainly when talking to Github, we are within our RateLimit and the response doesn't have a rate limit information set. See https://github.com/phar-io/phive/blob/master/src/shared/http/HttpResponse.php#L76.

HttpResponse is the result object from our CurlHttpClient, the response headers are parsed here: https://github.com/phar-io/phive/blob/master/src/shared/http/CurlHttpClient.php#L103. I cannot see why this would work differently on windows than on the other operating systems.

Regarding caching: A wild guess would be a difference in paths? Would be interesting to have a look, but I don't have much at right now...

mvriel commented 2 years ago

I wouldn't know either why it would be windows specific; but that is what the tests show :) The cache difference if probably a line ending issue. The hashing function hashes the file contents and windows uses, obviously, a different line ending (and 2 bytes instead of 1)

mvriel commented 2 years ago

I may have something; while working on the builds I noticed that the Windows instances of Github Action are incredibly slow. Would it be possible that Phive hits a network timeout, receives an incorrect / bad response and that is why we are seeing this?

theseer commented 2 years ago

Technically possible, but that should result in either a curl error or non-200 http response code. And we SHOULD cover that.

Am Freitag, 8. Juli 2022 schrieb Mike van Riel:

I may have something; while working on the builds I noticed that the Windows instances of Github Action are incredibly slow. Would it be possible that Phive hits a network timeout, receives an incorrect / bad response and that is why we are seeing this?

-- Reply to this email directly or view it on GitHub: https://github.com/phar-io/phive/issues/374#issuecomment-1179000304 You are receiving this because you commented.

Message ID: @.***

mvriel commented 2 years ago

A quick scan shows that at least on https://github.com/phar-io/phive/blob/dac3e6f4b895489cfb7e2be637a43de7e0e0e6b2/src/services/resolver/GithubAliasResolver.php#L98-L100, there is no verification whether the result succeeded and when I check https://github.com/phar-io/phive/blob/7f7f5fbf918ee84d67ce734d4f0bf3214f4b264a/src/shared/http/CurlHttpClient.php#L183-L185 I can see that any 4xx and 5xx response will also result in a valid response object without rate limit headers included.

I do not know whether this is the location that actually causes the issue described above; but the combination of these 2 locations will match the described behaviour if there is not another part of the code that will compensate for this

theseer commented 2 years ago

Good point. Let me debug this next week :)

ravage84 commented 11 months ago

Likely related to https://github.com/phar-io/phive/issues/228

Ran into this today again with a fresh installation of 0.15.2 on Windows 10.

JSabelli-FDX commented 3 weeks ago

Same here:

$ phive install phpmetrics/PhpMetrics
Phive 0.15.2 - Copyright (C) 2015-2024 by Arne Blankerts, Sebastian Heuer and Contributors
[ERROR]   No RateLimit present in response
jrfnl commented 3 weeks ago

Dito (Windows 10, local).

$ phive selfupdate
Phive 0.15.2 - Copyright (C) 2015-2024 by Arne Blankerts, Sebastian Heuer and Contributors
[ERROR]    No RateLimit present in response

Tested again with 0.15.3 after a manual update, same result.

@theseer Please let me know if I can help with debugging this on Windows.

Just to satisfy my own curiousity, I've ran it from source with some var_exports in select places and this was the result:

Phive 0.15.3-dirty - Copyright (C) 2015-2024 by Arne Blankerts, Sebastian Heuer and Contributors
$this->url in Curl::exec
'https://api.github.com/rate_limit'
$result in Curl::exec
'{"resources":{"core":{"limit":60,"remaining":60,"reset":1724369210,"used":0,"resource":"core"},"graphql":{"limit":0,"remaining":0,"reset":1724369210,"used":0,"resource":"graphql"},"integration_manifest":{"limit":5000,"remaining":5000,"reset":1724369210,"used":0,"resource":"integration_manifest"},"search":{"limit":10,"remaining":10,"reset":1724365670,"used":0,"resource":"search"}},"rate":{"limit":60,"remaining":60,"reset":1724369210,"used":0,"resource":"core"}}
'
$this->rateLimitHeaders in CurlHttpClient::parseRateLimitHeaders
array (
)
[ERROR]    No RateLimit present in response

Note: I also added debug statements in CurlHttpClient::handleHeaderInput() and Curl::addHttpHeaders(), but those never get called.

Let me know if there are other things you'd like me to check.

theseer commented 3 weeks ago

We do not actually look at the result body from the rate_limit API call but the headers.

It should look like this (trimmed for readability):

$ curl -v https://api.github.com/rate_limit
* Host api.github.com:443 was resolved.
[...]
> GET /rate_limit HTTP/2
> Host: api.github.com
> User-Agent: curl/8.6.0
> Accept: */*
> 
[...]
< HTTP/2 200 
[...]
< content-type: application/json; charset=utf-8
[...]
< x-ratelimit-limit: 60
< x-ratelimit-remaining: 59
< x-ratelimit-reset: 1724439730
< x-ratelimit-resource: core
< x-ratelimit-used: 1
[..]< 
{
  "resources": {
    "core": {
      "limit": 60,
      "remaining": 59,
      "reset": 1724439730,
      "used": 1,
      "resource": "core"
    },
[...]
}

The code responsible for parsing those headers is indeed in CurlHttpClient::handleHeaderInput. I can barely believe it not be called - that would make no sense as there are certainly header lines.

I don't know where you added the debug line you mentioned? I do a regex check on the header name to find the relevant X-RateLimit-* lines to add to my list. It's the only point in the code where it could go wrong ;) - even though I have no idea why or how.

Can you add debug output to the very top of the method to see what it receives?

theseer commented 3 weeks ago

For the record, on my linux box, i added a var_dump into line CurlHttpClient::handleHeaderInput after the explode in line 105: var_dump($parts);:

array(2) {
  [0]=>
  string(17) "x-ratelimit-limit"
  [1]=>
  string(5) " 5000"
}
array(2) {
  [0]=>
  string(21) "x-ratelimit-remaining"
  [1]=>
  string(5) " 4999"
}
array(2) {
  [0]=>
  string(17) "x-ratelimit-reset"
  [1]=>
  string(11) " 1724440954"
}
array(2) {
  [0]=>
  string(16) "x-ratelimit-used"
  [1]=>
  string(2) " 1"
}
array(2) {
  [0]=>
  string(20) "x-ratelimit-resource"
  [1]=>
  string(5) " core"
}

(That's of course only the extract of the x-ratelimit-*)

I cannot see how my regex '/^(X-)?RateLimit-(.*)$/i' would fail to match this.

jrfnl commented 3 weeks ago

@theseer In my debug code I had a var-dump()-like statement as the very first thing in CurlHttpClient::handleHeaderInput().

This was my debug patch:

 src/shared/http/Curl.php           | 4 +++-
 src/shared/http/CurlHttpClient.php | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/shared/http/Curl.php b/src/shared/http/Curl.php
index 77f3892..786676f 100644
--- a/src/shared/http/Curl.php
+++ b/src/shared/http/Curl.php
@@ -49,6 +49,7 @@ class Curl {
     }

     public function addHttpHeaders(array $headers): void {
+echo '$headers in Curl::addHttpHeaders', PHP_EOL, var_export($headers, true), PHP_EOL;
         $this->options[CURLOPT_HTTPHEADER] = $headers;
     }

@@ -81,6 +82,7 @@ class Curl {
      * @throws CurlException
      */
     public function exec(): string {
+echo '$this->url in Curl::exec', PHP_EOL, var_export($this->url, true), PHP_EOL;
         $ch = curl_init($this->url);
         assert($ch !== false);
         curl_setopt_array($ch, $this->options);
@@ -90,7 +92,7 @@ class Curl {
         if ($result === false) {
             throw new CurlException(curl_error($ch), curl_errno($ch));
         }
-
+echo '$result in Curl::exec', PHP_EOL, var_export($result, true), PHP_EOL;
         return (string)$result;
     }

diff --git a/src/shared/http/CurlHttpClient.php b/src/shared/http/CurlHttpClient.php
index 48358dd..aae65a7 100644
--- a/src/shared/http/CurlHttpClient.php
+++ b/src/shared/http/CurlHttpClient.php
@@ -100,6 +100,7 @@ class CurlHttpClient implements HttpClient {
      * @param resource $ch
      */
     public function handleHeaderInput($ch, string $line): int {
+echo '$line in CurlHttpClient::handleHeaderInput', PHP_EOL, var_export($line, true), PHP_EOL;
         $parts = explode(':', trim($line));

         if (!isset($parts[1])) {
@@ -190,6 +191,7 @@ class CurlHttpClient implements HttpClient {
     }

     private function parseRateLimitHeaders(): ?RateLimit {
+echo '$this->rateLimitHeaders in CurlHttpClient::parseRateLimitHeaders', PHP_EOL, var_export($this->rateLimitHeaders, true), PHP_EOL;
         $required = ['Limit', 'Remaining', 'Reset'];
         $existing = array_keys($this->rateLimitHeaders);
jrfnl commented 3 weeks ago

For the record, I tested with PHP 8.3.10 (in case it makes a difference).

Happy to try/test other things

theseer commented 3 weeks ago

@jrfnl That would imply that - on windows - registering a handler for Header lines is .. broken? I cannot believe that...

jrfnl commented 3 weeks ago

@jrfnl That would imply that - on windows - registering a handler for Header lines is .. broken? I cannot believe that...

@theseer Okay, I've done yet more testing and that's not the case for a plain curl PHP script.

$callbackCalled = 0;

function curl_header_callback($curl_handle, $data) {
    echo $data, PHP_EOL;
    ++$GLOBALS['callbackCalled'];
}

$ch = curl_init('https://api.github.com/rate_limit');
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HEADERFUNCTION, 'curl_header_callback');
$result = curl_exec($ch);
echo $result, PHP_EOL;
curl_close($ch);

if ($callbackCalled === 0) {
    echo 'ERROR: callback not called';
    exit(1);
} else {
    echo 'callback was called';
    exit(0);
}

... results in (a very slow):

$ php ./phive-374.php
HTTP/2 403

callback was called

I've also done some more testing using the code within Phive.

It looks like the headers don't get included in the $result from curl_exec(), even though they do get received (curl_getinfo($ch, CURLINFO_HEADER_SIZE) = 883). Possibly CURLOPT_NOBODY does not get set correctly ? Having said that, that still shouldn't make a difference as commenting that line out in the "plain" script still gets the callback called.

I've also tested with passing another URL in the Phive Curl::exec() method and am seeing the same - $result not including the headers.

I've tried adding $this->options[CURLOPT_HEADER] = true in Curl::setHeaderFunction(), but that didn't make a difference. Ref: https://www.php.net/manual/en/curl.constants.php#constant.curlopt-header

For arguments sake, I've set up a little GH Actions script to demo the issue and a test run confirms the issue can be reproduced on Windows in the GH Actions environment too.

I've also pushed the debug commits onto that same branch. The first debug commit containing the debug statements I added yesterday. The second commit contains additional things I've tried and added today. You can see the output of a run with the GHA demo script including these debug statements here: https://github.com/jrfnl/phive/actions/runs/10537359249/job/29198669327

I suspect it may be something to do with the combination of options passed to curl via curl_setopt_array(), but that's not that easy to debug what with those all being integer constants... I can see nine options being added in the Curl class, but 17 in total being added via the curl_setopt_array() call in Curl::exec().

Happy to try more things. Guidance appreciated.

JSabelli-FDX commented 3 weeks ago

After manually updating to the newest version, the No RateLimit present in response seems to still be returning when I try to do phive selfupdate . Could just be because I'm already using the latest version but other than that, install and update work again. Thank you!

theseer commented 3 weeks ago

@jrfnl So you're telling me there must be a combination of flags that - on windows only - breaks header handling? Any chance to verify that? Because it works reliably in Linux and apparently macos.

jrfnl commented 3 weeks ago

@jrfnl So you're telling me there must be a combination of flags that - on windows only - breaks header handling? Any chance to verify that? Because it works reliably in Linux and apparently macos.

@theseer Not what I'm "telling you", just what I suspect may be happening. You can see the output of the debug code in the GH Actions run for yourself.

theseer commented 2 weeks ago

Am Dienstag, 27. August 2024 schrieb Juliette:

@jrfnl So you're telling me there must be a combination of flags that - on windows only - breaks header handling? Any chance to verify that? Because it works reliably in Linux and apparently macos.

@theseer Not what I'm "telling you", just what I suspect may be happening. You can see the output of the debug code in the GH Actions run for yourself.

I got that, and it was not meant as a critique rather disbelieve despite seeing it.

As I'm currently quite unfit - still recovering from a nasty multiday fever: any chance to spin some sort of matrix run to see which one might be the culprit?

-- Reply to this email directly or view it on GitHub: https://github.com/phar-io/phive/issues/374#issuecomment-2313302941 You are receiving this because you were mentioned.

Message ID: @.***

jrfnl commented 2 weeks ago

As I'm currently quite unfit - still recovering from a nasty multiday fever: any chance to spin some sort of matrix run to see which one might be the culprit?

@theseer I'd be happy to, but it's a bit difficult to figure out what should be in such a matrix:

I suspect it may be something to do with the combination of options passed to curl via curl_setopt_array(), but that's not that easy to debug what with those all being integer constants... I can see nine options being added in the Curl class, but 17 in total being added via the curl_setopt_array() call in Curl::exec().

I'm going to have to find out what the other 8 options are which are being passed to be able to set up a usable test scenario.

Won't have time for this till Thursday at the earliest though.

In the mean time: wishing you a speedy recovery!