The existing logic on refreshRules() in the Manager service causes intermittent issues on applications using Redis in a replication cluster to cache the results.
Once retrieving the rules using getRules(), if the data is not available in cache the following code is executed:
if (null === $data && !$this->refreshRules($url, $ttl)) {
throw new CouldNotLoadRules(sprintf('Unable to load the public suffix list rules for %s', $url));
}
The rules are refreshed as intended (via the CurlHttpClient), however right after this is processed the library attempts to json_decode() the data from cache again:
For applications that use Redis in a replication cluster, the key might not yet be available to be accessed from cache (since the process is executed extremely fast), and causes the issue below intermittently:
┐
├ TypeError: json_decode() expects parameter 1 to be string, null given
│
╵ /var/www/app/vendor/jeremykendall/php-domain-parser/src/Manager.php:109
Do we think that instead of returning a bool as a result of refreshRules(), would be best to return the data that was parsed and cached instead? This in my opinion could make this logic more efficient since we won't need to access the cache twice.
For example, if we could do something similar to the following instead:
if (null === $data && null === ($data = $this->refreshRules($url, $ttl))) {
throw new CouldNotLoadRules(sprintf('Unable to load the public suffix list rules for %s', $url));
}
All we would have to do is:
$data = json_decode($data, true);
Thoughts?
P.S.: As a workaround, I added a retry-policy on my cache pool that I have configured for this library. This basically gives me the ability to retry accessing the key again after a few milliseconds.
Issue summary
The existing logic on
refreshRules()
in the Manager service causes intermittent issues on applications using Redis in a replication cluster to cache the results.Once retrieving the rules using
getRules()
, if the data is not available in cache the following code is executed:The rules are refreshed as intended (via the
CurlHttpClient
), however right after this is processed the library attempts tojson_decode()
the data from cache again:For applications that use Redis in a replication cluster, the key might not yet be available to be accessed from cache (since the process is executed extremely fast), and causes the issue below intermittently:
Do we think that instead of returning a
bool
as a result ofrefreshRules()
, would be best to return the data that was parsed and cached instead? This in my opinion could make this logic more efficient since we won't need to access the cache twice.For example, if we could do something similar to the following instead:
All we would have to do is:
Thoughts?
P.S.: As a workaround, I added a retry-policy on my cache pool that I have configured for this library. This basically gives me the ability to retry accessing the key again after a few milliseconds.
System informations