Closed kurtinge closed 8 years ago
Hello @kurtinge, thank you for the PR; this is a really important topic. Nice. Let's ignore the CI errors for now. I don't like the idea of storing the content type in a separate cache entry, but I know the current implementation with the timestamp prefix doesn't permit any other way.
$this->_getFpc()->save(time() . $body, $key, $this->_cacheTags);
Maybe some benchmarks will show us better ways to store more information in just one entry. Long ago I used serialize and unserialize, but they were replaced because the timestamp is a stable 10 characters at least until January 19, 2038.
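To make the limitation concrete, here is a minimal sketch of the current layout. The helper names packEntry and unpackEntry are hypothetical and only mirror the time() . $body line above; they are not functions from the module. Everything after the first 10 bytes is treated as body, so there is no slot for a content type. I am also assuming the 2038 date refers to the signed 32-bit timestamp limit.

```php
<?php
// Hypothetical helpers mirroring the "time() . $body" layout.
function packEntry($time, $body)
{
    return $time . $body;
}

function unpackEntry($entry)
{
    // The first 10 bytes are the timestamp, the rest is the body.
    return [(int) substr($entry, 0, 10), substr($entry, 10)];
}

$entry = packEntry(1454931158, '<html>cached page</html>');
list($time, $body) = unpackEntry($entry);

var_dump(strlen((string) $time));        // int(10)
echo gmdate('Y-m-d', 2147483647), "\n";  // 2038-01-19, the signed 32-bit limit
```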
I'll see if I can figure out a way of putting it into the same cache entry as the time and body.
I made some small benchmarks with a funny result.
<?php
// objectSerializeBenchmark.php

// Roughly the size of a rendered page: 80,000 characters.
$contentLength = 400 * 200;
$content = generateRandomString($contentLength);
$time = time();
$expectedResult = [
    $time,
    $content,
];

// Current approach: prefix the body with the 10-character timestamp.
$concat = function ($content, $time) {
    return $time . $content;
};
$substr = function ($object) {
    return [
        (int)substr($object, 0, 10),
        substr($object, 10),
    ];
};

// Candidate 1: serialize / unserialize.
$serialize = function ($content, $time) {
    $object = [
        $time,
        $content,
    ];
    return serialize($object);
};
$unserialize = function ($object) {
    return unserialize($object);
};

// Candidate 2: json_encode / json_decode.
$jsonEncode = function ($content, $time) {
    $object = [
        $time,
        $content,
    ];
    return json_encode($object);
};
$jsonDecode = function ($object) {
    return json_decode($object);
};

echo "Concat " . benchmarkFunction($concat, [$content, $time]) . "\n";
echo "Substr " . benchmarkFunction($substr, [$concat($content, $time)], $expectedResult) . "\n";
echo "Serialize " . benchmarkFunction($serialize, [$content, $time]) . "\n";
echo "Unserialize " . benchmarkFunction($unserialize, [$serialize($content, $time)], $expectedResult) . "\n";
echo "JsonEncode " . benchmarkFunction($jsonEncode, [$content, $time]) . "\n";
echo "JsonDencode " . benchmarkFunction($jsonDecode, [$jsonEncode($content, $time)], $expectedResult) . "\n";

/**
 * Runs $callback $times times and returns the total wall-clock time in seconds.
 * If $expectedResult is given, the last return value must match it exactly.
 */
function benchmarkFunction($callback, $args, $expectedResult = null, $times = 10000) {
    $result = null;
    $start = microtime(true);
    for ($i = 0; $i < $times; $i++) {
        $result = call_user_func_array($callback, $args);
    }
    $totalTime = microtime(true) - $start;
    if (!is_null($expectedResult) && $result !== $expectedResult) {
        throw new Exception("Wrong result yo!.");
    }
    return $totalTime;
}

/**
 * shameless copy of http://stackoverflow.com/questions/4356289/php-random-string-generator
 */
function generateRandomString($length = 10) {
    $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $charactersLength = strlen($characters);
    $randomString = '';
    for ($i = 0; $i < $length; $i++) {
        $randomString .= $characters[rand(0, $charactersLength - 1)];
    }
    return $randomString;
}
The result of docker run -v `pwd`/objectSerializeBenchmark.php:/objectSerializeBenchmark.php php:5-cli php /objectSerializeBenchmark.php:
Concat 0.084057092666626
Substr 0.073863983154297
Serialize 0.061403036117554
Unserialize 0.060228109359741
JsonEncode 7.5650370121002
JsonDencode 7.9326219558716
The result of docker run -v `pwd`/objectSerializeBenchmark.php:/objectSerializeBenchmark.php php:7-cli php /objectSerializeBenchmark.php:
Concat 0.07352089881897
Substr 0.057359933853149
Serialize 0.049885034561157
Unserialize 0.043100118637085
JsonEncode 4.9944760799408
JsonDencode 3.683972120285
I guess serialize & unserialize are probably the best solution for this kind of problem. json_encode & json_decode don't seem to like big strings much. @kurtinge what do you think?
In case somebody else is also a little bit surprised, here is the reason why serialize & unserialize are so fast with long strings.
<?php
$array = [
    time(),
    'ONE BIG STRING',
];
echo serialize($array);
a:2:{i:0;i:1454931158;i:1;s:14:"ONE BIG STRING";}
serialize stores how long each string is, so unserialize doesn't have to parse that long string.
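A small sketch to illustrate the length prefix on a body of the same size as in the benchmark. It only uses serialize on a plain string; the fixed 11 bytes of overhead are the s:80000:" prefix plus the closing "; suffix.

```php
<?php
// A body of 80,000 bytes, the same size as in the benchmark.
$body = str_repeat('x', 400 * 200);
$serialized = serialize($body);

// The byte length is written in front of the string data.
echo substr($serialized, 0, 9), "\n";                   // s:80000:"

// Overhead is constant: the 9-byte prefix plus the closing '";'.
var_dump(strlen($serialized) - strlen($body));          // int(11)

// The round trip returns the identical string.
var_dump(unserialize($serialized) === $body);           // bool(true)
```

Because the length is known up front, the reader can take exactly that many bytes without looking for escape sequences or a closing quote inside the data.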
Based on the tests you provided, I also think that serialize is the best solution for this. I will try to implement a version with serialize during the week.
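For discussion, a rough sketch of what a single serialized entry could look like. The function names packCacheEntry and unpackCacheEntry and the field order are assumptions made for illustration, not the actual implementation in the module.

```php
<?php
// Hypothetical helpers; names and field order are assumptions.
function packCacheEntry($time, $contentType, $body)
{
    return serialize([$time, $contentType, $body]);
}

function unpackCacheEntry($entry)
{
    // unserialize() returns false (and raises a notice) on malformed input.
    $data = @unserialize($entry);
    if (!is_array($data) || count($data) !== 3) {
        return null; // not in the expected format, e.g. an old-style entry
    }
    return $data;
}

$entry = packCacheEntry(1454931158, 'application/json', '{"ok":true}');
list($time, $contentType, $body) = unpackCacheEntry($entry);

echo $contentType, "\n"; // application/json
echo $body, "\n";        // {"ok":true}
```

Returning null for anything that is not a three-element array would let entries written in the old time() . $body layout be treated as cache misses, assuming that is the desired behavior during an upgrade.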
What about using a split character? It could be faster to have a comma-separated header at the beginning and then do a strpos() for some UTF-8 character.
@riconeitzel a split character has the problem that it only works if you escape that character in the content. And the funny thing is, serialize is much faster than splitting, and also faster than the current version with substr. My only mistake was not benchmarking this in PR https://github.com/GordonLesti/Lesti_Fpc/pull/63
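A quick sketch of the escaping problem. The delimiter '|' and the field layout are hypothetical, chosen only to show what happens when the body happens to contain the split character.

```php
<?php
$delimiter = '|'; // hypothetical split character
$contentType = 'application/json';
$body = '{"name":"left|right"}'; // the body contains the delimiter

// Naive delimiter-based entry: timestamp, content type, body.
$entry = implode($delimiter, [1454931158, $contentType, $body]);
$parts = explode($delimiter, $entry);
var_dump(count($parts));      // int(4): the body was cut in two
var_dump($parts[2] === $body); // bool(false)

// serialize() needs no escaping because every string carries its length.
$restored = unserialize(serialize([1454931158, $contentType, $body]));
var_dump($restored[2] === $body); // bool(true)
```

Passing a limit of 3 to explode() would keep the body intact as long as it is the last field and no header field can ever contain the delimiter; serialize needs no such assumption.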
I will merge this now into the develop branch and make the changes with serialize and unserialize after that.
The content-type header should be cached so that the correct content type is served for cached elements. For example, Ajax requests that return application/json should always be served with the content type application/json.
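As a minimal sketch of the idea, the content type could be picked out of the response header lines when the entry is written and sent again on a cache hit. The function findContentType is hypothetical; the input format assumed here is the list of "Name: value" strings that PHP's headers_list() returns.

```php
<?php
// Hypothetical helper: find the Content-Type value in a list of header lines.
function findContentType(array $headerLines)
{
    foreach ($headerLines as $line) {
        // Header names are case-insensitive.
        if (stripos($line, 'Content-Type:') === 0) {
            return trim(substr($line, strlen('Content-Type:')));
        }
    }
    return null; // no content type was set
}

// When saving: in a real request this list could come from headers_list().
$contentType = findContentType([
    'X-Powered-By: PHP',
    'content-type: application/json',
]);
echo $contentType, "\n"; // application/json

// On a cache hit, the stored value would be sent again before the body, e.g.:
// header('Content-Type: ' . $contentType);
```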