Status: Open. Opened by eazrael 4 years ago.
I have the same issue on PHP 7.4. We send all emails through HTML Purifier, and normally the process stops at about 2500 emails (by then it is using a few hundred MB of memory). With $config->set('Cache.DefinitionImpl', null); memory consumption stays low (14 MB), but it is not as fast.
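For reference, the cache setting mentioned above could be applied roughly as follows. This is only a sketch: the include path and the $emails array are assumptions for illustration, not taken from the application in question.

```php
<?php
// Assumption: HTML Purifier's bundled autoloader is on the include path.
// With Composer, require 'vendor/autoload.php' instead.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();
// Disable the definition cache, as described in the comment above.
$config->set('Cache.DefinitionImpl', null);

$purifier = new HTMLPurifier($config);

// Hypothetical input: one HTML body per email.
$emails = ['<p onclick="x()">Hello</p>', '<b>World</b>'];

foreach ($emails as $html) {
    $clean = $purifier->purify($html);
    // ... store or send $clean ...
}
```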
I'm in the process of switching to 7.4, and I was worried about your issue regarding memory consumption/leak. So I'm testing, and I cannot reproduce your problem.
I have 84 .eml files of HTML emails, 953 MB in total. The biggest file is 52 MB, the smallest 3 KB.
4 test cases in total:
1. PHP 7.1, HTMLPurifier 4.9.3
2. PHP 7.1, HTMLPurifier 4.13.0
3. PHP 7.4, HTMLPurifier 4.9.3
4. PHP 7.4, HTMLPurifier 4.13.0
As you can see below, all 4 have roughly the same memory usage. PHP 7.4 uses about 1% more memory, but is roughly 10% faster. (The deprecation notices in case 3 are expected: HTMLPurifier 4.9.3 is not compatible with PHP 7.4.)
$ /usr/bin/php7.1 htmlpurify1.php old
PHP Version: 7.1.33-34+ubuntu18.04.1+deb.sury.org+1
HTMLPurifier Version: 4.9.3
Memory Usage: 195.29 MB
Memory Real Usage: 213.36 MB
Seconds: 40.261646032333
$ /usr/bin/php7.1 htmlpurify1.php new
PHP Version: 7.1.33-34+ubuntu18.04.1+deb.sury.org+1
HTMLPurifier Version: 4.13.0
Memory Usage: 195.35 MB
Memory Real Usage: 213.36 MB
Seconds: 41.45220208168
$ /usr/bin/php7.4 htmlpurify1.php old
Deprecated: Array and string offset access syntax with curly braces is deprecated in /htmlpurifier-4.9.3/library/HTMLPurifier/Encoder.php on line 162
Deprecated: Array and string offset access syntax with curly braces is deprecated in /htmlpurifier-4.9.3/library/HTMLPurifier/ChildDef/Custom.php on line 48
Deprecated: Array and string offset access syntax with curly braces is deprecated in /htmlpurifier-4.9.3/library/HTMLPurifier/TagTransform/Font.php on line 78
Deprecated: Array and string offset access syntax with curly braces is deprecated in /htmlpurifier-4.9.3/library/HTMLPurifier/TagTransform/Font.php on line 78
Deprecated: __autoload() is deprecated, use spl_autoload_register() instead in /htmlpurifier-4.9.3/library/HTMLPurifier.autoload.php on line 17
PHP Version: 7.4.21
HTMLPurifier Version: 4.9.3
Memory Usage: 196.16 MB
Memory Real Usage: 215.45 MB
Seconds: 35.772937059402
$ /usr/bin/php7.4 htmlpurify1.php new
PHP Version: 7.4.21
HTMLPurifier Version: 4.13.0
Memory Usage: 196.22 MB
Memory Real Usage: 215.45 MB
Seconds: 36.35814499855
If I just purify the largest 52 MB file, I get these numbers:
$ /usr/bin/php7.4 htmlpurify1.php new
PHP Version: 7.4.21
HTMLPurifier Version: 4.13.0
Memory Usage: 184.86 MB
Memory Real Usage: 186.09 MB
Seconds: 0.97783088684082
Purifying 953 MB instead of 52 MB increases memory usage a bit (from about 185 MB to about 196 MB), but not by much.
If I disable the cache with $config->set('Cache.DefinitionImpl', null); it does not change anything: memory consumption and runtime are the same. If I enable the cache, it generates .ser files, but at least in my case that does not bring any performance improvement.
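For anyone who wants to run a comparable measurement, here is a minimal sketch of a harness that prints numbers in the same format as the output above. It is not the htmlpurify1.php script used for these results. The function names are made up for illustration, and reporting peak memory via memory_get_peak_usage() is an assumption about how such numbers might be collected.

```php
<?php
// Illustrative harness; benchmark_purify() and format_report() are hypothetical names.

/**
 * Run $purify over every input and collect peak memory and wall-clock time.
 *
 * @param callable $purify function taking one HTML string and returning the cleaned string
 * @param iterable $inputs HTML strings to process
 */
function benchmark_purify(callable $purify, iterable $inputs): array
{
    $start = microtime(true);
    $count = 0;
    foreach ($inputs as $html) {
        $purify($html); // result is discarded, only the cost is measured
        $count++;
    }
    return [
        'count'    => $count,
        // Memory used by PHP's allocator for the script, in MB.
        'usage_mb' => memory_get_peak_usage(false) / 1048576,
        // Memory actually reserved from the system, in MB.
        'real_mb'  => memory_get_peak_usage(true) / 1048576,
        'seconds'  => microtime(true) - $start,
    ];
}

/** Format a result array in the style of the output shown in this thread. */
function format_report(array $r): string
{
    return sprintf(
        "Memory Usage: %.2f MB\nMemory Real Usage: %.2f MB\nSeconds: %s\n",
        $r['usage_mb'],
        $r['real_mb'],
        $r['seconds']
    );
}

// Stand-in callable so the sketch runs without HTML Purifier installed.
// For a real run, pass [$purifier, 'purify'] with a configured HTMLPurifier instance.
$inputs = ['<p>one</p>', '<p>two</p>'];
echo format_report(benchmark_purify('strip_tags', $inputs));
```

Running the same harness once with the cache disabled and once with it enabled, on the same inputs, would make the two cases directly comparable.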
@eazrael @jahrralf Can you provide test files so I can reproduce your performance problems?
Sorry - I cannot provide test data.
I am currently investigating a memory leak with PHP 7.4.x. I upgraded from 7.2.26 to 7.4.10 and subsequently upgraded htmlpurifier from 4.10 to 4.13, as 4.10 is not PHP 7.4 compatible. Since then I have had a huge issue with memory leaks: in my application, a couple dozen calls can leak ~512 MB. I am still investigating the root cause, but I hope somebody has an idea what might be happening. I will try to strip down my application to the minimum code required to reproduce the issue.
Things I found out so far:
config:
More info will follow.