browscap / browscap-php

Officially supported Browscap for PHP
http://browscap.org/
MIT License
423 stars 84 forks source link

memory exhausted errors #20

Closed whiteatom closed 9 years ago

whiteatom commented 10 years ago

I have been a long time users of Browscap v1.0, however, I decided to upgrade to the newest version. Now I am getting constant out of memory errors in the Browscap script.

Fatal error: Allowed memory size of 201326592 bytes exhausted (tried to allocate 32 bytes) in /www/tracker/dev/incs/Browscap.php on line 683

I have had to increase the memory limit 3 times (up to 256MB) to get it to work.

Why is this new script so inefficient on memory? is there a fixed planned as it is not sustainable to run a production server with a single php script requiring that much memory.

mimmi20 commented 10 years ago

Which version of the browscap file do you use (5026 full)?

whiteatom commented 10 years ago

the latest version of the Browscap.php file uses the full one.. I was hoping to switch to the light, but there is a comment above that says I can't.

(note: light version doesn't work, a fix is on its way)

I would assume that the php script is pulling the latest file 5027 I believe.

mikesitguys commented 10 years ago

I too have this same issue, i have been using Browscap (the latest version and the full version), for 2 months, and this problem just occurred yesterday! Even increasing the memory limit to 256MB, doesn't work, or switching to the light version of the ini file.

asgrim commented 10 years ago

Hi there - we are aware of the increased file size causing memory issues in some cases. The problem is caused by the sizes of the INI file dramatically increasing since the project has become more active, and is simply as a result of many more user agents being defined in the file, thus the file has grown quite large.

There are a couple of potential solutions I see to this:

It is also worth noting that our official recommendation is:

For example:

// When parsing the UA on the site
$browscap = new phpbrowscap\Browscap($cacheDir);
$browscap->doAutoUpdate = false;
$information = $browscap->getBrowser($userAgent);

and then in a background processed script (e.g. on a cron job) that gets run once a day/week:

$browscap = new phpbrowscap\Browscap($cacheDir);
$browscap->updateCache();

This way, it is only the background script that would consume large amounts of memory, not your actual website.

That said, we are aware of the issue and it is something we definitely want to fix. Pull requests welcome, as always ;)

whiteatom commented 10 years ago

Thanks for the reply.... but..... What your saying is that there is a bug that pretty much makes the entire project unusable and you're just going to close the ticket? I might suggest you leave this open so that every other person who tries to use your software and can't might have some idea why.

This issue you have referenced is obviously related.. but to a newbie trying out your project for the first time.. it might appear that it just doesn't work - and an unsupported operand bug doesn't really help.

asgrim commented 10 years ago

I did not say I was going to close this ticket, I acknowledged that this is a problem and we are considering solutions. Just because the "unsupported operand" issue is closed it does not mean it won't appear in search results, and if someone has the same issue, they'll end up here after reading that issue.

The project is not unusable you just need to increase the memory limit. Additionally, I have provided you a workaround which means your production servers are minimally affected until we are able to fix this issue in a decent way.

Please don't make assumptions based on no evidence - this issue is staying firmly open until we fix it, rest assured :)

asgrim commented 10 years ago

@whiteatom one of the proposals we are making is to re-arrange what is included in each file - any feedback you have would be appreciated over at browscap/browscap#248 :)

cziegenberg commented 10 years ago

I'm currently working on it. After a few first changes I could reduce the memory usage by 8% percent (156M -> 143M).

Question: Which is the minimum PHP version to support?

asgrim commented 10 years ago

@ziege what method are you using?

asgrim commented 10 years ago

@ziege and in answer to your question, currently >= 5.3, we haven't discussed increasing this to 5.5 yet.

cziegenberg commented 10 years ago

First change: unset all variables as early as possible in updateCache(). For the next changes I first have to look a bit deeper into the code...

cziegenberg commented 10 years ago

After the second change: 25% reduction (156M -> 117M). More coming soon.

cziegenberg commented 10 years ago

Good news, I could reduce the memory peak usage by nearly 60%.

I'm currently testing on another system (with PHP 5.6.0beta1) and could optimize the usage there from 101.92M to 41.45M. It will be hard to optimize it more, because PHP requires 39.12M for parsing the ini file into the $browsers array, which can only be changed by changing the ini file. Tomorrow I will test it once again with the previous system (PHP 5.5.x).

What did I do?

I also fixed a caching bug, see pull request.

cziegenberg commented 10 years ago

Test results with PHP 5.5.11: Old version: Memory peak 167.87M New version: 75.43M Reduction: 54,82%

In PHP 5.5.11 the parsing of the INI file requires much more memory than in PHP 5.6 (61M compared to 39M), which explains the huge difference.

I will prepare a pull request now.

asgrim commented 10 years ago

That's fantastic, and welcome news :)

xavierhb commented 10 years ago

@ziege nice work! Share it soon! :+1:

cziegenberg commented 10 years ago

@xavierhb See pull request #26

whiteatom commented 10 years ago

Is there a beta build I can try out? I am now getting memory exhausted errors with PHP set to 256MB.

whiteatom commented 10 years ago

@asgrim In response to your message above, I am not trying to be insulting.. but anyone on shared hosting won't be able to increase their memory allocation - and cron jobs are probably above many people using this project. It is a fantastic effort, I was just pretty put off to have a major but just closed on me as soon as I reported it.

DaAwesomeP commented 10 years ago

I think another fix would be to implement some sort of segmented downloading. It would download the first 4mb, save it, then the next 4mb etc. When it finishes it would combine them all into a single file.

The idea of a cron job would be perfect, but not for most people. A simple require or include runs synchronously and waits for a result before moving on, but maybe a user accessing a webpage could trigger an update, but PHP would not wait for the update to finish. The idea is that the updater runs in the background (seperated from the server), but is triggered by a user. This way newbies who just want to get up and running won'y have to mess with cron. The same currently implemented timer system will be used, but will trigger a different event.

There is an answer on StackOverflow to solve this problem (Is there a way to use shell_exec without waiting for the command to complete?):

How about adding.

"> /dev/null 2>/dev/null &"
shell_exec('php measurePerformance.php 47 844 email@yahoo.com > /dev/null 2>/dev/null &');

Note this also gets rid of the stdio and stderr.

asgrim commented 10 years ago

@whiteatom you could try checking out PR #26 and see if that works for you, @ziege has done s load of work to improve it, but I have not had a chance to look over it yet - any feedback is gratefully received :)

@DaAwesomeP I believe PR #26 includes this sort of staged parsing to reduce memory usage, I just need to check it out. Memory improvements are on the way! :)

asgrim commented 10 years ago

@whiteatom also, I'm really not sure where you get the idea that I closed this ticket, if you look through I have never had the intention of closing this issue at all; I acknowledged it was a problem that we need to deal with and I provided a possible workaround for the interim while we fix the issue.

cziegenberg commented 10 years ago

@DaAwesomeP The INI file is only about 6-7MB and that wasn't a problem in my tests. Parsing it with parse_ini_file was a problem (caused a memory peak of about 40MB). I found a way to reduce it by about 25% (by splitting the file and using parse_ini_string instead), but this was unnecessarily complicated and didn't reduce the memory peak caused by the following steps.

cziegenberg commented 10 years ago

After my changes the remaining problem is the large array that results from parsing the INI file (and the additional arrays created from it to optimize the array search operations).

One way to optimize it for the future would be to process the INI file in multiple steps. I tried it by splitting it into the sections marked by included comments (+ the default settings, which should be available for all sections), but noticed that some "Parent" references link to other sections, so that this didn't work. I think it will be difficult to realize this.

Another way would be to download a prepared "cache file" directly instead of the INI file. I'm currently testing it, because some changes are required - but if it works and such a file could be provided by the browscap server, we could reduce the memory peak for the update to about 4MB...

cziegenberg commented 10 years ago

This is my new idea to optimize memory usage:

This solution should work for much more browscap data and only requires some prepared browscap JSON files, structured very similar like the current cache files (also with the double encoded values, to optize performance and memory usage - this can be confusing if used with other clients).

Possible improvements:

My questions:

mimmi20 commented 10 years ago

@ziege Could you test this with the output of PR browscap/browscap#244?

cziegenberg commented 10 years ago

@mimmi20 I don't think this will work. The JSON files I created are very similar to the current cache file, using a special structure, internal IDs, double JSON encoding,... The linked JSON format for browscap would replace the INI file and result in a similar high memory usage, because all data have to be loaded into one big array (or object).

Here a short version of the JSON file with the browser data, where you can see the double encoding which simplifies the loaded structure (results in an array with JSON strings which are only parsed in a second step when necessary):

{"source_version":"5028","cache_version":"2.0b","browsers":["{\"3\":42286,\"5\":\"TKC AutoDownloader\"}","{\"3\":41831,\"9\":\"iOS\",\"10\":\"4.1\"}","{\"3\":41831,\"9\":\"iOS\",\"10\":\"6.0\"}","{\"3\":42784,\"9\":\"iOS\",\"10\":\"6.0\",\"25\":\"true\"}","{\"3\":41831,\"9\":\"JAVA\"}","{\"3\":41926,\"10\":\"2.3\"}","{\"3\":41900,\"10\":\"2.3\"}","{\"3\":41929,\"10\":\"2.3\"}","{\"3\":42314,\"5\":\"CrystalSemanticsBot\",\"9\":\"WinXP\",\"10\":\"5.1\",\"14\":\"true\"}","{\"3\":42831,\"10\":\"6.1\"}","{\"3\":44033,\"5\":\"BrowserNG\",\"6\":\"7.2\",\"7\":\"7\",\"8\":\"2\",\"9\":\"SymbianOS\"}","{\"3\":42831,\"10\":\"6.1\",\"26\":\"true\"}","{\"3\":43004,\"9\":\"WinCE\"}"]}

cziegenberg commented 10 years ago

Yesterday I gave it a new try. I completely rewrote the class, found a new way to parse the INI file, could optimize the search for the user agent and reduce the memory by about 60% again. I'm currently work on the caching mechanism, but I'm very optimistic, because I found a super fast, low-memory way for it... Give me some time to finish a first version.

cziegenberg commented 10 years ago

Some first test results for you:

Generation of new cache data (including a search): Memory peak: 23.31M Time: 3.14 seconds

Search (with cached data): Memory peak: 1.04M (yes, really less than 1% of the current version!) Time: 0.19 seconds (depends on the user agent, I used the same as for the previous tests)

mimmi20 commented 10 years ago

@ziege Could you test this with the output of PR browscap/browscap#297? If a change of the output file is required to get the memory usage down, please tell me.

cziegenberg commented 10 years ago

@mimmi20 I don't use JSON anymore, because I found a better way. With PHP 5.5 and the use of generators I could reduce the memory usage to about 0,23M (with cached data). The rewrite is nearly finished, I'm testing and optimizing at the moment and will publish the code soon.

asgrim commented 10 years ago

Hmm, that sounds like perfect use for generators actually, but does require php 5.5... Should we consider dropping older versions of PHP? On 30 May 2014 18:24, "ziege" notifications@github.com wrote:

@mimmi20 https://github.com/mimmi20 I don't use JSON anymore, because I found a better way. With PHP 5.5 and the use of generators I could reduce the memory usage to about 0,23M (with cached data). The rewrite is nearly finished, I'm testing and optimizing at the moment and will publish the code soon.

Reply to this email directly or view it on GitHub https://github.com/browscap/browscap-php/issues/20#issuecomment-44676651 .

DaAwesomeP commented 10 years ago

Personally, I always use PHP 5.5 in my dev servers, but as of now most hosting providers use 5.3. Some features can be manually ported over like password_compat, which adds 5.5's bcrypt password features to earlier versions.

I found this though:

Generators do not add any capability to the language at all. Everything that you can do with a generator, you can already do with an iterator. http://blog.ircmaxell.com/2012/07/what-generators-can-do-for-you.html

I'm not sure how this plays up to memory comparison though.

cziegenberg commented 10 years ago

@asgrim As mentioned before, I build the class(es) from scratch, so the code has nothing to do anymore with the current class, and it won't be backwards compatible. I will publish it as a separate project, so that parts of it can be merge into the current version or it can be used as a future version?!

It contains several optimizations and I separated the main component from the cache, the source parser and the updater, so that each component can be extended/replaced.

The PHP 5.5 features I used could be replaced so that more memory is used but it's compatible with PHP 5.3 again.

DaAwesomeP commented 10 years ago

There could simply be an if statement that uses the new method (generators) on 5.5 and above and the older method for older versions. It could also put out a warning when using the older version.

cziegenberg commented 10 years ago

@DaAwesomeP I think you have to put it into a separate file, because the parser in PHP 5.3 won't understand "yield".

DaAwesomeP commented 10 years ago

@ziege That would be fine. We could have the if/else load a require or include.

cziegenberg commented 10 years ago

I optimized the caching once again, here are the latest test results (on Windows 7/PHP 5.6beta3):

Uncached (no including downloading a new browscap.ini): Memory: 21.69M Time: 3.042s

Cached (for empty or unknown user agent): Memory: 0.23M Time: 0.000s

Cached (for known user agent): Memory: 0.23M Time: 0.015s

PHP get_browser() function (same for all user agents): Memory: 0.16M Time: 0.720s

cziegenberg commented 10 years ago

Version 0.1 now available (with PHP 5.3 support): https://github.com/crossjoin/Browscap

Or via composer: https://packagist.org/packages/crossjoin/browscap

ammont commented 10 years ago

Ok what is going to happen to this project?!

Rikaelus commented 10 years ago

Anyone know what's going to happen with this? I'm encountering the memory problems again and was hoping to have seen this discussion evolve into some actual improvements to browscap-php.

I've tried ziege's variant but it's trying to instanciate classes from files that don't seem to be included previously, throwing a fatal error: Fatal error: Class 'Crossjoin\Browscap\Parser\IniLt55' not found in .../packages/Crossjoin/Browscap/Browscap.php on line 167

Before going in to fix that problem I just wanted to see what was happening here...

cziegenberg commented 10 years ago

I just updated the README, regarding the autoloading configuration (if anybody else had the same problem).

ammont commented 10 years ago

I don't want to use another package, what is going to happen to browscap-php?!!

asgrim commented 10 years ago

We are still looking into this. As I've said before, this is an open source project that people contribute to in their free time. If you have the inclination, perhaps you could look into this yourself @ammont ;)

ammont commented 10 years ago

@asgrim thanks, I will as soon as I get a little free time.

getriebesand commented 10 years ago

I've switched now to the class from ziege and I like it (@ziege thank you for that class)

HOCD commented 10 years ago

Hi.

I solved my probleme. I made a PHP class that creates new Browscap.ini with only information about browser (ie, firefox, chrome...) and I give the new file to object Browscap.

You must make this: $parser = new ParserBrowsCap(new Browscap(constant("DOCUMENT_ROOT") .'Core/PHP/tmp')); //create object $browscapObject = $parser->getBrowscapObject(); $info_current_browser = get_object_vars($browscapObject->getBrowser()); $parser->deleteFile(); //delete file

I tried with 3 browsers and if you have the time, could you try with your application and tell me the result?

Thank

//////////////////////////////////////////////////////////////// <?php

/*

}

frederikbosch commented 9 years ago

Also had memory problems while updating cache (even with cronjob at the server). Decide to start using https://github.com/yzalis/UAParser.

ve3 commented 9 years ago

I have same memory problem.

frederikbosch commented 9 years ago

@ve3 start using yzalis/ua-parser