tburry / pquery

A jQuery-like HTML DOM parser written in PHP.
GNU Lesser General Public License v2.1

Script dying in parseStr #10

Open quinxy opened 8 years ago

quinxy commented 8 years ago

For several hours I thought I might be going crazy, and then I realized pQuery is dying/exiting under some unknown conditions while executing parseStr(). About 1 in 5 times (a guess) that I call it on a web page (fetched via file_get_contents()), it kills my script.

The entirety of the pQuery-related code I'm calling is:

    $html = file_get_contents($url); // It always reaches here
    $dom = pQuery::parseStr($html); // It sometimes does NOT reach here
    $html2 = $dom->query('div[class="col-xs-12 col-sm-6 col-md-5 col-lg-4"]')->html();
    $dom2 = pQuery::parseStr($html2); // I'm doing this second parseStr because I can't find an equivalent to jQuery's $('#foo').find('#bar')
    $imageUrl = $dom2->query('img')->attr('src');

I do not have a sample of the HTML which causes the dying... I'll try to collect that later and add it. The actual contents may not prove all that relevant, though. I'm working through a list of URLs, and when I retry after the script dies, the same URL works fine (it isn't stuck on that URL). With the pQuery code disabled the script also works fine, so it's not a case where the $html it is given is ever "" or null.
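One way to collect such a sample is to write each page to disk just before parsing it, so that the most recently written file is the input that was being parsed when the script died. This is only a rough sketch: `saveSample()` and the directory name are made up for illustration, not part of pQuery.

```php
<?php
// Hypothetical helper (not part of pQuery): store the fetched HTML on disk
// before it is parsed, and return the path it was written to.
function saveSample(string $html, string $url, string $dir): string
{
    // Create the sample directory on first use.
    if (!is_dir($dir)) {
        mkdir($dir, 0777, true);
    }
    // One file per URL, named by the MD5 hash of the URL.
    $path = $dir . '/' . md5($url) . '.html';
    file_put_contents($path, $html);
    return $path;
}

// Intended use inside the URL loop (sketch only):
//   $html = file_get_contents($url);
//   if ($html === false) { continue; }   // file_get_contents() returns false on failure
//   $sample = saveSample($html, $url, __DIR__ . '/samples');
//   $dom = pQuery::parseStr($html);
//   unlink($sample);                     // parse survived, so the sample is not needed
```

If the script dies inside parseStr(), the file left behind in the directory is the HTML that was being parsed at the time.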

tburry commented 8 years ago

Do you have any other information about this? When things randomly die like this, my guess would be the memory limit: you are looping through a list and eventually memory usage goes too high. Running with error reporting turned on and/or an Xdebug-enabled build of PHP might shed more light.
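As a rough sketch of what that could look like at the top of the script: the helper name `memoryReport()` is made up for illustration, and everything else is standard PHP. The assumption here is that the death is a fatal error, such as an exhausted memory limit, that is simply not being displayed.

```php
<?php
// Report every error instead of dying silently.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Hypothetical helper: format current and peak memory usage for a log line.
function memoryReport(string $label): string
{
    return sprintf(
        '%s: current=%.1f MB, peak=%.1f MB, limit=%s',
        $label,
        memory_get_usage(true) / 1048576,
        memory_get_peak_usage(true) / 1048576,
        ini_get('memory_limit')
    );
}

// Fatal errors such as "Allowed memory size exhausted" are not passed to
// set_error_handler() callbacks, but shutdown functions generally still run
// and can inspect the last error.
register_shutdown_function(function (): void {
    $fatalTypes = E_ERROR | E_PARSE | E_CORE_ERROR | E_COMPILE_ERROR;
    $error = error_get_last();
    if ($error !== null && ($error['type'] & $fatalTypes)) {
        error_log(sprintf('Fatal: %s in %s:%d', $error['message'], $error['file'], $error['line']));
        error_log(memoryReport('at shutdown'));
    }
});

// Inside the URL loop, log before each parse, e.g.:
//   error_log(memoryReport($url));
```

If the peak figure climbs steadily across iterations and the last log line sits close to the limit, that would point to memory rather than to any particular page.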

If you can get me any more information I'll do my best to hunt this down.