file_get_dom, runs forever

GoogleCodeExporter commented 9 years ago

What will reproduce the problem?
$html = file_get_dom('http://www.nhl.com/ice/schedulebyseason.htm');

What is the expected output? What do you see instead?
The call took more than 30 seconds and repeatedly triggered this fatal error:
"Fatal error: Maximum execution time of 30 seconds exceeded in 
C:\xampp\htdocs\hockey\ganon.php on line 238"
I then set `set_time_limit(0);`. The script has now been running for about 15 
minutes without finishing.
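
One way to narrow this down is to time the download and the parse separately, so it is clear which step hangs. The following is only a minimal sketch: `timed()` is a hypothetical helper, not part of Ganon, and the commented usage assumes Ganon's `str_get_dom()` accepts the markup as a string.

```php
<?php
// Hypothetical helper (not part of Ganon): run a callable and
// return its result together with the elapsed wall-clock seconds.
function timed($fn)
{
    $start = microtime(true);
    $result = call_user_func($fn);
    return array($result, microtime(true) - $start);
}

// Usage sketch (commented out: needs network access and ganon.php):
// require 'ganon.php';
// list($markup, $fetchSecs) = timed(function () {
//     return file_get_contents('http://www.nhl.com/ice/schedulebyseason.htm');
// });
// list($dom, $parseSecs) = timed(function () use ($markup) {
//     return str_get_dom($markup); // assumed Ganon entry point for string input
// });
// printf("fetch: %.2fs, parse: %.2fs\n", $fetchSecs, $parseSecs);
```

If the fetch returns quickly and only the parse step never completes, the problem is in the parser and not in the network request.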

Which version are you using?
Ganon single file PHP5 (rev. #78)
PHP 5.4.7

Please provide any additional information below.
It worked with the URL used in the provided examples, 'code.google.com'.

Original issue reported on code.google.com by jonathon...@gmail.com on 8 Mar 2013 at 1:24

GoogleCodeExporter commented 9 years ago

This library is implemented using only PHP. Performance for big pages can 
definitely be improved. Feel free to contribute patches!

Thanks for your report, and sorry for the late response.

Original comment by niels....@gmail.com on 7 Apr 2013 at 3:15

Changed state: WontFix
Added labels: Type-Enhancement, Priority-Low
Removed labels: Type-Defect, Priority-Medium

GoogleCodeExporter commented 9 years ago

Well, it doesn't really matter to me; I was not and am not here to complain. I 
have moved on to a web scraper that works for me.

Other web scrapers, built only in PHP, scrape the same page in under a second, 
so I do not think performance, per se, is the problem (and the page is not 
that big). I believe it is far more likely that the parser is stuck in an 
infinite loop.
Note: PHP Simple HTML DOM Parser has a problem with the page as well, but 
instead of running indefinitely it seems to return null.
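
For illustration, this is the kind of bug that would produce a hang instead of a slow parse: a hand-written scanner that fails to advance its read position on unexpected input. The sketch below is a toy and NOT Ganon's actual code; `scan_tags()` and its progress guard are hypothetical and only show how a no-progress check turns a silent hang into an immediate error.

```php
<?php
// Toy scanner, NOT Ganon's code: collects the raw text between '<' and '>'.
// Hypothetical guard: throw if an iteration fails to advance $pos.
function scan_tags($html)
{
    $tags = array();
    $pos = 0;
    $len = strlen($html);
    while ($pos < $len) {
        $before = $pos;
        $lt = strpos($html, '<', $pos);
        if ($lt === false) {
            break; // no more tags in the remaining input
        }
        $gt = strpos($html, '>', $lt);
        if ($gt === false) {
            // Unterminated tag. A buggy scanner might leave $pos unchanged
            // here and spin forever; this sketch skips past the '<' instead.
            $pos = $lt + 1;
        } else {
            // Strip surrounding slashes and spaces, e.g. "/p" becomes "p".
            $tags[] = trim(substr($html, $lt + 1, $gt - $lt - 1), '/ ');
            $pos = $gt + 1;
        }
        if ($pos <= $before) {
            // Safety net: every iteration must consume at least one byte.
            throw new RuntimeException("scanner made no progress at offset $pos");
        }
    }
    return $tags;
}
```

With a guard like this, a branch that stops consuming input surfaces as an exception with an offset, which is much easier to report than a timeout.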

Original comment by jonathon...@gmail.com on 7 Apr 2013 at 3:28
