What steps will reproduce the problem?
1. Retrieve an external HTML page encoded in ISO-8859-1 that contains accented characters:
$url = 'http://www.meretmarine.com/article.cfm?id=119036';
require_once('phpQuery.php');
function success1($browser) {
    $code = pq($browser);
    $title = pq('head title', $code);
    $html = pq('html', $code);
    return $browser;
}
$zhc = phpQuery::browserGet($url, 'success1');
2. Display $title->text(), $title->html(), $html->text() and $html->html().
What is the expected output? What do you see instead?
With ->text() the accents are preserved, but with ->html() they disappear.
That is not a problem when only the text is needed, but it is when working on
part of the HTML code. For example, I want to get the HTML of a specific div,
split its content at every <hr> and then process each part; losing the accents
breaks this.
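
To make the use case concrete, here is a minimal sketch of the splitting step, assuming the fragment is already available as a string. The helper name `split_on_hr` and the sample markup are hypothetical and not part of phpQuery. In the reported scenario the fragment would come from ->html() on the selected div, which is the point where the accents are lost.

```php
<?php
// Hypothetical helper (not part of phpQuery): split an HTML fragment
// at every <hr> tag and return the non-empty pieces.
function split_on_hr(string $fragment): array
{
    // Matches <hr>, <hr/>, <hr /> and <hr class="...">, case-insensitively.
    $parts = preg_split('/<hr\b[^>]*>/i', $fragment);

    // Trim surrounding whitespace and drop pieces that end up empty.
    return array_values(array_filter(array_map('trim', $parts), 'strlen'));
}

// Assumed sample standing in for the string returned by ->html() on a div.
$fragment = "<p>Premier bâtiment livré</p><hr><p>Deuxième partie</p><hr /><p>Fin</p>";

$parts = split_on_hr($fragment);

foreach ($parts as $i => $part) {
    echo $i, ': ', $part, PHP_EOL;
}
```

Each element of `$parts` can then be processed on its own. The accented characters survive this step only if they are still present in the string passed in.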
What version of the product are you using? On what operating system?
I'm using 0.9.5 on Ubuntu 11.10.
Original issue reported on code.google.com by sica...@gmail.com on 16 Mar 2012 at 4:44