Place some GBK characters before the <html tag.
newDocumentHTML, or any of the other document-creating functions, then returns a pq object with the wrong
encoding, so later calls such as pq('title')->text() return incorrect results.
As a workaround, I have to pass only the substring of the page that starts
at the index of <html.
Without GBK characters before <html, everything works correctly.
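
The substring workaround described above might look roughly like the following. This is only a sketch: `stripBeforeHtmlTag` is a hypothetical helper name, not part of phpQuery, and `$page` is assumed to hold the raw page as a string. Note that it also drops a doctype, or anything else that precedes `<html`.

```php
<?php
// Hypothetical helper (not part of phpQuery): drop any bytes that precede
// the first "<html" so the parser sees the document from its root tag.
function stripBeforeHtmlTag(string $page): string
{
    // stripos works on bytes and is case-insensitive, so "<HTML" matches too.
    $start = stripos($page, '<html');

    // Leave the input untouched when no <html tag is found.
    return $start === false ? $page : substr($page, $start);
}

// Usage, assuming phpQuery is already loaded and $page holds the raw page:
// $doc = phpQuery::newDocumentHTML(stripBeforeHtmlTag($page));
// echo pq('title')->text();
```

The search is done on raw bytes rather than decoded characters, so it does not depend on the encoding having been detected correctly first.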
Original issue reported on code.google.com by huangs...@yoka.com on 7 Jul 2010 at 10:01