magicdice / phpquery

Automatically exported from code.google.com/p/phpquery

XML and HTML conversions #85

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
 * For HTML documents, xml() returns the XML version.
 * For XML documents, html() returns the HTML version.
 * For both HTML and XML documents, markup() returns the document's native version.
 * The same applies to xmlOuter(), htmlOuter() and markupOuter().

Original issue reported on code.google.com by tobiasz....@gmail.com on 7 Dec 2008 at 12:58

GoogleCodeExporter commented 9 years ago
I'd be happy to do some in-depth testing and bug reporting on this issue, as it's a
prime concern for my use of phpQuery (which I intend to continue for some time to
come :) ). Please let me know if there are specific things I can do to help
test/develop this issue.

Original comment by bigbluehat on 13 Nov 2009 at 3:33

GoogleCodeExporter commented 9 years ago
Continuing the discussion from issue 133, consider the following code:

$pq1 = phpQuery::newDocumentHTML("<p><br><a href='/?foo=1&bar=2'>href</a></p>");
$pq2 = phpQuery::newDocumentXML("<root/>");
$pq2['root']->append($pq1);
print $pq2;

This shows that node-level conversion is perfectly possible using the injection
methods that are already implemented.
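
The same idea can be sketched with PHP's built-in DOMDocument, which may help when testing outside phpQuery. This is only an illustration: the sample markup and variable names are made up, the ampersand in the href is written as &amp;amp; to keep the HTML parser quiet, and nothing here is meant to describe how phpQuery's append() works internally.

```php
<?php
// Hypothetical sample markup; DOMDocument is used directly instead of phpQuery.
libxml_use_internal_errors(true); // keep HTML parser warnings quiet
$html = new DOMDocument();
$html->loadHTML('<p><br><a href="/?foo=1&amp;bar=2">href</a></p>');

$xml = new DOMDocument();
$xml->loadXML('<root/>');

// Copy the parsed <p> subtree into the XML document and attach it to <root>.
$p = $html->getElementsByTagName('p')->item(0);
$xml->documentElement->appendChild($xml->importNode($p, true));

// Serialize with XML rules: <br> is written as <br/> and "&" stays escaped.
$converted = $xml->saveXML($xml->documentElement);
print $converted;
```

Passing the root element to saveXML() leaves out the XML declaration, so only the converted markup is printed.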

The new methods should probably be implemented using this approach as well. You're
very welcome to try implementing them yourself if you want. The only important
thing to keep in mind is to leave room for easy charset conversion to be
implemented in the future. If you have any questions, feel free to mail me
directly or post on the Google group.

Original comment by tobiasz....@gmail.com on 14 Nov 2009 at 12:08

GoogleCodeExporter commented 9 years ago
Thanks so much for the quick reply. Here's what I've implemented from your
recommendation and it works like a charm:

$t = phpQuery::newDocument($template_html);
// ... manipulation code ..
$final = phpQuery::newDocumentXML('<root/>');
print $final['root']->append($t)->html();

This worked perfectly. It output just the contents of <root/> and allowed the page to
validate. I've not done speed tests at this point, but I'd assume it's quite a bit
faster as there's a good bit less parsing going on.

Thanks for the tip. I'd think (while maybe a hack) it'd be easy enough to
implement somewhere within phpQuery.

If I can help, please let me know. I really enjoy phpQuery and look forward to using
it more.

Thanks again for the quick help.

Original comment by bigbluehat on 17 Nov 2009 at 8:00

GoogleCodeExporter commented 9 years ago
New issue (which I'll file), but it's related to this fix. Numeric entities (and
maybe others) are being translated into their character equivalents and output as
literal characters, which breaks validation. The page renders fine, though.

Original comment by bigbluehat on 17 Nov 2009 at 8:25

GoogleCodeExporter commented 9 years ago
Issue 137 has some sample code and a little more test result info.

Original comment by bigbluehat on 17 Nov 2009 at 8:42

GoogleCodeExporter commented 9 years ago
Another issue related to this hack is that textarea fields end up being "open" if
they're empty.

Meaning, if code is added to the HTML that contains something like this:
<textarea id="comments"></textarea><div class="button"><input....</div>

That gets turned into:
<textarea id="comments"><div class="button"><input....</div>

Which, of course, means the rest of the document is loaded into the textarea rather
than as normal HTML.

I'm still hunting down a workaround.
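
One possible explanation, sketched with PHP's built-in DOMDocument rather than phpQuery itself: XML serialization collapses an empty element into a self-closing tag, and an HTML parser does not treat <textarea/> as closed. Whether phpQuery serializes this way internally is an assumption, and the sample markup and variable names are illustrative only. LIBXML_NOEMPTYTAG is a standard libxml option accepted by saveXML().

```php
<?php
// Illustrative markup only; this uses DOMDocument directly, not phpQuery.
$doc = new DOMDocument();
$doc->loadXML('<root><textarea id="comments"></textarea><div class="button">x</div></root>');

$default  = ''; // default XML serialization of the children of <root>
$expanded = ''; // same nodes, with empty elements written as open/close pairs

foreach ($doc->documentElement->childNodes as $child) {
    $default  .= $doc->saveXML($child);
    $expanded .= $doc->saveXML($child, LIBXML_NOEMPTYTAG);
}

print $default . "\n";
print $expanded . "\n";
```

Note that LIBXML_NOEMPTYTAG expands every empty element, so void elements such as br would also be written with a separate closing tag; it is a direction to test, not a finished fix.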

Original comment by bigbluehat on 25 Nov 2009 at 3:35

GoogleCodeExporter commented 9 years ago
I'm afraid this is really hacky, but it does the trick:

// $t is the (X)HTML phpQuery object of the processed template
$final = phpQuery::newDocumentXML('<root/>');
$really_final = phpQuery::newDocumentXHTML($final['root']->append($t)->html());
print $really_final;

Basically, re-running it through phpQuery a third time got things into the
destination format (XHTML).

Not ideal from a speed perspective, but the page displays correctly. :)

If you can point me in the right direction phpQuery code-wise, I'd be happy to help
fix this bug. phpQuery's increasingly part of my set of tools.

Thanks.

Original comment by bigbluehat on 25 Nov 2009 at 3:44

GoogleCodeExporter commented 9 years ago
Found a new sub-issue related to creating XHTML this way: inline scripts get
<![CDATA[ and ]]> added to them automatically.

This, at least for me, is causing JavaScript syntax errors. There are already CDATA
"tags" in the form of
// <![CDATA[
and
// ]]>
which are more "friendly."

Not sure yet how to get around this, but ideally the additional CDATA markers
wouldn't be added at all.

Thoughts?

Original comment by bigbluehat on 9 Mar 2010 at 9:42