benibela / internettools

XPath/XQuery 3.1 interpreter for Pascal with compatibility modes for XPath 2.0/XQuery 1.0/3.0, custom and JSONiq extensions, pattern matching, XML/HTML/JSON parsers and classes for HTTP/S requests
http://www.benibela.de/sources_en.html#internettools

How to replace Node innerHTML and save it to new string? #22

Open TangMonk opened 4 years ago

TangMonk commented 4 years ago

Is it possible to replace a node's innerHTML, like this:

for node in process(everydayHtmlString, '//table[@id="zhuye" or @id="fuye"]') do
begin
  if node.toNode.getAttribute('id') = 'zhuye' then
  begin
    node.toNode.innerHTML := aStringContainHTML;
  end;
end;

And what is the proper way to update the everydayHtmlString variable with the modified HTML?

benibela commented 4 years ago

No, once the document is created, it cannot be changed anymore.

You can use the transform function in XQuery to create a new document (this needs xquery_json in the uses clause):

process(everydayHtmlString, 'x:transform(., function($node) { if ($node/self::table[@id="zhuye"]) then <table>aStringContainHTML</table> else $node })');

TangMonk commented 4 years ago

@benibela how about adding a setInnerHTML method to Node? It would be more convenient.

TangMonk commented 4 years ago

I have used WinForms in C#, where there is a library called Html Agility Pack. It is really convenient and easy to use for parsing a web page.

benibela commented 4 years ago

I do not want people to believe they can change individual nodes. They might then try multithreading and change different nodes in different threads, and the program would fail because you can only modify the entire document.

Every node has an index, first node might be 10, second node 20, third node 33, ...

If you add a node, the indices of all nodes after it have to be updated.
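
To illustrate the point about node indices, here is a minimal Free Pascal sketch. It is not the library's actual data structure: the record TSketchNode, its fields Tag and NodeIndex, and the function InsertNode are all hypothetical names made up for this example. It only assumes that nodes carry increasing indices in document order (10, 20, 33 as in the comment above) and shows that inserting one node forces every later node to be renumbered.

```pascal
program NodeIndexSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical record for illustration, not the library's node type.
  TSketchNode = record
    Tag: string;
    NodeIndex: Integer; // increases in document order
  end;
  TSketchNodes = array of TSketchNode;

// Inserts a new node before the existing node at array position Pos
// (requires 0 <= Pos < Length(Nodes)). The new node takes over the index
// of the node it displaces; every later node is shifted by Width.
// Returns how many existing nodes had to be renumbered.
function InsertNode(var Nodes: TSketchNodes; Pos: Integer;
  const ATag: string; Width: Integer): Integer;
var
  i: Integer;
begin
  Result := 0;
  SetLength(Nodes, Length(Nodes) + 1);
  // Move each later node one slot to the right and renumber it.
  for i := High(Nodes) downto Pos + 1 do
  begin
    Nodes[i] := Nodes[i - 1];
    Inc(Nodes[i].NodeIndex, Width);
    Inc(Result);
  end;
  // Slot Pos still holds the old index, so only the tag changes.
  Nodes[Pos].Tag := ATag;
end;

var
  Nodes: TSketchNodes;
  Touched, i: Integer;
begin
  SetLength(Nodes, 3);
  Nodes[0].Tag := 'html';  Nodes[0].NodeIndex := 10;
  Nodes[1].Tag := 'body';  Nodes[1].NodeIndex := 20;
  Nodes[2].Tag := 'table'; Nodes[2].NodeIndex := 33;

  // Insert a node in front of 'body'; both later nodes must be renumbered.
  Touched := InsertNode(Nodes, 1, 'head', 5);

  for i := 0 to High(Nodes) do
    WriteLn(Nodes[i].Tag, ' ', Nodes[i].NodeIndex);
  WriteLn('nodes renumbered: ', Touched);
end.
```

In this sketch a single insertion near the start of the document touches every node after it, so two threads editing different nodes would both be rewriting the same shared indices. That is consistent with the argument above for only offering whole-document transformations.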