Closed toinbis closed 1 year ago
I'm also struggling with this @SimonSapin. I've been trying to traverse and replace Text nodes but can't seem to overwrite NodeRefs with something like node = new_node.
Hi @andrewbanchich, I have managed to craft a working sample. Give me some time, I will post it here in the next few hours.
This sounds similar to https://github.com/kuchiki-rs/kuchiki/issues/62. Except that new_p_element here is a string, so you’d need to parse it first.
Code like node = new_node only assigns to a local variable and does not mutate the tree.
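To make that distinction concrete, here is a minimal std-only sketch. The Node and Handle types and the helper functions are hypothetical stand-ins invented for illustration, not kuchiki's API; the only assumption carried over is that a node handle behaves like a reference-counted pointer into a shared tree.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical toy tree node; NOT kuchiki's real types.
struct Node {
    text: String,
    children: Vec<Handle>,
}

// A handle is a shared, reference-counted pointer to a node.
type Handle = Rc<RefCell<Node>>;

fn new_node(text: &str) -> Handle {
    Rc::new(RefCell::new(Node {
        text: text.to_owned(),
        children: Vec::new(),
    }))
}

/// Collects the text of every direct child, in order.
fn child_texts(parent: &Handle) -> Vec<String> {
    parent
        .borrow()
        .children
        .iter()
        .map(|c| c.borrow().text.clone())
        .collect()
}

/// Rebinds a local handle only: the parent's child list is untouched.
fn rebind_local(parent: &Handle, replacement: &Handle) {
    for child in parent.borrow().children.iter() {
        let mut node = Rc::clone(child);
        if node.borrow().text == "old" {
            // Only the local variable `node` now points elsewhere.
            node = Rc::clone(replacement);
        }
        drop(node);
    }
}

/// Mutates the tree itself: the parent's slot now holds the replacement.
fn replace_in_parent(parent: &Handle, replacement: &Handle) {
    for slot in parent.borrow_mut().children.iter_mut() {
        if slot.borrow().text == "old" {
            *slot = Rc::clone(replacement);
        }
    }
}

fn main() {
    let parent = new_node("parent");
    parent.borrow_mut().children.push(new_node("old"));
    let replacement = new_node("new");

    rebind_local(&parent, &replacement);
    println!("{:?}", child_texts(&parent)); // still contains "old"

    replace_in_parent(&parent, &replacement);
    println!("{:?}", child_texts(&parent)); // now contains "new"
}
```

In this toy model the tree only changes when the parent's own child list is written to, which is why an API-level operation on the tree (rather than an assignment to a local) is needed.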
That's the issue I'm having. There's no way to swap one node with another? #62 describes wrapping one element in another but what if we want to just mutate or replace one element without wrapping it in anything?
I wrote similar, not identical. Please read my comment there and adjust the steps for what you’re trying to do.
More generally, please have a look at the methods on https://docs.rs/kuchiki/0.7.3/kuchiki/struct.NodeRef.html and other parts of the API and consider how you can combine them.
Hi Andrew,
this is an example of how to swap one element with another. Source code for main.rs:
use html5ever::{interface::QualName, local_name, namespace_url, ns};
use kuchiki::{traits::*, Attribute, ExpandedName, NodeRef};

pub fn make() -> String {
    let text = "
<html>
<head></head>
<body>
<p class='foo'>Hello, world!</p>
<p class='foo'>I love HTML</p>
</body>
</html>";
    let document = kuchiki::parse_html().one(text);
    let paragraph = document.select("p").unwrap().collect::<Vec<_>>();
    for element in paragraph {
        let par = NodeRef::new_element(
            QualName::new(None, ns!(html), local_name!("p")),
            Some((
                ExpandedName::new("", "class"),
                Attribute {
                    prefix: None,
                    value: "newp".to_owned(),
                },
            )),
        );
        par.append(NodeRef::new_text("My new text"));
        element.as_node().insert_after(par);
        element.as_node().detach();
    }
    document.to_string()
}

pub fn main() {
    println!("{}", make())
}
My Cargo.toml is as follows:
[package]
name = "kuchikidemo4"
version = "0.1.0"
authors = [""]
edition = "2018"
[features]
stdweb = [ "instant/stdweb" ]
[dependencies]
html5ever = "0.23.0"
kuchiki = "0.7.3"
markup5ever="0.8.1"
The output of cargo run is:
$ cargo run
Compiling kuchikidemo4 v0.1.0 (<..>/rust_projects/kuchikidemo4)
Finished dev [unoptimized + debuginfo] target(s) in 2.71s
Running `target/debug/kuchikidemo4`
<html><head></head>
<body>
<p class="newp">My new text</p>
<p class="newp">My new text</p>
</body></html>
Please let me know if you manage to compile the above code successfully or if you have any questions.
Thanks @toinbis! Here is an example of what I'm trying to get working:
use html5ever::{interface::QualName, namespace_url, ns, LocalName};
use kuchiki::{traits::*, NodeRef, iter::NodeEdge, NodeData};

pub fn main() {
    let html = "
<html>
<head></head>
<body>
<p class='foo'>Hello, world!</p>
<p class='foo'>I love HTML.</p>
</body>
</html>";
    let doc = kuchiki::parse_html().one(html);
    doc.traverse().for_each(|node| {
        if let NodeEdge::Start(node) = node {
            // if it's text, look for some content and wrap any matches with an element
            if let NodeData::Text(text) = node.data() {
                let mut new_nodes = Vec::new();
                new_nodes.push(NodeRef::new_text("I "));
                // add match
                let wrapper = NodeRef::new_element(
                    QualName::new(None, ns!(html), LocalName::from("data-contains-love")),
                    None,
                );
                wrapper.append(NodeRef::new_text("love"));
                new_nodes.push(wrapper);
                new_nodes.push(NodeRef::new_text(" HTML."));
                match node.next_sibling() {
                    Some(sibling) => {
                        new_nodes.into_iter().for_each(|n| {
                            sibling.insert_before(n);
                        })
                    },
                    None => {
                        let parent = node.parent().unwrap();
                        new_nodes.into_iter().for_each(|n| {
                            parent.append(n);
                        })
                    }
                }
                node.detach();
            }
        }
    });
    dbg!(doc.to_string());
}
I can't create a new parent element because I don't know ahead of time what the HTML will look like. The code works, but doesn't detach the current node.
This is what I get as a result:
<html><head></head>I <data-contains-love>love</data-contains-love> HTML.<body>\n <p class=\"foo\">Hello, world!</p>\n
<p class=\"foo\">I love HTML.</p>\n\n
</body></html>
Any thoughts on what the issue is?
Thanks!
Hi, @andrewbanchich - I guess you might be interested in checking out https://github.com/cloudflare/lol-html which was released today (more info https://blog.cloudflare.com/html-parsing-1/).
Thanks! I ended up rewriting my code to be a recursive function that just reconstructs the entire tree from scratch and it's working now.
This looks excellent though!
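For readers curious what a "reconstruct the entire tree" approach can look like, here is a rough std-only sketch. It is not the code referred to above, and it does not use kuchiki: the Node enum and the rebuild, wrap_matches, and to_html functions are hypothetical names invented for illustration. The idea is that each text node is replaced by a list of new nodes while the tree is rebuilt, so nothing is mutated in the middle of a traversal.

```rust
// Hypothetical toy document model; NOT kuchiki's real types.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Text(String),
    Element { name: String, children: Vec<Node> },
}

/// Rebuilds `node` recursively and returns the nodes that take its place.
fn rebuild(node: &Node, needle: &str, wrapper: &str) -> Vec<Node> {
    match node {
        // A text node may expand into several nodes.
        Node::Text(text) => wrap_matches(text, needle, wrapper),
        // An element is recreated with rebuilt children.
        Node::Element { name, children } => vec![Node::Element {
            name: name.clone(),
            children: children
                .iter()
                .flat_map(|c| rebuild(c, needle, wrapper))
                .collect(),
        }],
    }
}

/// Splits `text` around each occurrence of `needle`, wrapping every match
/// in an element named `wrapper`.
fn wrap_matches(text: &str, needle: &str, wrapper: &str) -> Vec<Node> {
    // An empty needle would match everywhere; return the text unchanged.
    if needle.is_empty() {
        return vec![Node::Text(text.to_owned())];
    }
    let mut out = Vec::new();
    let mut rest = text;
    while let Some(pos) = rest.find(needle) {
        if pos > 0 {
            out.push(Node::Text(rest[..pos].to_owned()));
        }
        out.push(Node::Element {
            name: wrapper.to_owned(),
            children: vec![Node::Text(needle.to_owned())],
        });
        rest = &rest[pos + needle.len()..];
    }
    if !rest.is_empty() {
        out.push(Node::Text(rest.to_owned()));
    }
    out
}

/// Serializes the toy tree (no attributes, no escaping).
fn to_html(node: &Node) -> String {
    match node {
        Node::Text(t) => t.clone(),
        Node::Element { name, children } => {
            let inner: String = children.iter().map(to_html).collect();
            format!("<{0}>{1}</{0}>", name, inner)
        }
    }
}

fn main() {
    let p = Node::Element {
        name: "p".to_owned(),
        children: vec![Node::Text("I love HTML.".to_owned())],
    };
    let rebuilt = rebuild(&p, "love", "data-contains-love");
    // prints: <p>I <data-contains-love>love</data-contains-love> HTML.</p>
    println!("{}", to_html(&rebuilt[0]));
}
```

Returning a Vec from rebuild is the design choice that lets one text node become several siblings without the caller needing to know the surrounding structure in advance.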
@andrewbanchich @toinbis is this related to #64 ? Would closing that PR solve this issue?
@Ygg01 Yep! If you think my PR is a good solution for this then definitely.
I will soon archive this repository and make it read-only, so this issue will not be addressed: https://github.com/kuchiki-rs/kuchiki#archived
Hi,
I have working code:
Instead of detaching/removing the p elements, I'd like to replace them with the element that is defined in new_p_element. How would I achieve something like element.as_node.replace(&new_p_element), just with code that actually compiles? Thanks!