taoqf / node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.
MIT License

Line breaks are ignored by innerText #249

Closed 1RandomDev closed 1 year ago

1RandomDev commented 1 year ago

When reading the innerText property on elements that contain line breaks (<br>), the line breaks are ignored in the output. A regular browser converts them to \n characters. For example, Hello<br>World results in HelloWorld, but the expected output is Hello\nWorld.

zyrouge commented 1 year ago

+1

taoqf commented 1 year ago

I'm afraid I cannot reproduce this behavior in Chrome. My test code looks like this:

const div = document.createElement('div');
div.innerHTML = 'Hello, <br>World!';
console.log(div.innerText);

taoqf commented 1 year ago

I added a new branch for this issue, but I don't think we should merge it, because Chrome behaves the same way.

1RandomDev commented 1 year ago

I'm afraid I cannot reproduce this behavior in Chrome. My test code looks like this:

const div = document.createElement('div');
div.innerHTML = 'Hello, <br>World!';
console.log(div.innerText);

It appears that the line break is only converted if the element is attached to the DOM. With the following code it works as expected:

const div = document.createElement('div');
document.body.appendChild(div);
div.innerHTML = 'Hello, <br>World!';
console.log(div.innerText);
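
To illustrate the difference without a browser, here is a minimal sketch in plain JavaScript. It assumes a simplified tree model (strings for text nodes, `{ tag, children }` objects for elements), and the helper names `textContentOf` and `innerTextOf` are hypothetical: this is neither the browser algorithm nor the node-html-parser API. It only models the one rule discussed here, namely that an element which is not being rendered returns the same text as textContent, while a rendered one turns <br> into \n. Whitespace collapsing and block-level handling are left out.

```javascript
// Hypothetical, simplified tree model: text nodes are strings,
// elements are { tag, children } objects. Not the node-html-parser API.
function textContentOf(node) {
  if (typeof node === 'string') return node;
  // Elements contribute only the text of their descendants, so <br> adds nothing.
  return node.children.map(textContentOf).join('');
}

// Sketch of the innerText rule discussed above: an element that is not
// rendered falls back to textContent; a rendered one maps <br> to "\n".
function innerTextOf(node, rendered) {
  if (!rendered) return textContentOf(node);
  if (typeof node === 'string') return node;
  if (node.tag === 'br') return '\n';
  return node.children.map((child) => innerTextOf(child, true)).join('');
}

const tree = { tag: 'div', children: ['Hello', { tag: 'br', children: [] }, 'World'] };

console.log(JSON.stringify(innerTextOf(tree, false))); // "HelloWorld"
console.log(JSON.stringify(innerTextOf(tree, true)));  // "Hello\nWorld"
```

Since a parser running in Node never renders anything, it has to pick one of these two behaviors; the report above asks for the second one.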

taoqf commented 1 year ago

Merged.

taoqf commented 1 year ago

And thank you for the feedback.