Fix text extraction for lexbor

rushter / selectolax

Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors).

MIT License


Fix text extraction for lexbor #44

Closed rushter closed 3 years ago

rushter commented 3 years ago

In some cases, text extraction segfaults.

lexborisov commented 3 years ago

@rushter

Is this a problem on my side?

rushter commented 3 years ago

> @rushter
>
> Is this a problem on my side?

I don't think so, but I need to check. I don't use lexbor's text extraction function because it lacks a separator parameter (whitespace or newline), which is very useful for real-world HTML, where whitespace can be added via CSS styles.
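To illustrate why a separator matters, here is a minimal sketch using only Python's standard library. It is not the selectolax or lexbor implementation; `extract_text` and its `separator` argument are hypothetical names chosen for this example. Adjacent inline elements carry no whitespace in the markup, so joining their text nodes without a separator glues the words together.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the text nodes of a document in document order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called once per run of text between tags.
        self.chunks.append(data)


def extract_text(html, separator=""):
    """Join all text nodes with `separator` (hypothetical helper)."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return separator.join(collector.chunks)


# The spans might be visually separated by CSS, but the markup has no whitespace.
html = "<div><span>Hello</span><span>world</span></div>"
no_sep = extract_text(html)                   # "Helloworld"
with_sep = extract_text(html, separator=" ")  # "Hello world"
print(repr(no_sep), repr(with_sep))
```

Without the separator, the caller cannot tell where one element's text ends and the next begins, which is the limitation described above.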