chatnoir-eu / chatnoir-resiliparse

A robust web archive analytics toolkit
https://resiliparse.chatnoir.eu
Apache License 2.0

svg caused lexbor to crash #39

Closed: prnake closed this issue 6 months ago

prnake commented 6 months ago

MRE:

from resiliparse.parse.html import HTMLTree
str(HTMLTree.parse("<svg><template>\n"))

It causes a segmentation fault, and the trace shows that it was caused by lxb_html_serialize_node_cb.
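
Because a segmentation fault kills the whole interpreter, it can help to run the MRE in a child process and inspect the exit status instead. The following is a minimal sketch, not part of resiliparse: the names `run_isolated` and `MRE` are hypothetical, and it assumes resiliparse is installed for the same interpreter that runs the script.

```python
import subprocess
import sys

# The MRE from this report, kept as a string so that it runs in a child
# process and cannot take down the calling interpreter.
MRE = (
    "from resiliparse.parse.html import HTMLTree\n"
    "str(HTMLTree.parse('<svg><template>\\n'))\n"
)


def run_isolated(code: str, timeout: float = 30.0) -> int:
    """Run `code` in a fresh Python interpreter and return its exit status.

    On POSIX, a negative value -N means the child was terminated by
    signal N (a segmentation fault is reported as SIGSEGV).
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # keep the child's output off our console
        timeout=timeout,
    )
    return proc.returncode


if __name__ == "__main__":
    status = run_isolated(MRE)
    if status < 0:
        print(f"child was killed by signal {-status}")
    else:
        print(f"child exited with status {status}")
```

A non-zero positive status usually means an ordinary Python error in the child (for example, resiliparse not being installed), while a negative status on POSIX points to a crash in native code such as the one described here.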

CC @lexborisov

prnake commented 6 months ago

I think the problem has been solved by https://github.com/lexbor/lexbor/commit/0c20fa99e3bf41b523a475212f747480fab848d0 proposed in https://github.com/rushter/selectolax/issues/91 , but resiliparse is using a much older version.

phoerious commented 6 months ago

Thanks. I've updated the dependencies and tested that it works now. Just waiting for the Docker images to rebuild, then I'll publish a new tag.