rushter / selectolax

Python binding to Modest and Lexbor engines (fast HTML5 parser with CSS selectors).
MIT License

Selectolax couldn't load large html string (87MB) but lxml could #109

Closed: bengabp closed this issue 8 months ago

bengabp commented 8 months ago

In my scraper, I am dealing with large HTML strings, and I have run into an issue: selectolax is not able to load an HTML string that is about 87 MB in size. I tried lxml on the same string and it loaded it in about 2 seconds.
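A comparison like the one described above could be reproduced with a small timing harness. This is only a sketch: `time_parse` and `available_parsers` are hypothetical helper names, and it assumes the usual entry points `selectolax.parser.HTMLParser` and `lxml.html.fromstring`, skipping whichever library is not installed.

```python
import time
from typing import Callable, Dict, Optional, Tuple


def time_parse(
    parse: Callable[[str], object], html: str
) -> Tuple[Optional[object], float, Optional[Exception]]:
    """Call parse(html) and return (result, elapsed_seconds, error_or_None)."""
    start = time.perf_counter()
    try:
        result, error = parse(html), None
    except Exception as exc:  # report the failure instead of stopping the comparison
        result, error = None, exc
    return result, time.perf_counter() - start, error


def available_parsers() -> Dict[str, Callable[[str], object]]:
    """Collect the parsers that can be imported in this environment."""
    parsers: Dict[str, Callable[[str], object]] = {}
    try:
        from selectolax.parser import HTMLParser
        parsers["selectolax"] = HTMLParser
    except ImportError:
        pass
    try:
        import lxml.html
        parsers["lxml"] = lxml.html.fromstring
    except ImportError:
        pass
    return parsers


if __name__ == "__main__":
    # Small synthetic document; substitute the real 87 MB string to reproduce the report.
    html = "<html><body>" + "<p>row</p>" * 1000 + "</body></html>"
    for name, parse in available_parsers().items():
        _, elapsed, error = time_parse(parse, html)
        status = "ok" if error is None else f"failed: {error!r}"
        print(f"{name}: {elapsed:.3f}s ({status})")
```

Catching the exception rather than letting it propagate keeps both results in one run, which makes it easier to attach the exact error text to a report.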

rushter commented 8 months ago

That limit is artificial; I've increased it. Some people tried to load 5000 MB of binary data by accident and complained about it in the past.

rushter commented 8 months ago

The most recent version can now accept up to 2.4 GB of HTML. It will be on PyPI in a few hours.
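For callers who want to fail early with a clear message, a size check can run before the document reaches the parser. The sketch below is an illustration only: `ASSUMED_LIMIT_BYTES` and `check_size` are hypothetical names, and the byte value is an assumed reading of the "2.4 GB" figure above, not the constant used inside selectolax.

```python
from typing import Union

# Assumption: "2.4 GB" read as 2.4 * 10**9 bytes. The real limit in the
# library may be defined differently.
ASSUMED_LIMIT_BYTES = 2_400_000_000


def html_size_bytes(html: Union[str, bytes]) -> int:
    """Return the size of the document in bytes (UTF-8 for str input)."""
    if isinstance(html, str):
        # Note: encoding makes a temporary copy of the whole string.
        return len(html.encode("utf-8"))
    return len(html)


def check_size(html: Union[str, bytes], limit: int = ASSUMED_LIMIT_BYTES) -> int:
    """Raise ValueError if the document exceeds the limit; otherwise return its size."""
    size = html_size_bytes(html)
    if size > limit:
        raise ValueError(f"HTML is {size} bytes, above the assumed limit of {limit} bytes")
    return size
```

A check like this also catches the accidental case mentioned earlier in the thread, where a very large binary payload is passed in place of HTML.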

bengabp commented 8 months ago

I am still getting this error even with the update.
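One thing worth ruling out when an error persists after an update is that the environment still has the older release installed. That is not confirmed as the cause here. The sketch below uses only the standard library to print the installed version; `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("selectolax:", installed_version("selectolax"))
```

Running this with the same interpreter that runs the scraper shows which release is actually being imported.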