gawel / pyquery

A jquery-like library for python
http://pyquery.rtfd.org/
Other
2.3k stars 182 forks

getting the text is very slow #254

Open alekssamos opened 11 months ago

alekssamos commented 11 months ago

Hello. First of all, I want to thank the author of this wonderful library. It is very good, I like it a lot, and it works much faster than beautifulsoup + lxml; the difference is palpable! In my experience, PyQuery is faster than the other libraries I have tried.

But there is one small problem: a call to $("selector").text() takes 10 to 25 ms on average. I have a huge volume of pages to process, and the code spends most of its time in the .text() function.
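A minimal sketch of how this timing might be reproduced. The HTML snippet, the "#content" selector, and the helper names mean_ms and benchmark_text are illustrative assumptions, not the code from the real workload, so the numbers it prints will differ from the 10 to 25 ms reported above.

```python
import timeit


def mean_ms(fn, number=100):
    """Return the mean wall-clock time of one fn() call, in milliseconds."""
    total_seconds = timeit.timeit(fn, number=number)
    return total_seconds / number * 1000.0


def benchmark_text(html, selector, number=100):
    """Time .text() on an already-parsed document (parsing is excluded)."""
    from pyquery import PyQuery  # imported here so mean_ms works without pyquery

    doc = PyQuery(html)
    selection = doc(selector)
    return mean_ms(selection.text, number=number)


if __name__ == "__main__":
    # Hypothetical sample page: one container with many small paragraphs.
    sample_html = (
        "<div id='content'>" + "<p>Hello <b>world</b></p>" * 500 + "</div>"
    )
    try:
        elapsed = benchmark_text(sample_html, "#content")
        print(f".text() mean: {elapsed:.3f} ms per call")
    except ImportError:
        print("pyquery is not installed; skipping the benchmark")
```

Timing only the .text() call, separately from parsing and selection, makes it easier to confirm that the time really is spent there.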

Is there any way to speed this up? Using a large number of parallel threads did not solve the problem: multithreaded mode is faster than single-threaded mode, but not by much.

I also noticed that it runs 1% faster on Python 3.12 than on 3.11.