rennat / pynliner

Python CSS-to-inline-styles conversion tool for HTML using BeautifulSoup and cssutils
http://pythonhosted.org/pynliner/

Speedup with lxml and tinycss+cssselect #29

Closed kevinastone closed 10 years ago

kevinastone commented 10 years ago

Not sure if this is of interest to you. Our pynliner calls were taking ~30s to execute (complex CSS from Bootstrap and others). I replaced BeautifulSoup with lxml and mostly substituted cssutils with a combination of tinycss and cssselect, all of which have C modules for speedup. The end result is that inlining goes down to about 5 seconds, a 6x improvement.

The code would need some real cleanup; right now it's mostly just substituting lines with different library calls. If you think there's a benefit in trying to merge this, let's talk. Otherwise, no problem, I'll just keep maintaining this fork for our specific use.

rennat commented 10 years ago

I like the direction you are going, and I have long-term goals of speeding this tool up, but when I created a fresh environment and ran the tests from this branch, many failed.

...
Ran 59 tests in 0.112s

FAILED (failures=17, errors=19)
...

We also need to add some tests that confirm this works with both HTML and XHTML. We currently support both with BeautifulSoup, and I would hate to give that up.

kevinastone commented 10 years ago

Yeah, it's certainly not ready to merge. This was just a quick wholesale replacement so I could use faster parsers for my specific use case. Depending on your appetite, I would encourage some sort of pluggable backend architecture where you could swap CSS and HTML parsing engines depending on needs (say, trading accuracy for speed).
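
To make the pluggable backend idea concrete, here is a minimal sketch of what such a split might look like. It is not pynliner's actual API: every name in it (`CssBackend`, `HtmlBackend`, `NaiveCssBackend`, `NaiveHtmlBackend`, `CSS_BACKENDS`, `HTML_BACKENDS`, `inline`) is hypothetical, and the "naive" backends are toy stand-ins that only handle flat rules and bare tag selectors. A real setup would presumably register a BeautifulSoup/cssutils pair and an lxml/tinycss/cssselect pair behind the same two interfaces.

```python
from typing import Dict, List, Protocol, Tuple

# Hypothetical rule shape: (selector, declarations), e.g. ("p", "color: red")
Rule = Tuple[str, str]


class CssBackend(Protocol):
    """Anything that can turn a stylesheet into a list of rules."""
    def parse(self, css: str) -> List[Rule]: ...


class HtmlBackend(Protocol):
    """Anything that can apply parsed rules to a document as inline styles."""
    def apply(self, html: str, rules: List[Rule]) -> str: ...


class NaiveCssBackend:
    """Toy parser: flat `selector { declarations }` blocks, no nesting or comments."""

    def parse(self, css: str) -> List[Rule]:
        rules: List[Rule] = []
        for block in css.split("}"):
            if "{" not in block:
                continue  # trailing whitespace or malformed fragment
            selector, body = block.split("{", 1)
            # Collapse whitespace and drop the trailing semicolon.
            declarations = " ".join(body.split()).rstrip(";")
            rules.append((selector.strip(), declarations))
        return rules


class NaiveHtmlBackend:
    """Toy applier: bare tag-name selectors on tags that have no attributes."""

    def apply(self, html: str, rules: List[Rule]) -> str:
        for selector, declarations in rules:
            opening = "<%s>" % selector
            styled = '<%s style="%s">' % (selector, declarations)
            html = html.replace(opening, styled)
        return html


# Registries keyed by name, so callers can pick an engine per call.
CSS_BACKENDS: Dict[str, CssBackend] = {"naive": NaiveCssBackend()}
HTML_BACKENDS: Dict[str, HtmlBackend] = {"naive": NaiveHtmlBackend()}


def inline(html: str, css: str, css_backend: str = "naive",
           html_backend: str = "naive") -> str:
    """Inline `css` into `html` using the named backends."""
    rules = CSS_BACKENDS[css_backend].parse(css)
    return HTML_BACKENDS[html_backend].apply(html, rules)


print(inline("<p>hi</p>", "p { color: red; }"))
```

A faster or more accurate engine would then be added by registering another object under a new key, and the existing test suite could be run once per registered backend to see which ones pass.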