philipp-winterle / crittr

High performance critical css extraction with great configuration abilities
https://hummal.github.io/crittr/
GNU General Public License v3.0

benchmark: also run Penthouse execution in parallel #3

Closed - pocketjoso closed this 4 years ago

pocketjoso commented 6 years ago

Hey! While the code looks clean from what I've seen so far, the benchmark has a bug. You were running Penthouse calls in sequence, and the other libraries in parallel.
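
Sketched out, the difference is roughly the following (a minimal sketch - the css path is illustrative, but calling penthouse once per url with a promise-returning `penthouse({ url, css })` is how the library is used):

```js
const penthouse = require('penthouse');

// before the fix: each Penthouse call waits for the previous one to finish
async function sequential(urls) {
  const results = [];
  for (const url of urls) {
    results.push(await penthouse({ url, css: 'example/site.css' }));
  }
  return results;
}

// after the fix: all calls start at once, as the other libraries were run
function parallel(urls) {
  return Promise.all(urls.map(url => penthouse({ url, css: 'example/site.css' })));
}
```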

[screenshot: benchmark results]

With this change, on my machine Penthouse is now faster than crittr... Perhaps the benchmark is not representative of the performance gains your library has made (i.e. too simple). In either case, I'd appreciate it if you updated your chart and communication, since it's currently not accurate.

Cheers

pocketjoso commented 6 years ago

Made the same fix for the criticalcss lib, which also reduces the timing for that library significantly (although it is, unlike Penthouse, afaik not at all optimised for parallel execution).

[screenshot: benchmark results]

Final note regarding the benchmark - not sure critical fits in there. It uses Penthouse for the critical css generation, which means it's both a duplicate of sorts and a sign that the current setup is likely flawed, since its execution timings should at best be comparable to Penthouse's (not faster).

philipp-winterle commented 6 years ago

I am not sure about that. I want to do an exact comparison. This means one call for the maximum number of urls you can deal with. In your provided example you are starting multiple Penthouse sessions at once, with a new browser every time. Am I right? That is not what I want to compare. Crittr runs one process for all urls and handles them internally. Penthouse, like every other lib, can only handle one url per process.

So the real comparison is to process one after another. Crittr is a power-user tool. Penthouse can't handle that many urls at the same time. That's why I wrote crittr. If I used your example in my real projects (as I did before), Penthouse would crash. It can't handle it. I also tried Penthouse with a pre-created puppeteer browser. Same result. But I will try again with your new version.

@critical has a different approach, and users don't always know about Penthouse when they use critical. Also, a wrapper around a library with speed x should always be slower than x.

pocketjoso commented 6 years ago

I want to do an exact comparison.

Me too 👍. I think the problem here is that you have branded it as being about execution time, but as you said yourself, your focus was on avoiding crashes during heavy workloads, i.e. stability. Penthouse has actually focused a lot on execution time, so it's not so surprising to me that it's fast(er). I think you would be better off thinking of a different benchmark, if a suitable one can be found at all.

Regarding execution time, however:

This means one call for the maximum number of urls you can deal with. In your provided example you are starting multiple Penthouse sessions at once, with a new browser every time. Am I right? That is not what I want to compare. Crittr runs one process for all urls and handles them internally. Penthouse, like every other lib, can only handle one url per process.

No, this is not right: Penthouse does re-use the same browser (maybe it didn't when you last used it, although it's been in master for a long time now). Concurrent jobs are processed in different browser tabs, and Chrome AFAIK runs each tab in a separate process, so what you said about processes is not true either.

Maybe I can clarify here. Yes, your library is set up so you can feed it multiple URLs at once. Penthouse is not; you have to call it multiple times. It might seem weird, since for regular builds you have all the urls available from the start, and it might seem like it would be slower. For me, a very important use case is to have great performance when starting jobs on the fly, for a service like http://jonassebastianohlsson.com/criticalpathcssgenerator. This is why I share state between different Penthouse calls, and why I've spent a lot of time working on great execution times with this use case in mind. I could change the API to allow for a urls parameter, but the saving that would yield would be absolutely negligible (compared to connecting to websites, opening browser tabs, taking screenshots (in my own use cases)...).

As I suggested above, I don't think execution time is a good fit for a benchmark for crittr - it's not where your library will shine. If you do want to keep it though, to make it fair I think you would have to do something like:

I still suspect that this benchmark is just not a good fit for crittr. Wdyt?

pocketjoso commented 6 years ago

@critical has a different approach, and users don't always know about Penthouse when they use critical. Also, a wrapper around a library with speed x should always be slower than x.

I agree. But in your benchmark it's ridiculously fast, probably because you're finishing the test before the execution is actually done.

pocketjoso commented 6 years ago

Just did a quick test setting concurrentTabs: urls.length for crittr to get a fairer comparison. It got closer to Penthouse, but is still slower...
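
(For reference, a sketch of what such a run might look like - the concurrentTabs option name is the one used in this thread, while the call shape and paths are illustrative, not crittr's confirmed API:)

```js
const crittr = require('crittr');

crittr({
  urls,                        // the same url list fed to Penthouse
  css: 'example/site.css',     // illustrative path to the shared stylesheet
  concurrentTabs: urls.length, // one tab per url, mirroring the parallel runs
}).then(result => {
  console.log(result.critical); // assuming the result exposes the critical css
});
```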

[screenshot: benchmark results]

pocketjoso commented 6 years ago

Ran a few more tests with a single url, with 10, and with 40 (still concurrentTabs: urls.length for crittr). Penthouse is always faster. Now, the benchmark is a bit of an artificial example with dummy local html... but it's not in favour of crittr. (I should say that this will also depend on your machine.)

[screenshot: benchmark results]

pocketjoso commented 6 years ago

I do think your library looks well designed from what I've seen - don't get me wrong here. I'm just talking about the benchmark. :)

pocketjoso commented 6 years ago

Changed the benchmark to run against real urls (dn.se, theguardian.co.uk, stripe.com, jonassebastianohlsson.com)... same story, Penthouse is faster.

[screenshot: benchmark results]

Also noted that crittr crashes/times out for some urls that Penthouse handles. I don't know if you care, but yeah, I've also spent a lot of time on this over the years, loading weird sites (I know, unlike your exact use case, which I have not optimised for). One example that works in Penthouse but not in crittr: https://www.forbesindustries.com/

philipp-winterle commented 6 years ago

Thanks for your feedback. I made some of the changes you suggested above. As seen in your screenshots, I got the same Node warning. Do you think it is good to ignore this warning? It seems to me that Penthouse is not designed to work the way you described above. Nevertheless, if I run Penthouse in parallel as you suggested, it runs faster. But I wouldn't use it if it shows me a memory leak warning. That is not a use case for me. You should get rid of these warnings, and not just by raising the max event listener count to Infinity (or urls.length). I remember encountering this problem while working with Penthouse. You also cannot use your example to deal with all the urls given. Imagine 100 urls: your system will be unresponsive at 100% load. Concurrency is the key.

The benchmark, however, was made with version 1.0.0. Version 1.1.0 includes more features, like returning the gracefully calculated remaining css. I guess that's the main reason for the slowdown. With your example, Penthouse does in fact run faster right now: I get around 1.8s vs 2.8s. But in my opinion this is using Penthouse for something it is not designed for. That was my reason for writing crittr.

The main argument why your example is not completely comparable: if I feed 10 urls and 1 css file into crittr, it gives me the delta (nearly - subselector matching is incoming) of all the urls' critical css. This means: if we want an equivalent comparison, I need to add the process of building the delta of all the css returns before stopping Penthouse's timer - which is clearly O(n²) in the size of the css file and the number of urls. Penthouse just gives me the critical css for every single url, so I will have 10 (maybe) different css files after the execution. If I use Penthouse this way, I can't use the css immediately afterwards: I need to merge all 10 critical css files together and remove the duplicates. You can imagine the effort of this task. It is what costs the most performance in crittr, and it is still in development to produce better results. Also, the more urls and the larger the css you have to process, the longer this step takes - and the more crittr will shine.
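
To make the distinction concrete: dropping exact duplicates is cheap, while the subselector matching described above is what forces pairwise comparisons. A rough sketch (illustrative only, not crittr's actual code - the rule splitting is deliberately naive and `overlaps` is a hypothetical predicate):

```js
// exact-duplicate removal: one Set lookup per rule, roughly linear overall
function dedupeExact(allCss) {
  const seen = new Set();
  return allCss
    .split('}')
    .map(rule => rule.trim())
    .filter(rule => rule !== '' && !seen.has(rule) && Boolean(seen.add(rule)))
    .map(rule => rule + '}')
    .join('\n');
}

// subselector matching: every rule is compared against every other rule,
// which is where the O(n²) behaviour described above comes from
function findOverlaps(rules, overlaps) {
  const pairs = [];
  for (let i = 0; i < rules.length; i++) {
    for (let j = i + 1; j < rules.length; j++) {
      if (overlaps(rules[i], rules[j])) pairs.push([rules[i], rules[j]]);
    }
  }
  return pairs;
}
```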

I was building a test for your website and found that you also use just one css file for all pages on your site. This is a very good example of why to use crittr 👍. I take your css and put in all your urls, and you get the critical path css for all of your pages, plus the remaining css as well. No need to include the full css on every page. No need to calculate the critical css for every page.

Also noted that crittr crashes/times out for some urls that Penthouse handles. I don't know if you care, but yeah, I've also spent a lot of time on this over the years, loading weird sites (I know, unlike your exact use case, which I have not optimised for).

I don't get a single crash for any url. Even for https://www.forbesindustries.com/ it works flawlessly. Maybe you had problems with your internet connection or carrier, or maybe different puppeteer instances were interrupting each other (Penthouse vs crittr Chromium).

Btw: I ran the test with https://www.forbesindustries.com/ and found that Penthouse needs 2.5s while crittr needs 5.6s, which is a lot more. After visiting the site multiple times in my browser, I found that the pageload event fires at around 3.5 seconds. Are you sure you wait until the page has fully loaded? I wait until pageload to ensure I get the intended page layout.
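
"Waiting until pageload" corresponds to puppeteer's waitUntil option - a minimal sketch (the url is the one from this thread, everything else is illustrative):

```js
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // 'load' resolves only after the page's load event has fired (the ~3.5s
  // point mentioned above); 'domcontentloaded' returns earlier and can miss
  // layout that is applied late in the load
  await page.goto('https://www.forbesindustries.com/', { waitUntil: 'load' });
  await browser.close();
})();
```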

But I understand your concerns about the benchmark results. If you can provide an example that fits my use case, I will update the chart.

So please update the PR, or close it if you don't want to update it.

pocketjoso commented 6 years ago

Hi again. Just for context, since it's been a while since we last discussed: I run tens of thousands of critical css generation jobs daily, and I have detailed stats on the execution of Penthouse, both in terms of accuracy and timings. I have been using queues, screenshots, css merging, de-duping, minifying etc. for years. The reason it's not baked into Penthouse is that I prefer the UNIX philosophy of modular parts. It's easy to combine Penthouse with another library (for a queue, minifying etc.). They can focus on making their library great, and you can focus on the thing your library does specifically. By not including more in Penthouse I can also keep it simpler and smaller. Enough about that. :)

As seen in your screenshots, I got the same Node warning. Do you think it is good to ignore this warning? It seems to me that Penthouse is not designed to work the way you described above.

I would advise anyone who wants to run Penthouse with many urls to implement a simple queue, as I have linked to here before. I have already given the reasons I didn't implement a queue inside Penthouse. I have, however, lately given this more thought, and have considered implementing one just for the sake of setting a default cap on the number of concurrent jobs, so it "just works" for people who expect it to when throwing concurrent jobs at it.
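
(The example linked above is not reproduced here, but a concurrency-capped queue in that spirit can be this small - an illustrative sketch, not the actual linked example code:)

```js
const penthouse = require('penthouse');

// run Penthouse over many urls with at most maxConcurrent jobs in flight
async function generateAll(urls, css, maxConcurrent = 5) {
  const queue = [...urls];
  const results = {};
  async function worker() {
    let url;
    while ((url = queue.shift()) !== undefined) {
      results[url] = await penthouse({ url, css });
    }
  }
  // start a fixed number of workers that pull urls off the shared queue
  await Promise.all(Array.from({ length: maxConcurrent }, worker));
  return results;
}
```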

Nevertheless, if I run Penthouse in parallel as you suggested, it runs faster. But I wouldn't use it if it shows me a memory leak warning. That is not a use case for me. You should get rid of these warnings, and not just by raising the max event listener count to Infinity (or urls.length). I remember encountering this problem while working with Penthouse. You also cannot use your example to deal with all the urls given. Imagine 100 urls: your system will be unresponsive at 100% load. Concurrency is the key.

Handling 100 urls is not difficult, since we are just talking about setting a limit on the maximum number of concurrent jobs. Any queue can handle that, like my example queue, with almost no code at all. I remember telling you to do this the last time you were working with Penthouse, and providing you example code for it. That was before I added the examples to the repo; I added them because talking to you showed me they were needed, so thanks for that! :)

The main argument why your example is not completely comparable: if I feed 10 urls and 1 css file into crittr, it gives me the delta (nearly - subselector matching is incoming) of all the urls' critical css. This means: if we want an equivalent comparison, I need to add the process of building the delta of all the css returns before stopping Penthouse's timer - which is clearly O(n²) in the size of the css file and the number of urls. Penthouse just gives me the critical css for every single url, so I will have 10 (maybe) different css files after the execution. If I use Penthouse this way, I can't use the css immediately afterwards: I need to merge all 10 critical css files together and remove the duplicates. You can imagine the effort of this task. It is what costs the most performance in crittr, and it is still in development to produce better results. Also, the more urls and the larger the css you have to process, the longer this step takes - and the more crittr will shine.

I'm sorry, but this is exactly what I think is a bad idea to spend engineering time on. You are re-inventing the wheel... there are plenty of libraries that do this, and no, it's not O(n²): a library like clean-css takes milliseconds to merge and dedupe css. You should try it - I've been a happy user for a long time - it's just not baked into Penthouse.
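
(For the record, the suggested clean-css usage is roughly this - a sketch, where criticalCssFiles stands in for the per-page results and is illustrative:)

```js
const CleanCSS = require('clean-css');

// concatenate the per-page critical css and let clean-css's level 2
// optimisations merge compatible rules and drop duplicates
const output = new CleanCSS({ level: 2 }).minify(criticalCssFiles.join('\n'));
if (output.errors.length) console.error(output.errors);
console.log(output.styles);
```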

I believe we have already discussed critical css for individual pages vs combining. To refresh:

I was building a test for your website and found that you also use just one css file for all pages on your site. This is a very good example why to use crittr 👍. I take your css and put in all your urls. You get the critical path for all of you pages and the remaining css aswell. No need to include the full css in every page. No need to calculate the critical css for every page.

I think we have already discussed this too, but if not: I strongly recommend against the "remaining css" thinking. It's bad practice, because:

I don't get a single crash for any url. Even for forbesindustries.com it works flawlessly. Maybe you had problems with your internet connection or carrier, or maybe different puppeteer instances were interrupting each other (Penthouse vs crittr Chromium).

Hmm, yeah, today it does work for me with both crittr and Penthouse, so I guess the site just has unpredictable loading times.

Btw: I ran the test with forbesindustries.com and found that Penthouse needs 2.5s while crittr needs 5.6s, which is a lot more. After visiting the site multiple times in my browser, I found that the pageload event fires at around 3.5 seconds. Are you sure you wait until the page has fully loaded? I wait until pageload to ensure I get the intended page layout.

Penthouse waits long enough to get the right layout; I think you have looked at this code inside Penthouse before. You don't necessarily have to wait until page load. Also note that I left JS enabled for Penthouse inside your benchmark to make it more comparable, but I don't see why you are running with that setting. It slows you down, and whatever JS does on your site happens after the first render, so it has no impact on the critical css.
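
(The "js enabled" setting presumably maps to Penthouse's blockJSRequests option, which defaults to true - a sketch, with illustrative url and css path:)

```js
const penthouse = require('penthouse');

penthouse({
  url: 'https://example.com',
  css: 'example/site.css',
  blockJSRequests: false, // leave JS running during generation, as in the benchmark
}).then(criticalCss => console.log(criticalCss));
```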

But I understand your concerns about the benchmark results. If you can provide an example that fits my use case, I will update the chart. So please update the PR, or close it if you don't want to update it.

Hmm, it's not important to me to close this PR specifically; I raised it to help you squash what I saw as a bug. What is your "use case"? This is what I asked before - answering it would make this easier for you. Given that your library is not faster, I would suggest again that you focus on something else. You could go for something like the fact that it's more end-to-end...

As for the benchmark specifically, what do you think would be better? It is not very representative as it is in master. Would you rather update it for a fairer comparison, potentially with more explanatory text, or remove it and focus on something else?

philipp-winterle commented 6 years ago

you will end up creating unique "remaining css" for every page, breaking the most important job of your full css - to be cached and re-used. Hence it's a bad idea to try to extract the remaining css.

Nope. The remaining css is still the same for every page, because it is one complete critical + remaining css for the whole website. And it works with 100% PageSpeed for a really large number of pages (e.g. 35k pages, broken down to 27 page types, i.e. 27 different css files of considerable size (1MB+) -> cached!)

Also note that I left JS enabled for Penthouse inside your benchmark to make it more comparable, but I don't see why you are running with that setting. It slows you down, and whatever JS does on your site happens after the first render, so it has no impact on the critical css.

Have you never visited a website that reloads its non-critical css via JavaScript? Or a website using JavaScript to lay out details in the critical path? Unfortunately this is really common, and I am going to deal with it.

there are plenty of libraries that do this, and no, it's not O(n²): a library like clean-css takes milliseconds to merge and dedupe css. You should try it - I've been a happy user for a long time - it's just not baked into Penthouse.

This still depends on the amount of css: the more css, the more time it needs as well. And with that mindset, maybe Penthouse would never have been created. If we never tried to make things better and only relied on already-invented wheels, we wouldn't have cars with electric engines. Anyway, I totally forgot about clean-css and agree with you that this library is a good choice. I am using it in many other projects but never thought about it in terms of merging. Good idea!

I built it into crittr and measured only the parts where the different methods are used:

About time in general: I still think a time difference of less than 10 seconds is not important when creating static css for a website. You don't do that on a tight schedule or on every user access, where time is crucial; it is a task before every release of a new website version. And if you need it in between, it still doesn't matter whether a lib finishes in 4 or 5 seconds. I want to show that crittr is faster than Penthouse at doing its work, with additional features, for what it is built for - and this is still valid. I can say that because since I switched from Penthouse to crittr, the time for my build process dropped from 9.5 to 4.6 minutes. The only change was Penthouse -> crittr. But the times in the chart are no longer correct; I need to make a new chart. Maybe I will add more information on how I measured the times. Maybe I will add the enterprise example as well, to show the power-user impact of using both tools. But I will think about removing the time comparison completely. Maybe you are right: the targets of the two are different. Maybe I'll go with a simple feature comparison.

philipp-winterle commented 6 years ago

Did the whole test with clean-css, and it behaves like O(n²): it needed 340s to minify the remaining css for 14 urls with a source css of ~1MB file size. FYI

pocketjoso commented 6 years ago

Did the whole test with clean-css, and it behaves like O(n²): it needed 340s to minify the remaining css for 14 urls with a source css of ~1MB file size. FYI

Whoa, OK - sorry, then it seems like you were right about this. I only run it on critical css files, and on average it takes about 10ms for me.

pocketjoso commented 6 years ago

you will end up creating unique "remaining css" for every page, breaking the most important job of your full css - to be cached and re-used. Hence it's a bad idea to try to extract the remaining css.

Nope. The remaining css is still the same for every page, because it is one complete critical + remaining css for the whole website. And it works with 100% PageSpeed for a really large number of pages (e.g. 35k pages, broken down to 27 page types, i.e. 27 different css files of considerable size (1MB+) -> cached!)

I still don't understand what you're saying here, or your use case. I think this lack of understanding has been the real reason we've been miscommunicating from the beginning. It still seems to me that what you are doing is at least not the standard way of using critical css. I think that to get adoption of your library, you would really benefit from explaining more clearly what problem you are trying to solve and how you are solving it. I understand that you're somehow considering the whole site at once rather than individual pages, as Penthouse does, but exactly how this works and why it is beneficial I don't get.

Also note that I left JS enabled for Penthouse inside your benchmark to make it more comparable, but I don't see why you are running with that setting. It slows you down, and whatever JS does on your site happens after the first render, so it has no impact on the critical css.

Have you never visited a website that reloads its non-critical css via JavaScript?

What do you mean by this?

Or a website using JavaScript to lay out details in the critical path? Unfortunately this is really common, and I am going to deal with it.

Since the only way to benefit from using critical css is to move all your JavaScript out of the critical path, any changes it makes to the page happen after the first render, so the critical css should not take them into account. Sure, a site owner could choose to leave some script tags with inline JS in the head of the document that could modify the page - but this is unconventional and very rare, and not something I would make an exception for at the library level. Users with these particular needs can change the settings for themselves (or just move this JS to execute after the first render).

I want to show that crittr is faster than Penthouse at doing its work, with additional features

I think you should just focus on your additional features instead. As we have said, without them your library might be about as fast, but with them you are slower, which is natural. Why focus on speed? You even think it's less important for your use case. For me and Penthouse, on the other hand, I consider it very important.

I can say that because since I switched from Penthouse to crittr, the time for my build process dropped from 9.5 to 4.6 minutes

You can totally talk about this, but you have to be clear about how you used Penthouse. I suspect that the speed gain you are talking about has nothing to do with Penthouse's critical css generation (which, as we said earlier, is about the same speed as, or faster than, your library), but rather with how:

I would guess that running Penthouse with a queue as in my simple example, combined with something like your css merging and de-duping (a feature Penthouse doesn't have), would most likely make Penthouse faster than crittr at exactly your use case. The only thing I want to show is that Penthouse is fast; the only comparison you can make is that you have added more features and can execute in a certain time with them. You can't compare that time to Penthouse (other than saying roughly "the time it takes crittr to run these additional features is about X").

philipp-winterle commented 6 years ago

I still don't understand what you're saying here, or your use case. I think this lack of understanding has been the real reason we've been miscommunicating from the beginning. It still seems to me that what you are doing is at least not the standard way of using critical css. I think that to get adoption of your library, you would really benefit from explaining more clearly what problem you are trying to solve and how you are solving it. I understand that you're somehow considering the whole site at once rather than individual pages, as Penthouse does, but exactly how this works and why it is beneficial I don't get.

You are right, I am not using the standard way to generate critical css, because in my opinion there are only rare cases where a website's critical path is not altered by libraries etc. (think of WordPress pages + plugins). Lots of people use libraries on their websites without caring what those libraries do. But they still want their PageSpeed score to be as high as possible, and they have heard about critical css. What crittr wants to do is gather the critical css of all the pages of a website that share one css file, and reduce that critical css to the minimum that fits all of those pages.

How does this work? I had a problem at work once: I needed to raise our SEO rankings, with the main lever being a higher PageSpeed score for a rich internet application. The application I am talking about manages ~35k pages, split down into 24 different page types - think of it as a CMS. I wanted a good way to automatically detect the critical css for those page types. In the end I got 24 critical css files, plus the remaining css, and I use them to raise the PageSpeed value to 100.

So you may be right that this is maybe not the standard way to generate critical css. But I think we need different approaches, because there are not many standard websites that fit the standard way of creating critical css. In a business environment you often depend on external APIs, which often require you to use their scripts. This is how I realized there is an urgent need for something like crittr. I already did this before crittr, with Penthouse. But to achieve humane build times for 35k pages I need to be parallel and fast. Until the release of crittr I used Penthouse 0.9, because PhantomJS does not crash on parallel usage. I tried every Penthouse version above that, which led to my PR to your code to let me work with the newest version. The puppeteer versions on your side kept crashing and kept some of my pages from being loaded (even with a high timeout of 90s - crazy, isn't it?). I can't really tell what is wrong, but with crittr I focused on being healthy, stable and... let's say as fast as possible. Since crittr was included in my project, it runs smoothly and twice as fast as before, which lowers the time of our build process significantly.

Have you never visited a website that reloads its non-critical css via JavaScript? What do you mean by this?

Have you heard about loadCSS? You can async-load css files after pageload. Real-world problems 😍
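
(For context, the loadCSS pattern looks roughly like this in the browser - a sketch that assumes the loadCSS and onloadCSS helpers from filamentgroup/loadCSS are already on the page, with an illustrative file path:)

```js
// request the full stylesheet without blocking the first render
var stylesheet = loadCSS('/css/full.css');

// react once the full css has actually arrived
onloadCSS(stylesheet, function () {
  console.log('full css loaded - the inlined critical css is now redundant');
});
```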

Since the only way to benefit from using critical css is by moving all your javascript out of the critical path, any changes they would be making to the page would happen after the first render, so the critical css should not take them into account. Sure, a site owner could chose to leave some script tags with inline JS in the head of the document, that could modify the page - but this is non conventional and very rare, and not something I would make an exception for at the library level. The user with these particular needs can change the settings for themselves (or just move this JS to execute after the first render).

If that's how it is in your world, then congratulations. In the world I live in, this is not very rare. I have stumbled over so many websites with the exact problem you are describing. Do you know how many WordPress websites are out there? They all lack optimization, even with those fancy plugins.

I was rethinking our talk, and I will remove the time benchmark and just do a feature comparison. It is really hard to compare by time, because I would need to take my merging and apply it to your files afterwards - and that is a waste of time. ❤️

pocketjoso commented 6 years ago

First of all, thanks for your detailed explanation of your objectives with your critical css generation; it finally starts to make some sense to me. 👏

Just to clarify again: I am involved in the WordPress scene for critical css generation, and have been for a while. My approach, though, has been very different from yours. Instead of relying on a build process, I dynamically detect and re-generate new critical css on the fly, when needed. This removes the need to run the critical css generation step for 35k pages at once - but I do understand that this does not fit well for those who only use WordPress as a CMS and have their own proprietary build step.

So the only thing that remains now, I think, is the question of running JS together with critical css. You said:

In a business environment you often depend on external APIs, which often require you to use their scripts.

The thing is that you cannot leave any (external) scripts on the critical path; as soon as you do, you get zero benefit from using critical css. JavaScript becomes the bottleneck: the first render only happens after the JS has been requested, downloaded, parsed and executed - no speed benefit. So I hope you're not planning to implement critical css while leaving render-blocking JS on the page... because it's just wasted effort; there's no performance improvement.

Have you heard about loadCSS? You can async-load css files after pageload. Real-world problems 😍

Well, yes, I'm quite familiar with loadCSS. Async CSS loading patterns like these all support non-JS environments via a fallback, so this css will load even if you disable JS.

If that's how it is in your world, then congratulations. In the world I live in, this is not very rare. I have stumbled over so many websites with the exact problem you are describing.

I run critical css generation for a huge number of WordPress websites. I can't remember ever seeing a site that depended on JS (just) for CSS loading - that is not what happens. The sites that depend on JS do so completely, for rendering, i.e. they show no content until JS has loaded. They are definitely larger in number, but these sites cannot benefit from critical css generation: whether you have inlined critical css or not, as long as you leave render-blocking JS on the critical path, the first render will be exactly as slow as before.

philipp-winterle commented 6 years ago

Think of a gallery library that does not care about the critical path. The element is above the fold, and the layout is not finished before the gallery initializes. If there are only classes set on elements, you want this css in the critical css. If elements are added or changed, you end up with more css than you need. The reality is that people are not always checking for best practices. I am totally aware of the things you are saying, but I am convinced most people are not. I guess it is not only about maximum performance, but more about the best performance a page can achieve without requiring too much knowledge.

I am not talking about pages loaded only by JS. I hope no one is doing that 🗡

pocketjoso commented 6 years ago

For your gallery example, I'm not sure I follow. The original markup for the page is empty in terms of the gallery, as it is initialised by JavaScript. Are you saying you want to put the gallery styles in the critical css in case JavaScript initialises before the full css has loaded? Otherwise I don't follow what you mean... If that is what you mean, then... if you are looking for a fully automatic solution, maybe what you're suggesting is a decent option. I would always suggest instead hiding the gallery until both the JS and the full CSS have loaded (hide in the critical css, unhide in the full css after the JS has loaded), but that does require manual styles written for each site.

I am not talking about pages loaded only by JS. I hope no one is doing that 🗡

Single Page Applications are still a thing, but the patterns that are terrible for performance and that I most commonly see on WordPress are:

philipp-winterle commented 6 years ago

Updated the README. benchmark.js will stay in examples, just as an example use case.

pocketjoso commented 6 years ago

Updated the README.

Great

benchmark.js will stay in examples, just as an example use case.

Hmm... I suggest you either rename it to example and delete the other libraries, or, if you want to keep it named benchmark, accept this PR so you don't have a misleading way of showing off the other libraries.