spajak / cef-pdf

cef-pdf HTML to PDF utility
MIT License

Added optional remote trigger #8

Closed by beckyconning 6 years ago

beckyconning commented 6 years ago

cef-pdf is an excellent camera which takes PDF pictures of web pages as soon as it sees them.

However, some web pages are not photo ready as soon as cef-pdf sees them.

This PR adds an optional remote trigger that the web pages can pull themselves. Rather than the photo being taken straight away, the photo is taken when the web page pulls this trigger.

The web page does this by evaluating window.triggerCefPdf().

A trivial example of such a web page is the following:

<html>
<head></head>
<body>
<script>
    setTimeout(function () {
        var t = document.createTextNode("This is a paragraph."); 
        var p = document.createElement("p"); 
        document.querySelector("body").appendChild(p);
        p.appendChild(t);
        window.triggerCefPdf();
    }, 2000);
</script>
</body>
</html>

In practice, most examples will depend on both non-trivial asynchronous JavaScript evaluation and HTTPS transactions of unknown and varying durations.
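A page with several asynchronous steps could wait for all of them before pulling the trigger. The following is only a minimal sketch, assuming the page's work can be expressed as promises. `triggerWhenReady` is a hypothetical helper name, not part of this PR; only `window.triggerCefPdf()` comes from the change described here.

```javascript
// Hypothetical helper (not part of cef-pdf): waits for every pending task,
// then pulls the trigger exactly once.
// `tasks`   : array of promises, e.g. data loading or rendering steps.
// `trigger` : function to call once the page is photo ready. In a page
//             rendered with this PR that would wrap window.triggerCefPdf().
function triggerWhenReady(tasks, trigger) {
    return Promise.all(tasks).then(function (results) {
        trigger();
        return results;
    });
}
```

In a page this might be called as `triggerWhenReady([loadData(), renderCharts()], function () { window.triggerCefPdf(); })`, where `loadData` and `renderCharts` are placeholder names for whatever the page actually does. Note that if any task rejects, the trigger is never pulled in this sketch, so a real page would probably want a rejection handler that renders an error state and still triggers.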

spajak commented 6 years ago

Please test your changes with the ApacheBench tool (ab):

ab -n 200 -c 100 -p "template.html" -T "text/html" "http://127.0.0.1:9288/foo.pdf"

beckyconning commented 6 years ago

Thanks for the review! Will make a commit to address these concerns tomorrow : )

beckyconning commented 6 years ago

@spajak where do I get "template.html" for ab?

spajak commented 6 years ago

It's just sample HTML. Create one.

spajak commented 6 years ago

Another solution would be to add only a timeout option (--timeout=<sec>). What do you think?

beckyconning commented 6 years ago

A timeout option may well be useful to some, but as the HTTP or terminal client can always choose to cancel after a period of time themselves, I don't think it's necessary.

beckyconning commented 6 years ago

Oh, I see what you mean. No, that would not work for most use cases, and it creates potential race conditions in the cases where it does work.

beckyconning commented 6 years ago

The only way to reliably generate a PDF of an asynchronous JavaScript application is to allow that application to trigger the PDF generation. Everything else is guesswork against race conditions.

beckyconning commented 6 years ago

And even in the best case scenario it slows down the generation: say the timeout is 30 minutes but the page is ready in 20, that's 10 minutes wasted.
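The trade-off can be sketched as a small comparison. This is an illustration only: `compareStrategies` is a hypothetical function, and the fixed delay stands in for the `--timeout=<sec>` option proposed above, which is a suggestion in this thread and not an existing flag.

```javascript
// Compares when the PDF would be captured under the two strategies.
// All times use the same unit (minutes in the example from this thread).
// `readyAt` : when the page actually finishes rendering (unknown in advance).
// `timeout` : the fixed delay a timeout-only option would wait.
function compareStrategies(readyAt, timeout) {
    return {
        // Fixed timeout: the capture always happens at `timeout`,
        // whether or not the page has finished.
        timeoutCapturedAt: timeout,
        timeoutPageComplete: readyAt <= timeout,
        timeoutWasted: Math.max(0, timeout - readyAt),
        // Remote trigger: the capture happens when the page says it is ready.
        triggerCapturedAt: readyAt,
        triggerPageComplete: true,
        triggerWasted: 0
    };
}

console.log(compareStrategies(20, 30)); // page ready early: the timeout wastes 10
console.log(compareStrategies(40, 30)); // page ready late: the timeout captures an incomplete page
```

The second call shows the race condition: when the page takes longer than the fixed delay, the timeout strategy captures before the page is complete.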

beckyconning commented 6 years ago

Going to run ab on a Release build for both devel and this branch merged with devel.

beckyconning commented 6 years ago

devel

This is ApacheBench, Version 2.3 <$Revision: 1757674 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 100 requests
Completed 200 requests
Finished 200 requests

Server Software:        
Server Hostname:        127.0.0.1
Server Port:            9288

Document Path:          /foo.pdf
Document Length:        16806 bytes

Concurrency Level:      100
Time taken for tests:   10.111 seconds
Complete requests:      200
Failed requests:        0
Total transferred:      3393000 bytes
Total body sent:        59400
HTML transferred:       3361200 bytes
Requests per second:    19.78 [#/sec] (mean)
Time per request:       5055.658 [ms] (mean)
Time per request:       50.557 [ms] (mean, across all concurrent requests)
Transfer rate:          327.70 [Kbytes/sec] received
                        5.74 kb/s sent
                        333.44 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    3   3.6      2      11
Processing:   416 3935 1523.3   4835    5389
Waiting:      415 3935 1523.3   4835    5389
Total:        426 3938 1519.8   4836    5390

Percentage of the requests served within a certain time (ms)
  50%   4836
  66%   5021
  75%   5056
  80%   5086
  90%   5177
  95%   5222
  98%   5298
  99%   5313
 100%   5390 (longest request)

beckyconning commented 6 years ago

trigger-remote (merged with devel)

This is ApacheBench, Version 2.3 <$Revision: 1757674 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 100 requests
Completed 200 requests
Finished 200 requests

Server Software:        
Server Hostname:        127.0.0.1
Server Port:            9288

Document Path:          /foo.pdf
Document Length:        16806 bytes

Concurrency Level:      100
Time taken for tests:   5.393 seconds
Complete requests:      200
Failed requests:        0
Total transferred:      3393000 bytes
Total body sent:        59400
HTML transferred:       3361200 bytes
Requests per second:    37.09 [#/sec] (mean)
Time per request:       2696.278 [ms] (mean)
Time per request:       26.963 [ms] (mean, across all concurrent requests)
Transfer rate:          614.45 [Kbytes/sec] received
                        10.76 kb/s sent
                        625.21 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   0.8      1       2
Processing:   379 2077 738.1   2492    2779
Waiting:      379 2077 738.1   2492    2779
Total:        382 2078 737.3   2492    2779

Percentage of the requests served within a certain time (ms)
  50%   2492
  66%   2583
  75%   2619
  80%   2640
  90%   2707
  95%   2730
  98%   2754
  99%   2758
 100%   2779 (longest request)
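
For reference, the headline figures in the two reports can be re-derived from the request count, concurrency level and total time. The sketch below uses only numbers copied from the ab output in this thread. The object and function names are made up for illustration.

```javascript
// Figures copied from the two ab runs in this thread.
const runs = {
    devel: { requests: 200, concurrency: 100, seconds: 10.111 },
    triggerRemote: { requests: 200, concurrency: 100, seconds: 5.393 }
};

// Requests per second = completed requests / total time taken.
function requestsPerSecond(run) {
    return run.requests / run.seconds;
}

// Mean time per request in ms = concurrency * total time / completed requests.
function meanTimePerRequestMs(run) {
    return run.concurrency * run.seconds * 1000 / run.requests;
}

console.log(requestsPerSecond(runs.devel).toFixed(2));         // "19.78"
console.log(requestsPerSecond(runs.triggerRemote).toFixed(2)); // "37.09"
console.log(meanTimePerRequestMs(runs.devel));
console.log(meanTimePerRequestMs(runs.triggerRemote));
```

The recomputed mean times differ from the reported 5055.658 ms and 2696.278 ms by a fraction of a millisecond because ab prints the total time rounded to three decimals. Each branch was benchmarked once here, so the gap between the two runs should be read as a rough indication and not as a precise measurement.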

beckyconning commented 6 years ago

@spajak I added a more convenient function and documented it. How does this look now? : )

beckyconning commented 6 years ago

How is this? @spajak

beckyconning commented 6 years ago

@anko @spajak is this ready to go? : )