stats4sd / aec_portfolio

A proof of concept for the AEC Consortium Project Management / Assessment System
GNU General Public License v3.0

Add Browsershot #51

Closed dan-tang-ssd closed 1 year ago

dan-tang-ssd commented 1 year ago

This PR adds Browsershot, a package that renders a web page to a PDF file.

I have created a simple Command class for testing. The URL below shows a spider chart in the browser, but the generated PDF file contains the login page instead: http://aec.test/admin/organisation/2/portfolio


Attachments: portfolio_01.pdf, portfolio_02.pdf

The current challenge is how to print a web page as a logged-in user. I will do more testing and post an update.

dan-tang-ssd commented 1 year ago

I was wondering whether I could send the request via HTTP so that Browsershot would use the same session. I created a Controller class to generate the PDF file, but it failed with the error messages below, found in the Laravel log file.

This is strange, as the same code runs in the Command class but throws an error in the Controller class. Anyway, it does not seem worth spending time on resolving it at this moment.


[2023-05-15 14:12:47] local.DEBUG: GeneratePdfFileController.generatePdfFile()...
[2023-05-15 14:12:47] local.DEBUG: The command "node ^"C:^\public^\aec^\vendor^\spatie^\browsershot^\src/../bin/browser.js^" ^"^{^\^"url^\^":^\^"http:^\/^\/aec.test^\/admin^\/organisation^\/2^\/portfolio^\^",^\^"action^\^":^\^"pdf^\^",^\^"options^\^":^{^\^"path^\^":^\^"c:^\^\temp^\^\portfolio_01.pdf^\^",^\^"args^\^":^[^],^\^"viewport^\^":^{^\^"width^\^":800,^\^"height^\^":600^}^}^}^"" failed.

Exit Code: 1(General error)

Working directory: C:\public\aec\public

Output:

Error Output:

 Puppeteer old Headless deprecation warning: In the near feature headless: true will default to the new Headless mode for Chrome instead of the old Headless implementation. For more information, please see https://developer.chrome.com/articles/new-headless/. Consider opting in early by passing headless: "new" to puppeteer.launch() If you encounter any bugs, please report them to https://github.com/puppeteer/puppeteer/issues/new/choose.

Error: Could not find Chrome (ver. 113.0.5672.63). This can occur if either

  1. you did not perform an installation before running the script (e.g. npm install) or
  2. your cache path is incorrectly configured (which is: C:\Windows\system32\config\systemprofile\.cache\puppeteer).

For (2), check out our guide on configuring puppeteer at https://pptr.dev/guides/configuration.
    at ChromeLauncher.resolveExecutablePath (C:\public\aec\node_modules\puppeteer-core\lib\cjs\puppeteer\node\ProductLauncher.js:301:27)
    at ChromeLauncher.executablePath (C:\public\aec\node_modules\puppeteer-core\lib\cjs\puppeteer\node\ChromeLauncher.js:182:25)
    at ChromeLauncher.computeLaunchArguments (C:\public\aec\node_modules\puppeteer-core\lib\cjs\puppeteer\node\ChromeLauncher.js:98:37)
    at async ChromeLauncher.launch (C:\public\aec\node_modules\puppeteer-core\lib\cjs\puppeteer\node\ProductLauncher.js:83:28)
    at async callChrome (C:\public\aec\vendor\spatie\browsershot\bin\browser.js:84:23)

I just noticed that Browsershot can also generate a PDF file from HTML content.

Would the idea below be too complicated? If it works, we could use it almost anywhere that requires PDF printing.

  1. Develop a controller class that accepts a URL as a function parameter
  2. Send an HTTP request to the URL and store the HTTP response (can we assume it uses the same logged-in session?)
  3. Pass the HTML content to Browsershot to generate the PDF
dan-tang-ssd commented 1 year ago

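I have resolved the issue that prevented Browsershot from running via the Controller. It was caused by an incorrect cache folder: Puppeteer looked for Chrome in the system profile's cache folder, probably because the web server runs under a different Windows account than the command line. This looks like a Windows-specific issue. The solution is to add a puppeteerrc config file to the project folder that specifies the correct cache folder path.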

Incorrect cache folder in error message:

C:\Windows\system32\config\systemprofile\.cache\puppeteer

Correct cache folder to be specified in config file:

C:\Users\DanTang\.cache\puppeteer

Now I can generate the same PDF files via both the Command and the Controller. Next I will explore how to get the HTML of a web page with a logged-in session.
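
For reference, a Puppeteer config file of this kind might look like the sketch below. It assumes the file is named `.puppeteerrc.cjs` and sits in the project root; the cache path is the one quoted above and would differ on other machines.

```javascript
// .puppeteerrc.cjs (assumed file name), placed in the project root.
// Tells Puppeteer where its downloaded browser lives, so the same
// Chrome is found whichever Windows account runs the process.
/** @type {import("puppeteer").Configuration} */
module.exports = {
  // Path taken from the comment above; adjust per machine.
  cacheDirectory: 'C:\\Users\\DanTang\\.cache\\puppeteer',
};
```

A path relative to the project (for example one built from `__dirname`) would avoid hard-coding a user name, at the cost of having to install the browser into that folder again.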

dan-tang-ssd commented 1 year ago

I spent a lot of time trying the Browsershot setCookies() function, but could not get it working. In the end I used a Laravel (or rather, plain PHP) way to visit a web page with the logged-in session.

I get the URL of the page that sent the request from the HTTP "referer" header, so the controller is generic and works for any web page. We can simply add an HTML form with a submit button to any web page to generate a PDF. I have updated the portfolio show page and the project show page, and it works well.
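
The idea can be sketched as follows. This is an illustration in JavaScript, not the PHP controller from this PR, and the helper names `buildPageRequest` and `fetchPageHtml` are hypothetical. It reads the page URL from the `Referer` header and forwards the caller's `Cookie` header, so that the page is fetched with the same logged-in session.

```javascript
// Hypothetical helper: works out which page to render and which headers
// to forward, based on the headers of the incoming "generate PDF" request.
function buildPageRequest(incomingHeaders) {
  // Header names are case-insensitive, so normalise the keys first.
  const headers = {};
  for (const [name, value] of Object.entries(incomingHeaders)) {
    headers[name.toLowerCase()] = value;
  }

  const referer = headers['referer'];
  if (!referer) {
    throw new Error('No Referer header: cannot tell which page to print');
  }

  // new URL() throws on a malformed value, which rejects bad input early.
  const url = new URL(referer).toString();

  // Forward the session cookie so the page renders as the logged-in user.
  const forwarded = {};
  if (headers['cookie']) {
    forwarded['cookie'] = headers['cookie'];
  }
  return { url, headers: forwarded };
}

// Fetches the page HTML with the forwarded session (Node 18+ global fetch).
async function fetchPageHtml(incomingHeaders) {
  const { url, headers } = buildPageRequest(incomingHeaders);
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error(`Fetching ${url} failed with status ${response.status}`);
  }
  // This HTML is what would then be handed to the PDF renderer.
  return response.text();
}
```

Relying on `Referer` keeps the controller generic, but the header is supplied by the client, so a real implementation would probably want to check that the URL belongs to the application before fetching it.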

Regarding PDF file generation, we need to be aware of the following:

  1. The generated PDF content looks a bit different from what we see in the browser. In particular, the spider chart looks much smaller in the PDF file.
  2. For a page with tabs, only the first tab is rendered in the PDF file, no matter which tab is being viewed in the browser.
  3. It would be better to download the generated PDF file to the user's computer instead of showing its content directly in the browser.
dan-tang-ssd commented 1 year ago

Here are some screenshots for reference:

Portfolio show page (2 screenshots)

Project show page (2 screenshots)

dan-tang-ssd commented 1 year ago

> To confirm, with this we can add a button on any page to send that page (as the logged-in user sees it) to the generatePdfFile method and get a pdf of it - which is great!

Yes, I confirm.

> One note - for some reason, the pdfs that I generate are all very pale - the transparency is set super high and I'm not sure why:
>
> generatePdf.pdf
>
> Which is odd given that your pdf outputs looked fine, colour-wise.

Oops, maybe we need to run it on Linux and see what happens.

> So there's probably some tweaking needed on the settings, but layout-wise it does look much better than the default "print" option via the browser, so that's good. For now, I suggest that we include this, and then we can tweak the settings / fix the styling of the outputs when we get to the point of having the real pages ready.

Yes. I think it is something that we will have to do soon.

I have drafted the chart layout for the principles summary. It uses the stacked bar chart from the flot library, which is what we used in the project management section of our company web site. I would like to set it up in the aec project, generate a PDF file and see how it looks.