[Feature] Make PDF testing idiomatic

microsoft / playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

https://playwright.dev

Apache License 2.0


[Feature] Make PDF testing idiomatic #7822

Open mxschmitt opened 3 years ago

mxschmitt commented 3 years ago

Customers are confused about when a PDF opens in the browser's PDF viewer and when it triggers a download event. We should explain how to work around this in the relevant browsers.

#7830
#6091
#3509
#3365
#6342
https://github.com/microsoft/playwright/issues/20633

To make it work out of the box the following changes are required:

Chromium
- Headed: plugin override through the CLI: https://source.chromium.org/chromium/chromium/src/+/main:chrome/browser/plugins/plugin_prefs.cc;l=67
- Upstream bug to star: https://bugs.chromium.org/p/chromium/issues/detail?id=1346939#c23
- Headless: works
Firefox: works
WebKit: works

Workaround for Chromium for now:

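  // Intercept the PDF request and re-serve it as an attachment, so Chromium
  // fires a download event instead of opening its PDF viewer.
  await page.route('**/empty.pdf', async route => {
    const response = await route.fetch();
    await route.fulfill({
      response,
      headers: {
        ...response.headers(),
        'Content-Disposition': 'attachment',
      },
    });
  });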

corradin commented 2 years ago

@mxschmitt Did you also take a look at the solution I proposed in case people are not able to change the source code of the application under test? https://github.com/microsoft/playwright/issues/6342#issuecomment-929112605

This uses the download event for headless chrome and the page event for all other browsers that use the integrated PDF reader.

mxschmitt commented 2 years ago

@corradin yes, I've seen that. Creative solution! A question from our side: in your opinion, what gets asserted or tested most often when working with PDF files?

waitForResponse() seems to work in all cases, but is it important to assert the suggested file name?

rwoll commented 2 years ago

I'd love an option on context, like disableNativePDFViewer, to get all the browsers to behave the same (including headless vs. headful). This has tripped us up a couple of times, and right now the best option (IMO) is a persistent context with the PDF viewer disabled in the Chrome prefs file, but it would be nice if it could be done with a simpler flag in the PW API (or even just a command-line switch for CR).

Thanks!
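
The persistent-context approach mentioned above could look roughly like the sketch below. It assumes that Chromium reads a `plugins.always_open_pdf_externally` preference from `Default/Preferences` inside the user data directory; the preference name and file location are assumptions based on how Chrome profiles are commonly configured, not something confirmed in this thread, and the helper name `write_pdf_download_prefs` is hypothetical.

```python
import json
from pathlib import Path


def write_pdf_download_prefs(user_data_dir: str) -> Path:
    """Seed a Chromium profile so PDFs are downloaded instead of previewed.

    Hypothetical helper; the preference key below is an assumption.
    """
    prefs_path = Path(user_data_dir) / "Default" / "Preferences"
    prefs_path.parent.mkdir(parents=True, exist_ok=True)

    # Keep any preferences that already exist in the profile.
    prefs = {}
    if prefs_path.exists():
        prefs = json.loads(prefs_path.read_text(encoding="utf-8"))

    # Assumed key: ask Chromium to hand PDFs to the download machinery.
    prefs.setdefault("plugins", {})["always_open_pdf_externally"] = True

    prefs_path.write_text(json.dumps(prefs), encoding="utf-8")
    return prefs_path
```

The same directory would then be passed to `playwright.chromium.launch_persistent_context(user_data_dir, headless=False)`, so that the headed browser starts with the seeded profile.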

corradin commented 2 years ago

@mxschmitt Not sure. I would like to assert that the stream a user gets back is of type PDF. The file name does not matter: in my case it is autogenerated, so I could not do an exact match either way.
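
One way to assert that the returned stream is a PDF, independent of the file name, is to check the Content-Type header together with the leading bytes of the body. A minimal sketch in Python; the helper names `looks_like_pdf` and `assert_pdf_response` are hypothetical, and the check relies only on the fact that PDF files begin with the `%PDF-` marker:

```python
def looks_like_pdf(content_type: str, body: bytes) -> bool:
    """Return True if a response plausibly carries a PDF document.

    Hypothetical helper, not part of Playwright.
    """
    # Ignore parameters such as "; charset=binary" and letter case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # PDF files start with the "%PDF-" marker.
    return media_type == "application/pdf" and body.startswith(b"%PDF-")


def assert_pdf_response(response) -> None:
    """Assert on an object shaped like a Playwright sync-API Response.

    Only `response.headers` (a dict) and `response.body()` (bytes) are used.
    """
    content_type = response.headers.get("content-type", "")
    assert looks_like_pdf(content_type, response.body()), "response is not a PDF"
```

With the sync API, the response could be captured with `page.expect_response(...)` around the click that requests the PDF, and its `.value` passed to `assert_pdf_response`. Whether the body is readable in every browser and mode has not been verified here.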

VinayKumarBM commented 2 years ago

Can we expect a simpler solution to the PDF viewer issue any time soon?

WahyuS002 commented 1 year ago

I think there is a typo in the highlighted text below. I think it should say "headed" instead of "headless."

mandras73 commented 10 months ago

Do you have a python version of the workaround?

vincenzo-gasparo commented 10 months ago

Here's a python version of the workaround:

import re
from playwright.sync_api import Page, expect

def test_download_pdf(page: Page):
    page.goto("https://www.example.com")

    def handle_pdf(route):
        response = page.context.request.get(route.request)
        route.fulfill(
            response,
            headers={**response.headers, "Content-Disposition": "attachment"},
        )

    page.route(re.compile(r".*\.pdf"), lambda route: handle_pdf(route))

    with page.expect_download() as download_info:
        page.get_by_text("click_here_to_download_pdf").click()

kinopop commented 4 months ago

Here's a python version of the workaround:

import re
from playwright.sync_api import Page, expect

def test_download_pdf(page: Page):
    page.goto("https://www.example.com")

    def handle_pdf(route):
        response = page.context.request.get(route.request)
        route.fulfill(
            response,
            headers={**response.headers, "Content-Disposition": "attachment"},
        )

    page.route(re.compile(r".*\.pdf"), lambda route: handle_pdf(route))

    with page.expect_download() as download_info:
        page.get_by_text("click_here_to_download_pdf").click()

What plug-ins am I missing?

        response,

This line gives an error about 1 argument. Should it be response=response instead? Is that right? My error: waiting for event "download"
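
Regarding the argument error above: in Playwright's Python API the parameters of `route.fulfill()` are keyword-only, so the fetched response has to be passed as `response=response`. A possible corrected handler is sketched below. It assumes a Playwright version that provides `route.fetch()`, the names `force_pdf_download` and `install_pdf_download_route` are hypothetical, and it has not been run against the application in question.

```python
import re


def force_pdf_download(route) -> None:
    """Re-serve a fetched PDF with Content-Disposition: attachment.

    `route` is expected to be a Playwright sync-API Route.
    """
    # Perform the real request, then answer it with modified headers.
    response = route.fetch()
    route.fulfill(
        # fulfill() accepts keyword arguments only.
        response=response,
        headers={**response.headers, "Content-Disposition": "attachment"},
    )


def install_pdf_download_route(page) -> None:
    """Register the handler for URLs matching the pattern from the snippet above."""
    page.route(re.compile(r".*\.pdf"), force_pdf_download)
```

In a test, `install_pdf_download_route(page)` would be called before the click that is wrapped in `page.expect_download()`. If the download event still times out, the request may not match the `.pdf` pattern at all, which is worth checking first.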
