microsoft / playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
https://playwright.dev
Apache License 2.0

[Bug]: Page.Route HTML Based Links lead to Wrong URL #30916

Open Vinyzu opened 4 months ago

Vinyzu commented 4 months ago

Version

1.43.0

Steps to reproduce

Example code (uses the external website https://hmaker.github.io/selenium-detector):

import asyncio
from playwright.async_api import Route, async_playwright


async def main():
    async with async_playwright() as playwright:
        browser = await playwright.chromium.launch(headless=False)
        context = await browser.new_context()
        page = await context.new_page()

        async def route_handler(route: Route) -> None:
            # Fetch the response ourselves, then hand it back to the page unmodified.
            response = await route.fetch()
            await route.fulfill(response=response)

        # Intercept every request made in this context.
        await page.context.route("**/*", route_handler)
        # Note: the URL has no trailing slash.
        await page.goto("https://hmaker.github.io/selenium-detector")
        # Keep the browser open long enough to inspect the failing request.
        await page.wait_for_timeout(100000)


asyncio.run(main())

Note: Relevant HTML:

<script src="chromedriver.js"></script>

Expected behavior

Playwright should load the JavaScript file from https://hmaker.github.io/selenium-detector/chromedriver.js.

Actual behavior

Playwright loads the JavaScript file from https://hmaker.github.io/chromedriver.js instead, which returns a 404.

Environment

System:
  OS: Windows 11 10.0.22631
Binaries:
  Node: 20.12.2 - C:\Program Files\nodejs\node.EXE
  npm: 9.8.1 - C:\Program Files\nodejs\npm.CMD
Languages:
  Bash: 5.1.16 - C:\Windows\system32\bash.EXE

Vinyzu commented 4 months ago

Note: This happens with every (valid and working) route URL pattern that matches all requests.

yury-s commented 4 months ago

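The server redirects the original request for https://hmaker.github.io/selenium-detector to https://hmaker.github.io/selenium-detector/. route.fetch follows the redirect and downloads the final response. When route.fulfill is called, the page's URL in the browser does not change from .../selenium-detector to .../selenium-detector/. The relative src "chromedriver.js" is therefore resolved against the URL without the trailing slash, which yields https://hmaker.github.io/chromedriver.js.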
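The effect of the missing trailing slash can be reproduced without a browser. The following is a minimal sketch using only Python's standard urllib.parse.urljoin, which applies the standard relative-URL resolution rules. The helper name resolve_script is hypothetical and exists only for this illustration; it is not part of Playwright.

```python
from urllib.parse import urljoin


def resolve_script(document_url: str, src: str) -> str:
    """Resolve a relative script src against the document URL (hypothetical helper)."""
    return urljoin(document_url, src)


# URL the page keeps after route.fulfill (no trailing slash):
# the last path segment is treated as a file name and replaced.
stale = resolve_script("https://hmaker.github.io/selenium-detector", "chromedriver.js")

# URL the page would have after a real browser-side redirect (trailing slash):
# the last path segment is treated as a directory and kept.
redirected = resolve_script("https://hmaker.github.io/selenium-detector/", "chromedriver.js")

print(stale)       # https://hmaker.github.io/chromedriver.js
print(redirected)  # https://hmaker.github.io/selenium-detector/chromedriver.js
```

The first result matches the 404 URL from the report, and the second matches the expected URL, which is consistent with the explanation that the document location is never updated to the redirected one.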

Vinyzu commented 4 months ago

May I ask what the "hard-to-do" label means?

yury-s commented 4 months ago

In this particular case we'd have to change how redirect interception works in each of the three browsers, from what I described above to something where the browser becomes aware of the new URL and updates the current document location to the redirected one. This is a non-trivial amount of work on the browsers' side.