[BUG] very slow screenshot comparison

DudaGod commented 1 year ago

System info

Playwright Version: v1.37.1
Operating System: macOS 12.3
Browser: Chromium

Source code

Config file

// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'], },
    },
  ]
});

Test file (self-contained)

import { test, expect } from '@playwright/test';

test('test long page', async ({page}) => {
    await page.setViewportSize({
        width: 1280,
        height: 1024,
    });

    await page.goto('https://bank.yandex.ru');

    await expect(page).toHaveScreenshot('foo.png', {fullPage: true, animations: 'disabled'});
});

Steps

Run the test (screenshot accepted
Change viewport width to 1075 (for example) in order to get screenshot difference
Run the test again -> screenshot comparison takes more than 5000 ms in most cases

Expected

[Describe expected behavior]

Comparing screenshots takes less than 5 seconds and no unnecessary comparisons are made under the hood

[Describe actual behavior]

Test fails when taking a screenshot (timeout).

I looked at the code sources and found that when the reference and the current image do not match, a comparison is made between the two screenshots just taken (previous actual and new actual). It takes a lot of time (more than 5 sec). For what purpose is this done? Is it possible not to do this?

What takes the most time:

screenshot page - ~1-2 sec (x2, for actual and previous actual)
compare with pixelmatch - ~2-3 sec (x2, compare actual with reference and actual with previous actual)

pavelfeldman commented 1 year ago

I'm seeing

toHaveScreenshot: 965.309ms
screenshot: 457.735ms

Most of the time is spent decoding PNG (~400ms).

DudaGod commented 1 year ago

Hm, I use - MacBook Pro 16-inch, 2019; CPU: 2,6 GHz 6-Core Intel Core i7; Memory: 32 GB 2667 MHz DDR4. And I cannot make it faster =(

Which device do you use?

Can you explain why the two images (previous actual and latest actual) are compared here - https://github.com/microsoft/playwright/blob/main/packages/playwright-core/src/server/page.ts#L550-L571? I see that it takes a lot of time to me and they are always equivalent for me. Looks like a useless action. Maybe we can add ability to disable it?

aslushnikov commented 1 year ago

Hey @DudaGod, let me chime in!

Can you explain why the two images (previous actual and latest actual) are compared here - https://github.com/microsoft/playwright/blob/main/packages/playwright-core/src/server/page.ts#L550-L571?

The strategy is described in our documentation: https://playwright.dev/docs/api/class-pageassertions#page-assertions-to-have-screenshot-1
There's also a video where we talked about it: https://www.youtube.com/watch?v=keV2CIgtBlg&list=PLQ6Buerc008fTzmUIPkPo3Eis0vjx7ykn&index=8&t=321s

What kind of explanation are you looking for?

I see that it takes a lot of time to me and they are always equivalent for me.

They are equivalent for the page in a stable state, and might not be equivalent if the page is settling down (imagine a spinner loading some content e.t.c.). The comparison itself is instant; the majority of time is consumed encoding / decoding PNG's.

Are there any low-hanging fruits that we can collect to significantly improve situation?

DudaGod commented 1 year ago

The strategy is described in our documentation: https://playwright.dev/docs/api/class-pageassertions#page-assertions-to-have-screenshot-1

In documentation I see this text:

This function will wait until two consecutive page screenshots yield the same result, and then compare the last screenshot with the expectation.

But this text is different from what I see in the source code:

first lap in while loop before call areEqualScreenshots (here)
- previous: undefined
- actual: Buffer
- isFirstIteration: true
- expectation: Buffer (expected imaged)
- as a result on first lap of while loop we compare expected image with actual
if images are not equal then we go to the next loop (until the current and previous images are equal):
- previous Buffer (prev actual)
- actual: Buffer (new actual)
- isFirstIteration: false
- expectation: Buffer (prev actual)
- as a result on next lap of while loop we compare actual image with prev actual image

And I couldn’t understand the meaning of such a comparison. After watching the video and reading the documentation looks like there is a bug in the code.

They are equivalent for the page in a stable state, and might not be equivalent if the page is settling down (imagine a spinner loading some content e.t.c.).

But it looks redundant. I open page with waitUntil: domcontentloaded, before screenshot I wait until locator become visible and I disable animations before screenshot. Could the page be in an unstable state in this case? Looks like no.

Are there any low-hanging fruits that we can collect to significantly improve situation?

If you don't want to give up the logic of comparing the actual image and the previous one then we can at least just compare the buffers of actual and previous. If they are not equal then we compare them as png images.

But I would like these unnecessary comparisons not to be made or I could manage them by using some option.

aslushnikov commented 1 year ago

And I couldn’t understand the meaning of such a comparison. After watching the video and reading the documentation looks like there is a bug in the code.

@DudaGod The first comparison is a shortcut that happens only if the expected exists. This way we achieve a fast comparison for a "happy-path", i.e.:

the page is stable right away
the page didn't change, so the screenshots match

But it looks redundant. I open page with waitUntil: domcontentloaded, before screenshot I wait until locator become visible and I disable animations before screenshot. Could the page be in an unstable state in this case?

It can; domcontentloaded is racing against image and font loading, iframe loading, js-driven DOM modifications, and so on.

DudaGod commented 1 year ago

The first comparison is a shortcut that happens only if the expected exists. This way we achieve a fast comparison for a "happy-path", i.e.:

Okay and even in such a case, I still don’t understand why we compare actual and previous actual when we already have a diff in expected and actual. I mean this additional comparison won't change anything.

It can; domcontentloaded is racing against image and font loading, iframe loading, js-driven DOM modifications, and so on.

To my mind user should wait it explicitly using API. I mean that the hack of taking 2 actual screenshots in a row and then comparing them does not solve this problem (they may accidentally be equal).

yury-s commented 1 year ago

The first comparison is a shortcut that happens only if the expected exists. This way we achieve a fast comparison for a "happy-path", i.e.:

Okay and even in such a case, I still don’t understand why we compare actual and previous actual when we already have a diff in expected and actual. I mean this additional comparison won't change anything.

This is because we want to stop taking screenshots as soon as we get a stable one. The stable one may or may not match original one after a few iterations. E.g. it can be actual1 != actual2 == actual3 and in case actual3 != expected if we were comparing only to the expected image we'd continue the process.

aslushnikov commented 1 year ago

@DudaGod putting code-talk aside, is there anything specific that requires our attention? What do you suggest?

DudaGod commented 1 year ago

This is because we want to stop taking screenshots as soon as we get a stable one. The stable one may or may not match original one after a few iterations. E.g. it can be actual1 != actual2 == actual3 and in case actual3 != expected if we were comparing only to the expected image we'd continue the process.

Hm, I didn’t notice in the source code that at the end (if we found stable actual) another call to the areEqualScreenshots method is made with latest actual image (here).

I don't like this solution anyway. In our team, we implemented our own comparison command using our looks-same library and without performing unnecessary comparisons (imho).

putting code-talk aside, is there anything specific that requires our attention? What do you suggest?

I would add a comparison of buffers before converting them to PNG images. In some cases this can significantly save time.

yury-s commented 1 year ago

I don't like this solution anyway. In our team, we implemented our own comparison command using our looks-same library and without performing unnecessary comparisons (imho).

Which comparison you think is unnecessary if we want to stick with the algorithm that waits for the page screenshot to stabilize? Some of the comparisons could be sped-up but I don't see how we can skip them.

putting code-talk aside, is there anything specific that requires our attention? What do you suggest?

I would add a comparison of buffers before converting them to PNG images. In some cases this can significantly save time.

This would speedup things when the images are the same bitwise but in the bug description you mention that in your scenario first comparison with expected image fails, in that case the quick check for buffer equality would also fail and won't give you any performance gain or are you talking about a different use case?

DudaGod commented 1 year ago

Which comparison you think is unnecessary if we want to stick with the algorithm that waits for the page screenshot to stabilize?

I believe that stabilizing actual images in this way is a hack and only cover up the real user problem.

For example, imagine that the scroll bar is not hidden by default in pwt and I don’t have a scroll bar on expected image. But on every first screenshot of the actual image the scrollbar is present and does not match to the expected image. In one of the following current screenshots the scroll bar disappears. As a result, at some point new actual match with expected, but I spend too much time on this on runtime. Instead of just disabling scrollbars when see the problem on diff.

Here you can imagine any other example. I want to say that the user should be responsible for preparing the element for the screenshot.

This would speedup things when the images are the same bitwise but in the bug description you mention that in your scenario first comparison with expected image fails, in that case the quick check for buffer equality would also fail and won't give you any performance gain or are you talking about a different use case?

My problem was that the first comparison between expected and actual fails (height of screen 5k px) and next comparison between new actual and previous actual timed out (although the pictures were identical by buffer). So if we compare previous actual with new actual by buffers then we can save time.

yury-s commented 1 year ago

Discussed this in a meeting today, things that we can implement:

do buffer comparison as a fast check before parsing png
introduce separate timeout config for toHaveScreenshot (currently it is the same as TestConfig.expect: { timeout: 100 })
supporttimeout: 1 for one-shot snapshots without waiting for stable images

dgozman commented 1 year ago

@DudaGod Would you like to take a stab at the above changes?

DudaGod commented 1 year ago

@DudaGod Would you like to take a stab at the above changes?

Yes.

anoblet commented 1 year ago

This is something we could definitely use.

drupal-spider commented 1 month ago

Came across same issue when compared a very long page (https://www.dss.gov.au/our-responsibilities/disability-and-carers/publications-articles/carer-payment-child-a-new-approach-report-of-the-carer-payment-child-review-taskforce)

Environment

macOS 15.0.1 CPU: Apple M1 Max Memory: 32 GB Store: 929.95 GB available

Node.js v22.9.0 NPM 10.8.3 @playwright/test@1.48.1 Chromium Version 130.0.6723.31 (Developer Build) (arm64)

Reproduce code:

import { expect, Locator, test } from '@playwright/test'

test('temp', async ({ page }) => {
    await page.goto('https://www.dss.gov.au/our-responsibilities/disability-and-carers/publications-articles/carer-payment-child-a-new-approach-report-of-the-carer-payment-child-review-taskforce');
    await expect(page).toHaveScreenshot('temp.png', { maxDiffPixels: 16, fullPage: true})
  });

Result

Test with webkit successe in 8s Config:

{
  name: 'webkit',
  use: { ...devices['Desktop Safari'] },
},

Test with chromium or firefox Fail due to timeout after 60s

Error with Chromium:

Tearing down "context" exceeded the test timeout of 60000ms.

Error: Test ended.
Browser logs:

<launching> /Users/Library/Caches/ms-playwright/chromium-1140/chrome-mac/Chromium.app/Contents/MacOS/Chromium --disable-field-trial-config --disable-background-networking --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-back-forward-cache --disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-component-update --no-default-browser-check --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-features=ImprovedCookieControls,LazyFrameLoading,GlobalMediaControls,DestroyProfileOnBrowserClose,MediaRouter,DialMediaRouteProvider,AcceptCHFrame,AutoExpandDetailsElement,CertificateTransparencyComponentUpdater,AvoidUnnecessaryBeforeUnloadCheckSync,Translate,HttpsUpgrades,PaintHolding,ThirdPartyStoragePartitioning,LensOverlay,PlzDedicatedWorker --allow-pre-commit-input --disable-hang-monitor --disable-ipc-flooding-protection --disable-popup-blocking --disable-prompt-on-repost --disable-renderer-backgrounding --force-color-profile=srgb --metrics-recording-only --no-first-run --enable-automation --password-store=basic --use-mock-keychain --no-service-autorun --export-tagged-pdf --disable-search-engine-choice-screen --unsafely-disable-devtools-self-xss-warnings --enable-use-zoom-for-dsf=false --use-angle --headless=old --hide-scrollbars --mute-audio --blink-settings=primaryHoverType=2,availableHoverTypes=2,primaryPointerType=4,availablePointerTypes=4 --no-sandbox --user-data-dir=/var/folders/6c/8_h315d90n3d2tq_zbww5g180000gn/T/playwright_chromiumdev_profile-AhlOVs --remote-debugging-pipe --no-startup-window
<launched> pid=28687
[pid=28687][err] Old Headless mode will be removed from the Chrome binary soon. Please use the new Headless mode (https://developer.chrome.com/docs/chromium/new-headless) or the chrome-headless-shell which is a standalone implementation of the old Headless mode (https://developer.chrome.com/blog/chrome-headless-shell).
[pid=28687][err] 
[pid=28687][err] [1017/001545.858826:WARNING:viz_main_impl.cc(85)] VizNullHypothesis is disabled (not a warning)
[pid=28687][err] [1017/001545.861467:ERROR:cv_display_link_mac.mm(238)] CVDisplayLinkCreateWithCGDisplay failed. CVReturn: -6670
[pid=28687][err] [1017/001545.861526:ERROR:cv_display_link_mac.mm(238)] CVDisplayLinkCreateWithCGDisplay failed. CVReturn: -6670
[pid=28687][err] [1017/001545.861571:ERROR:cv_display_link_mac.mm(238)] CVDisplayLinkCreateWithCGDisplay failed. CVReturn: -6670
[pid=28687][err] [1017/001545.868528:ERROR:gl_display.cc(497)] EGL Driver message (Error) eglQueryDeviceAttribEXT: Bad attribute.
[pid=28687][err] [1017/001545.879873:ERROR:gl_display.cc(497)] EGL Driver message (Error) eglQueryDeviceAttribEXT: Bad attribute.
[pid=28687][err] [1017/001545.970191:ERROR:gl_display.cc(497)] EGL Driver message (Error) eglQueryDeviceAttribEXT: Bad attribute.
[pid=28687][err] [1017/001546.520344:ERROR:gl_display.cc(497)] EGL Driver message (Error) eglQueryDeviceAttribEXT: Bad attribute.
[pid=28687][err] [1017/001546.639660:ERROR:gl_display.cc(497)] EGL Driver message (Error) eglQueryDeviceAttribEXT: Bad attribute.
[pid=28687][err] [1017/001546.685691:ERROR:gl_display.cc(497)] EGL Driver message (Error) eglQueryDeviceAttribEXT: Bad attribute.
[pid=28687][err] [1017/001546.690571:ERROR:gl_display.cc(497)] EGL Driver message (Error) eglQueryDeviceAttribEXT: Bad attribute.
[pid=28687][err] [1017/001547.009247:ERROR:gl_display.cc(497)] EGL Driver message (Error) eglQueryDeviceAttribEXT: Bad attribute.
[pid=28687][err] [1017/001547.033339:FATAL:skia_output_surface_impl.cc(1279)] 
[pid=28687][err]   surface_size=1280x89362
[pid=28687][err]   format=6
[pid=28687][err]   color_type=6
[pid=28687][err]   backend_format.isValid()=1
[pid=28687][err]   backend_format.backend()=0
[pid=28687][err]   GrBackendFormats::AsGLFormat(backend_format)=6
[pid=28687][err]   sample_count=1
[pid=28687][err]   surface_origin=0
[pid=28687][err]   willGlFBO0=0
[pid=28687][err] 0   Chromium Framework                  0x00000001188cf204 base::debug::CollectStackTrace(base::span<void const*, 18446744073709551615ul, void const**>) + 28
[pid=28687][err] 1   Chromium Framework                  0x00000001188c2d50 base::debug::StackTrace::StackTrace() + 100
[pid=28687][err] 2   Chromium Framework                  0x000000011882f5fc logging::LogMessage::Flush() + 152
[pid=28687][err] 3   Chromium Framework                  0x0000000118830200 logging::LogMessageFatal::~LogMessageFatal() + 12
[pid=28687][err] 4   Chromium Framework                  0x0000000118830218 logging::LogMessageFatal::~LogMessageFatal() + 0
[pid=28687][err] 5   Chromium Framework                  0x000000011e05a45c viz::SkiaOutputSurfaceImpl::Reshape(viz::OutputSurface::ReshapeParams const&) + 0
[pid=28687][err] 6   Chromium Framework                  0x000000011e05a004 viz::SkiaOutputSurfaceImpl::RecreateRootDDLRecorder() + 204
[pid=28687][err] 7   Chromium Framework                  0x000000011e05a77c viz::SkiaOutputSurfaceImpl::Reshape(viz::OutputSurface::ReshapeParams const&) + 800
[pid=28687][err] 8   Chromium Framework                  0x000000011e08fb54 viz::DirectRenderer::DrawFrame(std::__Cr::vector<std::__Cr::unique_ptr<viz::AggregatedRenderPass, std::__Cr::default_delete<viz::AggregatedRenderPass>>, std::__Cr::allocator<std::__Cr::unique_ptr<viz::AggregatedRenderPass, std::__Cr::default_delete<viz::AggregatedRenderPass>>>>*, float, gfx::Size const&, gfx::DisplayColorSpaces const&, std::__Cr::vector<gfx::Rect, std::__Cr::allocator<gfx::Rect>>) + 2268
[pid=28687][err] 9   Chromium Framework                  0x000000011e0cbdec viz::Display::DrawAndSwap(viz::DrawAndSwapParams const&) + 6888
[pid=28687][err] 10  Chromium Framework                  0x000000011e0d1ea0 viz::DisplayScheduler::DrawAndSwap() + 436
[pid=28687][err] 11  Chromium Framework                  0x000000011e0d11e0 viz::DisplayScheduler::OnBeginFrameDeadline() + 60
[pid=28687][err] 12  Chromium Framework                  0x00000001144dcca4 base::internal::Invoker<base::internal::FunctorTraits<void (net::(anonymous namespace)::DnsHTTPAttempt::*&&)(), base::WeakPtr<net::(anonymous namespace)::DnsHTTPAttempt>&&>, base::internal::BindState<true, true, false, void (net::(anonymous namespace)::DnsHTTPAttempt::*)(), base::WeakPtr<net::(anonymous namespace)::DnsHTTPAttempt>>, void ()>::RunOnce(base::internal::BindStateBase*) + 88
[pid=28687][err] 13  Chromium Framework                  0x00000001188b5858 base::DeadlineTimer::OnScheduledTaskInvoked() + 68
[pid=28687][err] 14  Chromium Framework                  0x0000000118878b18 base::TaskAnnotator::RunTaskImpl(base::PendingTask&) + 320
[pid=28687][err] 15  Chromium Framework                  0x0000000118891b50 base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWorkImpl(base::LazyNow*) + 744
[pid=28687][err] 16  Chromium Framework                  0x0000000118891704 base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWork() + 100
[pid=28687][err] 17  Chromium Framework                  0x00000001188d90d0 base::MessagePumpCFRunLoopBase::RunWork() + 128
[pid=28687][err] 18  Chromium Framework                  0x00000001188d4740 base::apple::CallWithEHFrame(void () block_pointer) + 16
[pid=28687][err] 19  Chromium Framework                  0x00000001188d876c base::MessagePumpCFRunLoopBase::RunDelayedWorkTimer(__CFRunLoopTimer*, void*) + 68
[pid=28687][err] 20  CoreFoundation                      0x0000000196260184 __CFRUNLOOP_IS_CALLING_OUT_TO_A_TIMER_CALLBACK_FUNCTION__ + 32
[pid=28687][err] 21  CoreFoundation                      0x000000019625fe28 __CFRunLoopDoTimer + 1012
[pid=28687][err] 22  CoreFoundation                      0x000000019625f938 __CFRunLoopDoTimers + 356
[pid=28687][err] 23  CoreFoundation                      0x00000001962450f0 __CFRunLoopRun + 1856
[pid=28687][err] 24  CoreFoundation                      0x0000000196244334 CFRunLoopRunSpecific + 572
[pid=28687][err] 25  Foundation                          0x0000000197402918 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212
[pid=28687][err] 26  Chromium Framework                  0x00000001188d9528 base::MessagePumpNSRunLoop::DoRun(base::MessagePump::Delegate*) + 156
[pid=28687][err] 27  Chromium Framework                  0x00000001188d8158 base::MessagePumpCFRunLoopBase::Run(base::MessagePump::Delegate*) + 204
[pid=28687][err] 28  Chromium Framework                  0x00000001188921c4 base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::Run(bool, base::TimeDelta) + 352
[pid=28687][err] 29  Chromium Framework                  0x000000011885cf6c base::RunLoop::Run(base::Location const&) + 400
[pid=28687][err] 30  Chromium Framework                  0x00000001188aedb0 base::Thread::Run(base::RunLoop*) + 60
[pid=28687][err] 31  Chromium Framework                  0x00000001188aef40 base::Thread::ThreadMain() + 384
[pid=28687][err] 32  Chromium Framework                  0x00000001188c1cf0 base::(anonymous namespace)::ThreadFunc(void*) + 188
[pid=28687][err] 33  libsystem_pthread.dylib             0x000000019615f2e4 _pthread_start + 136
[pid=28687][err] 34  libsystem_pthread.dylib             0x000000019615a0fc thread_start + 8
[pid=28687][err] Task trace:
[pid=28687][err] 0   Chromium Framework                  0x000000011e0d299c viz::DisplayScheduler::ScheduleBeginFrameDeadline() + 292
[pid=28687][err] 1   Chromium Framework                  0x0000000119e98650 viz::DelayBasedTimeSource::PostNextTickTask(base::TimeTicks) + 296
[pid=28687][err] 2   Chromium Framework                  0x0000000119e98650 viz::DelayBasedTimeSource::PostNextTickTask(base::TimeTicks) + 296
[pid=28687][err] 3   Chromium Framework                  0x0000000119e98650 viz::DelayBasedTimeSource::PostNextTickTask(base::TimeTicks) + 296
[pid=28687][err] 4   Chromium Framework                  0x0000000119e98650 viz::DelayBasedTimeSource::PostNextTickTask(base::TimeTicks) + 296
[pid=28687][err] Task trace buffer limit hit, update PendingTask::kTaskBacktraceLength to increase.
[pid=28687][err] 
[pid=28687] <gracefully close start>
[pid=28687] <kill>
[pid=28687] <will force kill>

Slow test file: [chromium] › temp.spec.ts (1.1m) Consider splitting slow test files to speed up parallel execution

microsoft / playwright