microsoft / playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
https://playwright.dev
Apache License 2.0

[Bug]: Unable to extract large report.zip #29451

Closed rkhomi closed 7 months ago

rkhomi commented 8 months ago

Version

1.40.0

Steps to reproduce

I have a large report zip file (550 MB) after a CI run with a lot of failures.

The command npx playwright merge-reports --reporter html ./all-blob-reports failed with the following error:

Error: invalid local file header signature: 0x1b8e4947

Expected behavior

Should extract the large zip file

Actual behavior

Failed on the extracting zip file

Additional context

No response

Environment

System:
    OS: macOS 13.2.1
    CPU: (10) arm64 Apple M1 Max
    Memory: 2.48 GB / 32.00 GB
  Binaries:
    Node: 20.11.0 - ~/.nvm/versions/node/v20.11.0/bin/node
    Yarn: 1.22.21 - ~/.nvm/versions/node/v20.11.0/bin/yarn
    npm: 10.2.4 - ~/.nvm/versions/node/v20.11.0/bin/npm
  IDEs:
    VSCode: 1.86.0 - /usr/local/bin/code
  Languages:
    Bash: 3.2.57 - /bin/bash
  npmPackages:
    @playwright/test: ^1.40.0 => 1.40.1

rkhomi commented 8 months ago

image

mxschmitt commented 8 months ago

This looks unexpected! Is this reproducible all the time? Would it be possible to share a reproduction with us?

Do you transfer the zip files between different systems?

rkhomi commented 8 months ago

> This looks unexpected! Is this reproducible all the time? Would it be possible to share a reproduction with us?
>
> Do you transfer the zip files between different systems?

e2e_playwright:
    strategy:
      fail-fast: false
      matrix:
        shardIndex: [1, 2, 3, 4]
        shardTotal: [4]

    run: docker compose exec -T playwright /devops/sh/pipeline/e2e_playwright.sh ${{ matrix.shardIndex }}/${{ matrix.shardTotal }} ${{ matrix.shardIndex }}

In e2e_playwright.sh:

yarn playwright test --project=mobile-"$SHARD_INDEX" --project=desktop-"$SHARD_INDEX"

In playwright.config.ts:

reporter: process.env.APP_ENV === 'pipeline' ? 'blob' : 'html',

It happened once in CI. As far as I understand from my testing:

I have 4 parallel GitHub jobs (matrix) and I was trying to run 2 projects in each job. What I saw is that a run deletes or overwrites the zip file generated by the previous run, so the file might have been corrupted during this process.

Now I have changed my script to see how it goes:

export PWTEST_BLOB_DO_NOT_REMOVE=1
PWTEST_BLOB_REPORT_NAME=job-"$SHARD_INDEX" yarn playwright test --project=mobile-"$SHARD_INDEX" --project=desktop-"$SHARD_INDEX"

jame-earnin commented 8 months ago

After upgrading from 1.40.2 to 1.41.2, I got a similar issue. I can confirm that 1.40.2 works fine but 1.41.2 does not, so I reverted Playwright to 1.40.2 as a workaround. With 1.41.2 I randomly got one of the two errors below.

After upgrading to 1.41.2:

extracting: blob-report/xxx/blob-report/report-1.zip
Error: invalid local file header signature: 0x55f8eb30
    at /runner/_work/xxx/node_modules/playwright-core/lib/zipBundleImpl.js:1:30005
    at /runner/_work/xxx/node_modules/playwright-core/lib/zipBundleImpl.js:1:31700
    at /runner/_work/xxx/node_modules/playwright-core/lib/zipBundleImpl.js:1:17277
    at FSReqCallback.wrapper [as oncomplete] (node:fs:677:5)
Error: Process completed with exit code 1.

Sometimes it did not return an error at all and moved on to the next command (the mv command):

Start merge app
echo "export default { testDir: 'blob-report', reporter: [['html', { open: 'never' }]], };" > merge.config.ts
# find blob-report/app-e2e/blob-report before
blob-report/app-e2e/blob-report
blob-report/app-e2e/blob-report/report-1.zip
blob-report/app-e2e/blob-report/report-2.zip
# ./node_modules/.bin/playwright merge-reports --config=merge.config.ts ./blob-report/$project-e2e/blob-report
merging reports from /runner/_work/xxx/blob-report/app-e2e/blob-report
extracting: blob-report/app-e2e/blob-report/report-1.zip
...
# find blob-report/app-e2e/blob-report after
blob-report/app-e2e/blob-report/report-1.zip
blob-report/app-e2e/blob-report/report-2.zip
blob-report/app-e2e/blob-report/resources
blob-report/app-e2e/blob-report/report.jsonl
# mv playwright-report ./$project-html-report
mv: cannot stat 'playwright-report': No such file or directory

It produced report.jsonl and resources but no HTML report, which is super weird. There is also no playwright-report folder afterwards.
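A merge step that exits without an error but leaves no report behind can be caught before the mv runs. The following is only a sketch: require_report is a hypothetical helper, not a Playwright command, and it assumes the merged HTML report is expected in the playwright-report folder, as in the log above.

```shell
# Hypothetical guard, not a Playwright command: fail when the directory
# that should hold the merged report is missing or empty.
require_report() {
  dir="$1"
  if [ ! -d "$dir" ] || [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo "merged report not found in: $dir" >&2
    return 1
  fi
}
```

Calling require_report playwright-report right after the merge-reports command would make the job stop at the merge step instead of failing later in mv.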

mxschmitt commented 8 months ago

While looking at https://github.com/microsoft/playwright/compare/v1.40.0...v1.41.2 it could be caused by dc8ecc3ca404b211815c5541dfea6e59dbd19b9a.

@jame-earnin how large is your zip / project? Would it be possible to share a reproduction?

jame-earnin commented 8 months ago

image @mxschmitt

mxschmitt commented 8 months ago

While it shows that the blob files are not extremely large, we unfortunately still need reproduction steps to understand what is happening.

To summarise:

> It produced report.jsonl and resources but no HTML report, which is super weird. There is also no playwright-report folder afterwards.

These are intermediate files; it's expected that they get created.

angelo-loria commented 8 months ago

I'm seeing the same thing as @jame-earnin. It's not every time and it's hard to reproduce. I believe my issues started after updates to my sharding Actions workflow following the breaking changes introduced with actions/upload-artifact V4 and actions/download-artifact V4, and at first I thought it was an issue with the actions/download-artifact failing on large file sizes. I'm on Playwright 1.41.1.

This is the output from an Actions run that had three shards in it:

Current runner version: '2.313.0'
Operating System
 Ubuntu
 22.04.3
 LTS
Run actions/download-artifact@v4
Found 8 artifact(s)
Filtering artifacts by pattern 'all-blob-reports-*'
Preparing to download the following artifacts:
- all-blob-reports-2-program (ID: 1243390861, Size: 3158804)
- all-blob-reports-3-program (ID: 1243390177, Size: 628)
- all-blob-reports-1-program (ID: 1243390167, Size: 5242)
- all-blob-reports-2-review (ID: 1243390152, Size: 629)
- all-blob-reports-2-communication (ID: 1243389754, Size: 631)
- all-blob-reports-3-communication (ID: 1243389039, Size: 628)
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/e9e215e7-753c-4683-9c61-37cb5343ed58/workflow-job-run-fd82a3b8-38ce-5edb-b762-7ae5d074c0f5/artifacts/56d0db96bc72c2b9c9ef39089ac1e3a03d7350ea10380501b9f8331c633c8b56.zip
Starting download of artifact to: /home/runner/work/project/playwright/all-blob-reports
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/e9e215e7-753c-4683-9c61-37cb5343ed58/workflow-job-run-d9a8428a-a04d-57fd-b073-bc6d7d850202/artifacts/debdab4df3055f16c055dbe9aafbdc5363cb84e8e1fb6c11f098f3b0c4be6738.zip
Starting download of artifact to: /home/runner/work/project/playwright/all-blob-reports
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/e9e215e7-753c-4683-9c61-37cb5343ed58/workflow-job-run-bdfb5981-f02f-583f-ab5a-5a5bec73e8ff/artifacts/0ec1883a156ab10ee5e433ad534291e3ad403b45d2d73246ed041c628ee45538.zip
Starting download of artifact to: /home/runner/work/project/playwright/all-blob-reports
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/e9e215e7-753c-4683-9c61-37cb5343ed58/workflow-job-run-f8c8d5b3-3e72-5f79-ccbe-15b471e4ad/artifacts/a827eec769ddb9eb0f85c8e5cfdd5989a319dfb92ee36bd4068e880e325e891a.zip
Starting download of artifact to: /home/runner/work/project/playwright/all-blob-reports
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/e9e215e7-753c-4683-9c61-37cb5343ed58/workflow-job-run-f03aea1c-12fa-59-3f20-2c0fe93100a4/artifacts/f3b44b6b1530be177bb79843a1456bb12075c193d638cb91b345093f342e84.zip
Starting download of artifact to: /home/runner/work/project/playwright/all-blob-reports
Redirecting to blob download url: https://productionresultssa10.blob.core.windows.net/actions-results/e9e215e7-753c-4683-9c61-37cb5343ed58/workflow-job-run-d5c870aa-6533-58f6-5b65-8ac822d71d/artifacts/2ee284d526cf8417ff0e2ed66a72a000c8b97318ed9c46f46a1c5d34dcc6d932.zip
Starting download of artifact to: /home/runner/work/project/playwright/all-blob-reports
(node:1804) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
Artifact download completed successfully.
Artifact download completed successfully.
Artifact download completed successfully.
Artifact download completed successfully.
Artifact download completed successfully.
Artifact download completed successfully.
Total of 6 artifact(s) downloaded
Download artifact has finished successfully

Run PLAYWRIGHT_JUNIT_OUTPUT_NAME=junit.xml npx playwright merge-reports --reporter=junit ./all-blob-reports
merging reports from /home/runner/work/project/playwright/all-blob-reports
extracting: all-blob-reports/report-1.zip
extracting: all-blob-reports/report-2.zip
Error: invalid local file header signature: 0x0
    at /home/runner/work/project/playwright/node_modules/playwright-core/lib/zipBundleImpl.js:1:30005
    at /home/runner/work/project/playwright/node_modules/playwright-core/lib/zipBundleImpl.js:1:31700
    at /home/runner/work/project/playwright/node_modules/playwright-core/lib/zipBundleImpl.js:1:17701
    at FSReqCallback.wrapper [as oncomplete] (node:fs:686:5)
Error: Process completed with exit code 1

This is an example of another run where the merge step reports success but the following step, which looks for the merged report, fails:

Run PLAYWRIGHT_JUNIT_OUTPUT_NAME=junit.xml npx playwright merge-reports --reporter=junit ./all-blob-reports
merging reports from /home/runner/work/project/playwright/all-blob-reports
extracting: all-blob-reports/report-1.zip

Run dorny/test-reporter@v1
Check runs will be created with SHA=5a920bcd1ee374943b53ff4f5413bd93db695a
Listing all files tracked by git
Found 89 files tracked by GitHub
Using test report parser 'java-junit'
Creating test report Playwright Test Report review-board
Error: No test report files were found

Unfortunately this is in a giant project at work that I can't post here but I'll try to get some reproducible code up.

angelo-loria commented 8 months ago

Well, I am unable to reproduce this issue in a personal project of mine. I also reverted the project in which this is happening to 1.40.0 and 1.40.2, and I experience the issue in both versions. I've tried larger Ubuntu runners (8-core) with no luck.

yury-s commented 8 months ago

> I have 4 parallel GitHub jobs (matrix) and I was trying to run 2 projects in each job. What I saw is that a run deletes or overwrites the zip file generated by the previous run, so the file might have been corrupted during this process.

Sounds like there is an issue with the zip files. It's unclear whether the problem is caused by the parallel jobs configuration, the upgrade of upload/download-artifact to v4, some resource constraint (files that are too big), or something in Playwright. We need a repro to take action on this. If you see the error on one of the files, it's likely a broken zip, and extracting it manually will also fail.

yury-s commented 7 months ago

We need more information to act on this report. Please file a new one and link to this issue when you get back to it!

FranciscoKnebel commented 7 months ago

I have run into this exact issue, after upgrading to using upload/download-artifacts v4 and using merge-reports. I'm not able to share more code or the artifacts themselves.

These are the artifacts generated: image image image

The workflow file runs a very specific scenario: I shard into multiple jobs based on a spec list (an array of strings containing the paths to the specs) and run with max-parallel set to 1 to guarantee that the blocks of specs run sequentially. It contains the following steps:

jobs:
  test:
    timeout-minutes: 1440
    runs-on: self-hosted
    strategy:
      fail-fast: false
      max-parallel: 1
      matrix:
        shard: ${{ fromJSON(github.event.inputs.specs) }}
    steps:
    - uses: actions/checkout@v4
    - name: Setup Node
      uses: actions/setup-node@v4
      with:
        node-version: 20
    - name: Install dependencies
      run: npm ci
    - name: Install Playwright Browsers
      run: npx playwright install --with-deps
    - name: Run Playwright tests
      run: npx playwright test ${{ matrix.shard }} -c playwright.config.ts
    - name: Get random number to use in report
      if: always()
      id: generate_number
      run: echo "random_number=$(echo $RANDOM)" >> $GITHUB_OUTPUT
      shell: bash
    - name: Upload report blob
      uses: actions/upload-artifact@v4
      if: always()
      with:
        name: blob-reports-${{ steps.generate_number.outputs.random_number }}
        path: blob-report
        retention-days: 1
  merge-reports:
    runs-on: self-hosted
    if: always()
    needs: [test]
    steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: 20
    - name: Install dependencies
      run: npm ci
    - name: Download blob reports from GitHub Actions Artifacts
      uses: actions/download-artifact@v4
      with:
        pattern: blob-reports-*
        path: all-blob-reports
        merge-multiple: true
    - name: Merge into HTML Report
      run: npx playwright merge-reports --reporter html ./all-blob-reports # outputs to playwright-report folder
  ...

The test job runs a matrix that generates N blob-reports-RANDOMNUMBER artifacts. merge-reports then downloads the artifacts matching that pattern, with merge-multiple enabled.

But merging fails, with the exact same problem as other users mentioned:

image

I read that this error might occur when trying to unzip something that is not a zip file.
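One quick way to test that suspicion is to look at the first bytes of each file. In the ZIP format a local file header starts with the signature 0x04034b50, stored on disk as the bytes 50 4b 03 04 ("PK\003\004"), and an empty archive starts with 50 4b 05 06. The sketch below uses a hypothetical helper, looks_like_zip, and only inspects the start of the file. The error in this thread is raised while reading an entry's header at its recorded offset, so a file can pass this check and still fail to extract.

```shell
# Hypothetical helper: succeed only if the file starts with a ZIP signature.
# 50 4b 03 04 ("PK\003\004") is a local file header,
# 50 4b 05 06 ("PK\005\006") is the end record of an empty archive.
looks_like_zip() {
  # Read the first four bytes and render them as hex without separators.
  magic="$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')"
  case "$magic" in
    504b0304 | 504b0506) return 0 ;;
    *) return 1 ;;
  esac
}
```

Running it over all-blob-reports/*.zip before merge-reports would show whether the downloaded files are archives at all.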

Downloading the files locally, extracting them into the all-blob-reports folder while renaming the "report.zip" files to "report.zip", "report (1).zip", "report (2).zip" and so on, and then running npx playwright merge-reports --reporter html ./all-blob-reports works. It just does not work in CI.
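The manual renaming can be sketched as a small function. It assumes each downloaded artifact sits in its own subfolder of a source directory. The function name flatten_reports, that layout, and the report-N.zip naming are illustrative choices, not something Playwright or download-artifact prescribes.

```shell
# Hypothetical helper: copy every zip found one level below "$1" into "$2",
# giving each copy a unique name so identically named files do not collide.
flatten_reports() {
  src="$1"
  dest="$2"
  n=0
  mkdir -p "$dest"
  for zip in "$src"/*/*.zip; do
    [ -e "$zip" ] || continue   # the glob matched nothing
    n=$((n + 1))
    cp "$zip" "$dest/report-$n.zip"
  done
  echo "$n"                     # number of files copied
}
```

For example, flatten_reports downloaded all-blob-reports would print how many blobs were collected, which can be compared with the number of shards.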

These artifact sizes are normal; I have opened reports before that were 3x their accumulated size.

FranciscoKnebel commented 7 months ago

image image

Tested with smaller blobs, same result, error on the same lines. The extraction changed with download-artifact@v4, so perhaps the extracted files are now no longer zipped, and it's trying to unzip something that is not a zip file?

FranciscoKnebel commented 7 months ago

error is thrown here, in openReadStream, in playwright-core/lib/zipBundleImpl.js:

image

mxschmitt commented 7 months ago

Some recent investigation on our end showed that this might be caused by the following:

  • Run gets cancelled
  • npx playwright test did not finish writing all the blobs
  • if: always() kicks in and uploads a broken blob
  • From there on things don't work anymore.

as per here we should change

if: always()

to

if: ${{ !cancelled() }}

which should fix this issue. We would appreciate it if you could test it before we roll it out across docs/create-playwright. It's a very early investigation, so we haven't even tried it, but it looks promising. Thanks!
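If a cancelled run really leaves a half-written blob behind, the file would most likely be missing the end-of-central-directory record that closes every ZIP archive (signature bytes 50 4b 05 06, located within the last 22 + 65535 bytes). The sketch below is a heuristic under that assumption. The helper name has_zip_end_record is hypothetical, and a positive result only shows that the writer reached the end record, not that the contents are valid.

```shell
# Hypothetical pre-upload check: look for the end-of-central-directory
# signature (bytes 50 4b 05 06) in the tail of the file. The record is at most
# 22 + 65535 bytes long, so only the last 65557 bytes need scanning.
has_zip_end_record() {
  # Dump the tail as space-separated hex bytes on a single line, so the
  # pattern can only match on byte boundaries.
  tail -c 65557 "$1" \
    | od -An -v -tx1 \
    | tr '\n' ' ' \
    | tr -s ' ' \
    | grep -q '50 4b 05 06'
}
```

A step that runs this on blob-report/*.zip before the upload could skip or fail the upload when the record is missing.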

FranciscoKnebel commented 7 months ago

> Some recent investigation on our end showed that this might be caused by the following:
>
>   • Run gets cancelled
>   • npx playwright test did not finish writing all the blobs
>   • if: always() kicks in and uploads a broken blob
>   • From there on things don't work anymore.
>
> as per here we should change
>
> if: always()
>
> to
>
> if: ${{ !cancelled() }}
>
> which should fix this issue. We would appreciate it if you could test it before we roll it out across docs/create-playwright. It's a very early investigation, so we haven't even tried it, but it looks promising. Thanks!

Thanks for the investigation. I ran some tests to confirm this, and it no longer fails. However, there might be another problem with merging the reports. The following test was split into 3 blobs:

image

To run the tests, I updated the actions/upload-artifact@v4 step to use !cancelled():

test:
  (...)
    - name: Install Playwright Browsers
      run: npx playwright install --with-deps
    - name: Run Playwright tests on demo
      run: npx playwright test ${{ matrix.shard }} -c playwright.config.ts
    - name: Upload report blob
      uses: actions/upload-artifact@v4
      if: ${{ !cancelled() }}
      with:
        name: blob-reports-${{ steps.generate_number.outputs.random_number }}
        path: blob-report
        retention-days: 1

And on the second job, to merge the reports:

merge-reports:
    runs-on: self-hosted
    permissions:
      id-token: write
      contents: read
      pull-requests: write
    if: ${{ !cancelled() }}
    needs: [test]
    steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: 20
    - name: Install dependencies
      run: npm ci
    - name: Download blob reports from GitHub Actions Artifacts
      uses: actions/download-artifact@v4
      with:
        pattern: blob-reports-*
        path: all-blob-reports
        merge-multiple: true
    - name: Merge into HTML Report
      run: npx playwright merge-reports --reporter html ./all-blob-reports # outputs to playwright-report folder
      (...)
    - name: Upload HTML report to S3
      run: aws s3 sync ./playwright-report s3://automation-reports.ggoutfitters.com/playwright/${{ github.run_id }}/

Comparing the first image with the local execution in https://github.com/microsoft/playwright/issues/29451#issuecomment-1973228011, it looks like the merge didn't finish before the job moved on to the next step.

In the second image, you can see the download of the artifacts and that it started to run the merge-reports command, but the playwright-report folder wasn't created.

I'm trying again and going to list the contents in the directory before the final report upload attempt, to see if it's reproducible.

EDIT:

Another attempt, now listing directory content: image

FranciscoKnebel commented 7 months ago

Hey @mxschmitt Thanks for reopening this issue, it's really important to me that this gets fixed so we can start using the test reports again. I took a look in the referenced PRs and tested the zip integrity as well:

image

Maybe this problem is with the merge-multiple: true option used with download-artifact@v4 ?

Did a follow-up test without the merge-multiple option:

    - name: Download blob reports from GitHub Actions Artifacts
      uses: actions/download-artifact@v4
      with:
        pattern: blob-reports-*
        path: all-blob-reports
    - name: Report Integrity check
      shell: bash
      run: |
        for file in all-blob-reports/*.zip; do
          unzip -t "$file"
        done
    - name: Merge into HTML Report
      run: npx playwright merge-reports --reporter html ./all-blob-reports

and the integrity check passed: image

I'm doing another test next, just need to have unar installed in my self-hosted runner, where I'll do the merging manually. Locally it worked, going to confirm if this works in the runner:

    - name: Download blob reports from GitHub Actions Artifacts
      uses: actions/download-artifact@v4
      with:
        pattern: blob-reports-*
        path: all-blob-reports
    - name: Report Integrity check
      shell: bash
      run: |
        for file in all-blob-reports/*.zip; do
          unzip -t "$file"
        done
    - name: Extract blob artifacts
      shell: bash
      run: |
        for z in all-blob-reports/*.zip; do unar -r "$z" -o reports; done
    - name: Merge into HTML Report
      if: ${{ !cancelled() }}
      run: npx playwright merge-reports --reporter html ./reports # outputs to playwright-report folder

This will download all the blobs into all-blob-reports. Those will be multiple blob-reports-#.zip, each containing a report.zip file. Haven't found a way using unzip to handle renaming of these zip files, but unar does. for z in all-blob-reports/*.zip; do unar -r "$z" -o reports; done will extract all the artifact files, saving report.zip, report-1.zip, report-2.zip and so on, which playwright merge-reports handles.


I'll confirm if this worked.

mxschmitt commented 7 months ago

Maybe related to https://github.com/actions/download-artifact/issues/298

mxschmitt commented 7 months ago

I tried to reproduce with this workflow, but was not able to: https://github.com/mxschmitt/test/actions/runs/8397555420/job/23001044584