codecov / engineering-team

This is a general repo to use with GH Projects

Addressing CI Reporting and Use of Upload/Error Status #2220

Open codecovdesign opened 3 months ago

codecovdesign commented 3 months ago

Problem to solve

Codecov displayed in the UI that the upload for one of the reports failed, but the CI did not fail and exited successfully. Additionally, they received the pull request comment with coverage data despite the upload failure. This raises the question of whether this could be a processing failure rather than an error in the upload step.

Furthermore, in another reported example, an engineer pushed a commit with a future date (Aug 1st) because their system clock was set ahead. This caused Codecov to show the latest upload as occurring in the future. The customer is requesting a way to delete the offending commits from Codecov.

issue 1, upload is shown with X:

Image

then user lands on PR page and sees CI passed, but no data:

Image

Summary of areas to investigate:

related issue: https://github.com/codecov/engineering-team/issues/2442

Solution

Investigation note: https://github.com/codecov/engineering-team/issues/2220#issuecomment-2344838878

Figma link: https://www.figma.com/design/4Z7yb2dkIIATkfzpWoMYQq/GH-2220?node-id=1-2

Image

We are going to decouple the fixes in the issues below so we can ship some of the designs sooner.

### Implementation tasks
- [ ] https://github.com/codecov/engineering-team/issues/2521
- [ ] https://github.com/codecov/engineering-team/issues/2522
- [ ] https://github.com/codecov/engineering-team/issues/2523
- [ ] https://github.com/codecov/engineering-team/issues/2524

codecovdesign commented 3 months ago

Sync with @Adal3n3 @drazisil-codecov @vlad-ko

Related issues

codecovdesign commented 2 months ago

related: https://github.com/codecov/engineering-team/issues/2400

codecovdesign commented 2 months ago

Related feedback internally

This screen feels like a “work in progress”. You have the list of “view CI build” links. But in my case the CI succeeded and the upload of the coverage report also succeeded; only the processing of the upload failed (because the report had broken stuff in it). Also, I can download the uploaded reports on the right side, but I cannot see how they relate to the CI build that did that upload. It would be cooler to have one list that shows me which upload was broken, with a link to download the upload and a link to the corresponding CI build. Oh, and the failed stuff is most of the time just a “still working on it”, so if you refresh the page a couple of minutes later, the failures are gone. It would be cool if it could say that it is still processing rather than failed.

Image

Adal3n3 commented 2 months ago

Meeting note w/ @calvin-codecov 9/6:

Question 1: Why did the upload fail, but we still show that the CI passed?

Image

Question 2: In this example, why do we show that the CI passed and that the PR status is "all modified lines are covered by tests ✅" even though the commit has a failed upload?

Image

codecovdesign commented 2 months ago

sync from sept 9th:

Adal3n3 commented 2 months ago

Investigation note:

  1. The “uploads” section is actually a “Coverage reports/uploads version history”, it contains both old and new uploads. If anyone from your team re-runs all jobs, a new version of the uploads will be generated. So any previous uploads with error messages represent older versions, and the errors are glued to those specific versions.
    • Suggestion: We need a mechanism to detect old vs. new uploads. This will require API and database work. As a user, I would expect to focus on the latest upload, so removing older uploads could be helpful.
  2. The CI status and upload status are working as intended. They are two different statuses with no direct relationship. The CI status comes from the commit/pull checks, which are unrelated to Codecov checks. Codecov considers all non-Codecov statuses to be CI statuses. This means your CI status is based on all your non-Codecov checks, so if your Codecov checks fail while other checks pass, you will still see a green “CI passed”. Notes 1 and 2 explain why you might have a passing CI even if your upload failed. Image Image
  3. This page primarily shows data from the very first job attempts, which is why we’ve received feedback about mismatched information between the UI, GitHub, and PR comments. GitHub always shows users the latest jobs, but the UI doesn’t update from the first run to the latest run. From a typical user’s perspective, when they see an error, they expect to visit the app, fix the error, and have the issue resolved. However, in the current app experience, Codecov doesn't reflect the re-run of all jobs. Users need to open a fresh commit to display their coverage reports. Suggestions:
    1. Consider supporting the “re-run all jobs”, and always display the latest job.
    2. or direct users to open a new commit to display the updated coverage report.
  4. Tricky case: The UI currently shows “Upload Failed” when the upload is still processing. Once the upload completes, the failure message disappears, which can be misleading.
    • Solution: Display an "Upload Processing" status while the upload is in progress. Image
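
To make the notes above concrete, here is a rough sketch of the three behaviors discussed: deriving the CI status from non-Codecov checks only (note 2), keeping only the newest upload per job after a re-run (suggestions in notes 1 and 3), and labeling in-progress uploads as processing instead of failed (note 4). Everything here is an assumption for illustration: the `Check` and `Upload` types, the `run_attempt` field, the state strings, and the `codecov/` context prefix used to recognize Codecov's own checks do not come from the actual schema or API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# All types and field names are hypothetical; they do not mirror the real schema.


@dataclass
class Check:
    context: str  # e.g. "ci/build" or "codecov/patch"
    state: str    # assumed values: "success", "failure", "pending"


@dataclass
class Upload:
    job_name: str     # CI job that produced the upload
    run_attempt: int  # 1 for the first run, 2+ for re-runs (assumed field)
    state: str        # assumed values: "processing", "processed", "error"


def ci_passed(checks: list[Check]) -> Optional[bool]:
    """Derive the CI status from non-Codecov checks only (note 2).

    Returns True/False, or None when nothing conclusive is known yet.
    """
    others = [c for c in checks if not c.context.startswith("codecov/")]
    if not others:
        return None
    if any(c.state == "failure" for c in others):
        return False
    if any(c.state == "pending" for c in others):
        return None
    return True


def latest_uploads(uploads: list[Upload]) -> list[Upload]:
    """Keep only the newest attempt per job (notes 1 and 3)."""
    newest: dict[str, Upload] = {}
    for upload in uploads:
        current = newest.get(upload.job_name)
        if current is None or upload.run_attempt > current.run_attempt:
            newest[upload.job_name] = upload
    return list(newest.values())


def display_label(upload: Upload) -> str:
    """Show in-progress uploads as processing rather than failed (note 4)."""
    labels = {
        "processing": "Upload Processing",
        "processed": "Upload Processed",
        "error": "Upload Failed",
    }
    return labels.get(upload.state, "Unknown")
```

With this shape, a failing `codecov/patch` check next to a passing build check still yields a passing CI status, which matches the behavior described in note 2, and an errored first attempt drops out of the list once a re-run for the same job arrives.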

(cc @calvin-codecov)

Adal3n3 commented 2 months ago

Existing upload status:

calvin-codecov commented 2 months ago

It would be helpful if we could figure out which uploads correspond to which job run number. Image (https://github.com/codecov/gazebo/actions/runs/10355997185)

Adal3n3 commented 2 months ago

From @vlad-ko

I think I've found another "bug/feature" situation with the CFF. If I have a flag that's designated as a CFF and I upload a broken report, we will sort of ignore it and just use the CFF version of the report. It could be a feature, but it could also be a problem, because the customer may not realize that we're not using the latest data.

ToDo: @calvin-codecov can you investigate this? If it's true, we need to either reject the broken CFF report, show an error message, and somehow indicate that we are not using the latest data, or see if we can support the latest data.
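
One possible shape for the "indicate we are not using the latest data" option, as a sketch only. It assumes the behavior @vlad-ko describes (a broken upload for a CFF flag falls back to the carried-forward report), which still needs to be confirmed. The `FlagReport` and `FlagResolution` types, the `resolve_flag` helper, and the warning text are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Hypothetical types for illustration; not the real report model.


@dataclass
class FlagReport:
    flag: str
    commit_sha: str
    coverage: float


@dataclass
class FlagResolution:
    report: Optional[FlagReport]
    carried_forward: bool
    warning: Optional[str]


def resolve_flag(
    flag: str,
    upload_ok: Optional[bool],          # True: processed, False: errored, None: no upload
    new_report: Optional[FlagReport],   # report parsed from this commit's upload, if any
    parent_report: Optional[FlagReport],  # report available to carry forward, if any
) -> FlagResolution:
    """Pick the report to display for a flag and say so when it is stale."""
    if upload_ok and new_report is not None:
        return FlagResolution(new_report, carried_forward=False, warning=None)

    failed = upload_ok is False
    if parent_report is None:
        warning = f"Upload for flag '{flag}' failed to process." if failed else None
        return FlagResolution(None, carried_forward=False, warning=warning)

    # Falling back to older data: only warn when a new upload was attempted and broke.
    warning = None
    if failed:
        warning = (
            f"Upload for flag '{flag}' failed to process; showing coverage "
            f"carried forward from commit {parent_report.commit_sha[:7]}."
        )
    return FlagResolution(parent_report, carried_forward=True, warning=warning)
```

The point of returning the warning alongside the report is that the UI and the PR comment could both surface it, so the customer can tell that the numbers shown are not from the latest upload.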

Adal3n3 commented 2 months ago
kfbustam commented 2 months ago

As a result of these failures, the code highlighting could potentially be inaccurate on the GitHub Files Changed tab, right? Is it possible to surface a message about that somewhere in the GitHub UI?

Maybe instead of: Image or Image

"We haven't received the signal to send notifications which might mean reports are still being uploaded, so code highlighting might be inaccurate. Thanks for your patience"

"Not all reports were successfully processed, so code highlighting is likely inaccurate. Please feel free to reach out to a Codecov admin."

etc.

Adal3n3 commented 2 months ago

Thanks for the nice suggestion @kfbustam. I have circled it back to my team and we will investigate to see if it's possible.