sigstore / model-transparency

Supply chain security for ML
Apache License 2.0

Make coverage reporting be visible to the PR #288

Closed mihaimaruseac closed 1 month ago

mihaimaruseac commented 1 month ago

Right now, to see which lines are covered we need to go into the run output for one of the test workflows and scroll to the end, looking for the output of hatch test -c, something like:

Name                                                         Stmts   Miss  Cover   Missing
------------------------------------------------------------------------------------------
src/model_signing/hashing/hashing.py                            34      5    85%   53, 66, 72, 81, 86
src/model_signing/manifest/manifest.py                          78      1    99%   102
src/model_signing/serialization/serialization.py                 8      1    88%   49
src/model_signing/serialization/serialize_by_file.py           109      1    99%   198
src/model_signing/serialization/serialize_by_file_shard.py      80      1    99%   212
src/model_signing/signing/empty_signing.py                      39      2    95%   51, 97
src/model_signing/signing/in_toto.py                           169     68    60%   65-78, 181-190, 342-367, 485-512, 660-671, 793-806
src/model_signing/signing/signing.py                            25      5    80%   65, 78, 99, 126, 155
------------------------------------------------------------------------------------------
TOTAL                                                          773     84    89%

It would be ideal to have GitHub-formatted output show up in the PR, on the files and lines that are not covered. This would help reviewers ensure that code coverage stays high for the project.
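
For context, one possible source for that per-line data (an assumption, not something the project does today) is coverage.py's JSON report, which records the uncovered lines for each file:

```python
import json

# Hedged sketch: coverage.py can emit a machine-readable report with
# `coverage json`; each file entry lists its uncovered lines under
# "missing_lines". Whether the project wires this in is an open question.
with open("coverage.json", encoding="utf-8") as fh:
    report = json.load(fh)

for path, data in report["files"].items():
    if data["missing_lines"]:
        print(f"{path}: uncovered lines {data['missing_lines']}")
```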

youssef-itanii commented 1 month ago

Hello, I would like to take on this issue.

youssef-itanii commented 1 month ago

I noticed that the hatch test -c command does not actually include the missing lines. From PR https://github.com/sigstore/model-transparency/pull/287 it looks like this is a known issue and that PR is meant to fix it.

I guess one way to tackle this is to build on top of the changes from that PR and rebase once they land. Or I can wait for that PR to be merged so I can avoid conflicts. At the moment I am able to post the coverage as a comment, but it still needs some fixing as I am testing out different formats. That part should become clearer based on the question below:

Do you just want the table as a comment or do you want each line missing marked in the PR as a comment?

mihaimaruseac commented 1 month ago

Thank you for looking into this. Yes, right now this needs to wait for #287 to land first, though you can start developing on top of those changes.

For the output, we want something like the following (taken from another repo, using Haskell, and using lint as an example instead of coverage):

[image: lint suggestions from another repo shown as review comments on specific lines]

where each suggestion would be for a missing coverage issue (and similarly for lint, in the corresponding issue).

However, the GitHub Actions output should only be produced when running in GHA. For terminal usage we want to keep the table as above, as that is the most readable.
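
Roughly, a sketch of how the two modes could be split (GITHUB_ACTIONS=true is set by GitHub Actions in every workflow run; the rest is just an illustration, not the final design):

```python
import os

def report_uncovered(path: str, start: int, end: int) -> None:
    # GitHub Actions sets GITHUB_ACTIONS=true inside workflow runs, so the
    # same script can emit annotations in CI and plain text locally.
    if os.environ.get("GITHUB_ACTIONS") == "true":
        # Workflow command: renders as an annotation on those lines in the PR.
        print(f"::warning file={path},line={start},endLine={end}"
              "::Not covered by tests")
    else:
        print(f"{path}: lines {start}-{end} not covered by tests")
```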

youssef-itanii commented 1 month ago

Ah, I see. And how would you want it to look in the PR comments themselves? If there were a lot of missing lines, there would be a lot of comments and it would be hard to keep track of conversations in the PR (especially since the unit tests run on different Python versions and OSes).

youssef-itanii commented 1 month ago

Another thing to point out: I tried to implement it the way you requested in the image, but according to the GitHub REST API,

Creates a review comment on the diff of a specified pull request. To add a regular comment to a pull request timeline, see "Create an issue comment."

so it doesn't seem to be possible to add these review comments inside the code unless those lines are in a file that was changed by the PR.

It is possible to reference code snippets that were not part of the PR by using a permalink, which looks like the following: [image: code snippet embedded in a comment via a permalink]
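
For reference, a rough sketch of that endpoint as I understand it (the owner/repo/PR arguments and the helper itself are placeholders, not code from this work); the comment must be anchored to a path and line that appear in the PR diff, which is the limitation above:

```python
import os

import requests

# Sketch of the "create a review comment" endpoint
# (POST /repos/{owner}/{repo}/pulls/{number}/comments).
def comment_on_line(owner: str, repo: str, pr_number: int, commit_sha: str,
                    path: str, line: int, body: str) -> None:
    response = requests.post(
        f"https://api.github.com/repos/{owner}/{repo}/pulls/{pr_number}/comments",
        headers={
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
            "Accept": "application/vnd.github+json",
        },
        json={
            "body": body,
            "commit_id": commit_sha,
            "path": path,
            "line": line,
            "side": "RIGHT",
        },
        timeout=30,
    )
    response.raise_for_status()
```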

mihaimaruseac commented 1 month ago

Oh, I was thinking of using workflow commands for the output. It won't show as a comment in the PR discussion page, but as a comment on the code review page, exactly on the lines of code that matter.

This also solves the issue you mentioned above with a lot of comments on the PR. Now the comments are attached to the code and are only seen by people doing the review.

For multiple Python/OS versions, we can just select linux, py3.11 for the annotations.
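
For illustration, a rough sketch (not the final design) of how the coverage table above could be turned into workflow command annotations; the table parsing and the message text are assumptions:

```python
import re
import sys

# Match rows of the coverage table: path, Stmts, Miss, Cover, Missing.
ROW = re.compile(r"^(?P<path>\S+\.py)\s+\d+\s+\d+\s+\d+%\s+(?P<missing>.+)$")

def annotate(report_text: str) -> None:
    for row in report_text.splitlines():
        match = ROW.match(row.strip())
        if not match:
            continue
        path = match.group("path")
        for chunk in match.group("missing").split(","):
            chunk = chunk.strip()
            start, _, end = chunk.partition("-")
            # ::warning file=...,line=...,endLine=...::message shows up as an
            # annotation on those lines in the "Files changed" view.
            print(f"::warning file={path},line={start},endLine={end or start}"
                  f"::Lines {chunk} are not covered by tests")

if __name__ == "__main__":
    annotate(sys.stdin.read())
```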

youssef-itanii commented 1 month ago

Ah, I see. I am not familiar with workflow commands, so I'll have to look into that. If you are in a rush and would prefer to take this issue yourself, let me know :smile: If not, I'll start looking into it.

As for what I was working on, I was able to make it much cleaner and to update the existing coverage comment in place instead of posting a new one every time a change occurs.

I guess the only way to avoid having multiple comments due to different OSes and Python versions is to select just one, as you said, and block the rest from being posted. In that case, you would have one comment containing all the code lines that haven't been covered. Though I'd probably prefer having them in a review, since it's easier to mark them as resolved. [image]

mihaimaruseac commented 1 month ago

There's no rush on this, take your time.

The workflow commands move the reports to the review view instead of a comment on the PR, so they achieve the goal of having the reports show up exactly where needed.

youssef-itanii commented 1 month ago

Something similar to this? [image: coverage annotation shown on a source line in the review view]

mihaimaruseac commented 1 month ago

Yes! That's the one.