google / git-appraise

Distributed code review system for Git repos
Apache License 2.0
5.14k stars 146 forks source link

Using Robot Comments for code coverage #16

Open gluck opened 8 years ago

gluck commented 8 years ago

Hi,

we're evaluating the use of Robot Comments for code coverage in the context of Gerrit.

Have you ever thought about such a use ? Any recommendation on the Shipshape format to use ?

And an extra question, we're wondering about the benefit of storing the url to a (standardized) shipshape response, versus storing the coverage/comments (potentially using the same format) directly in the robot-comment itself. What was the motivation for preferring the former ? (too much data ?)

Thx !

ojarjur commented 8 years ago

I don't have an opinion on using Shipshape to store coverage information, but to answer your second question: Yes the reason we are storing the analysis results outside of the repo is to avoid unnecessarily increasing the size of the git-notes.

More specifically, analysis data is more ephemeral than code change or review history. You generally only care about the warnings in the latest commit, rather than the warnings for all commits for all time. However, if we stored these directly in notes, then when you ran "git appraise pull" the first time, then it would download every analysis warning from every commit from all time, even though you don't need that data. Since we store the data separately, we can avoid that, and you only have to fetch the analyses that you actually care about.

gluck commented 8 years ago

Thanks for the quick reply Omar !

Sorry for the (hopefully constructive) criticism, but I'm having a hard time understanding with the approach :see_no_evil: and here's why:

Please bear with me here, I'm fairly new to git notes, and trying to find the best approach for this in Gerrit.

Thanks for your help !

barbastan commented 8 years ago

Heads up: Omar is on vacation, returning Jan 4th. If he doesn't reply promptly, that's why.

Thanks! Barbara

On Mon, Dec 21, 2015 at 6:30 AM, Francois Valdy notifications@github.com wrote:

Thanks for the quick reply Omar !

Sorry for the (hopefully constructive) criticism, but I'm having a hard time understanding with the approach [image: :see_no_evil:] and here's why:

  • argument to avoid (always) pulling the data sounds fair, but it'd be also answered if you were to store robot comments in another ref than refs/notes/devtools/* (but still in notes)
  • having the URL w/o extra description makes it hard to know if you should fetch it or not (assuming you use it to store various tools/robot results, which the name suggests)
  • couldn't you have stored a summary of the tool analysis in devtools, with an optional link/ref (outside devtools) for details ? (fetch it if you want to know more, or fetch 'em all if you want to fully replicate the repo, e.g. backup/replication)

Please bear with me here, I'm fairly new to git notes, and trying to find the best approach for this in Gerrit.

Thanks for your help !

— Reply to this email directly or view it on GitHub https://github.com/google/git-appraise/issues/16#issuecomment-166317842.

ojarjur commented 8 years ago

@gluck We absolutely should have included some sort of summary field in the robot analyses; that was just an oversight on my part, and I intend to fix soon.

Basically, the robot comments message should have a "status" field similar to the continuous integration message. The only difference is that instead of the enum values being "success" or "failure", they should be something like "lgtm", "nmw", or "fyi" (the same statuses used for human comments).

As for using another notes ref for the fine grained details; there are two options you could consider:

  1. Still use a single notes ref for all commits, but have two versions: a notes ref for summaries, and one for fine grained details. This would still have the issue with that would be that you would still have to fetch the fine-grained details for all commits rather than just fetching the details for the individual commits you care about.
  2. Have a single notes ref for the summaries for all commits, but a separate notes ref for the details for each individual commit (e.g. "refs/notes/devtools/commits//analysesDetails"). This might be a good approach, but I wouldn't feel comfortable with doing that until after doing a load test and verifying that it doesn't cause issues with the major git hosting providers to have tens of thousands of notes refs.
ojarjur commented 8 years ago

FYI: I created https://github.com/google/git-appraise/issues/29 to track adding the status field to the robot comments metadata format.

gluck commented 8 years ago

Indeed I didn't think about the fact that having a single note for detailed would force clients to fetch them all-or-none.

Your second approach sounds good, although not to impact git-appraise, I assume it'd have to be outside of refs/notes/devtools right ? Using the notes refs/notes/devtools/commits/SHA1/analysesDetails will make git-appraise fetch them, am I correct ?

I'll try to gather evidence/bench against such use of git notes.

Thanks !

ojarjur commented 8 years ago

Using the notes refs/notes/devtools/commits/SHA1/analysesDetails will make git-appraise fetch them, am I correct ?

Yes, it would. I suppose putting it outside of devtools (like your proposed "refs/notes/devtoolsdetailed" prefix) would be safer, and if we do want git-appraise to automatically pull those in the future, then we could always add that to the fetchspec.

Anyway, let me know how your testing/benchmarking goes. If it doesn't cause scaling issues, then we can figure out how to add support for it to git-appraise.