topcoderinc / cab


Analysis of Last 200 Code Scorecards Filled/ Importance of Code Reviews #82

Open talesforce opened 6 years ago

talesforce commented 6 years ago

Over the last few weeks, since the introduction of the new scorecards, I have sampled data on the last 200 scorecards filled by multiple reviewers across different challenges, and I noticed the following statistics:

Code Quality

1.2.1 Does the submission follow standard coding best practices?
1.2.2 Does the submission include an appropriate amount of comments?

These sections are blank for 80% of the scorecards. Having been a reviewer for the last 5+ years, I've always emphasised code reviews, and I try to do them thoroughly for the challenges I review. Among other reviewers, I know wcheung and vvvpig do so too. But I've come across several reviewers who never even look at the code, which is appalling to say the least.

Code quality is a key metric for building a high-quality application, and it seems to be consistently ignored by several reviewers. I see this especially on iOS challenges where non-iOS developers get picked for reviews (just to ensure all reviewers have equal distribution) and all they do is run the submissions on simulators without looking at the nuances of the code/architecture.

Security & Performance

1.2.3 Has obsolete or unnecessary code been cleaned up?
1.2.4 Has reasonable consideration been given to security?
1.2.5 Has reasonable consideration been given to performance?

The scores on these questions have NOT been used for >90% of challenges. I'm not really sure what the tipping point was for getting these included in the scorecard, but the current usage stats for them are certainly not encouraging.

I'd be interested in hearing thoughts from the current CAB on this - @lsentkiewicz @ThomasKranitsas @birdofpreyru ?

Just to be clear, I'm not pointing fingers or looking to bash individuals/groups; I'm looking for constructive input that would help improve reviews and bring back the focus on code quality.

cc @dmessing @hokienick @rootelement

birdofpreyru commented 6 years ago

I believe your question already contains one of the answers: to provide adequate feedback on these questions, reviewers have to be really into the nuances of the technology/problem, and potentially spend more time on the review. Apparently, in many cases reviewers prefer to just skip these sections and focus on verifying that the solution works and on catching user-facing issues. IMHO the introduction of feedback on reviews / reviewer ranking, etc. (#52 and some other related tickets) will help mitigate this problem, but it is surely not coming any time soon.

Regarding security/performance, I believe that in the majority of our day-to-day projects it is quite irrelevant. Unless one makes a huge (probably intentional) mistake that opens a big security hole or creates a big performance issue, security and performance generally stay within reasonable limits, and enhancing them further demands a special effort (dedicated separate challenge(s)), if relevant.

I might be wrong, but in my understanding the main reason to include 1.2.4 and 1.2.5 in the scorecard lies in the domain of marketing, so that one can show the scorecard to a customer and say: yeah, security and performance are always on the list of focus areas for Topcoder projects.

lstkz commented 6 years ago

(just to ensure all reviewers have equal distribution)

Is it true that good and bad reviewers do the same number of reviews?

talesforce commented 6 years ago

I don't know how 'bad' reviewers are currently identified. I know copilots report them to support, and at least one reviewer was suspended for consistently doing poor reviews, but that was a one-off. I know a few others who've been reported, yet they continue to be on the review board.

And yes, to answer your question: since review allocation is manual, I believe it's done in a way that keeps the distribution fair by number of reviews.

I know that a reviewer certification program is in progress but I don't know when that will go live.

My intent in raising this thread is more about reviewer education and the quality of code reviews as part of challenge reviews.

mishacucicea commented 6 years ago

I think opening new reviewer positions would also help. For example, I tried applying a year ago, but entrance was closed. There are new competitors every day, and some of them turn out to be very good. I think having a system of "trial reviewers" every few months would encourage new people to raise the quality of reviews, which in turn would motivate existing reviewers to be more careful. Think of challenges that have only one submission: the probability that the submission is of average quality is high, whereas if there are 5 submissions, it's most probable you're getting good quality. This is what makes the platform great, the competition, and reviewers should feel the same as competitors (e.g. "I need to give my best").

Another potential improvement could be the strategy used to select reviewers for a challenge: one would have the main focus (not exclusively) on functionality and UI, the other on code quality, performance, and architecture. This way different issues with the submission could be highlighted, instead of both reviewers doing 80-90% the same thing.

A nice thing I've noticed about a few reviewers is that they give hints on how to fix or approach issues. I think this helps in growing both competitors and reviewers.

lstkz commented 6 years ago

@talesforce We don't have many good reviewers. In the past, there were 3 reviewers assigned, and usually it was 2 good reviewers + 1 average one. Currently, participating in Topcoder challenges is a nightmare: if the challenge is simple and the "equal distribution manager" picks two bad reviewers, then the review results can be random.

Another problem is that there is no feedback. Reviewers don't know whether they did a good or a bad review. I wrote about this problem 2 years ago, and I keep writing about the same things all the time. The simplest way is to just write short feedback (good or bad) and send it to them through the copilot. It can be just a friendly discussion without any scoring or rating system.

I remember I did that in the CSFV contests. The copilot asked me to provide feedback about the reviewers, and it looked like this:

I tried to do the same in the drones series, but all my emails were ignored 😉

@birdofpreyru The problem with the reviewer rating (#52) is that you must wait 5-8 months to see the results. After 8 months, there will be another "idea" of how to improve it, and you must wait another 8 months. Many smart people have tried to create such a system in the past.

@mishacucicea The problem with new reviewers is that they don't do good reviews at the beginning, and it can affect the review results. Also, we have only 2 reviewers, not 3.