src-d / style-analyzer

Lookout Style Analyzer: fixing code formatting and typos during code reviews
GNU Affero General Public License v3.0
32 stars 21 forks source link

Clean quality report repositories CSV file #716

Open m09 opened 5 years ago

m09 commented 5 years ago

cf further discussion in this issue (that happened before this edition, cf original issue text): Currently, we list the repositories that we use in the quality report in a CSV file that has 3 columns: url, to and from. Only url and from are used. We should therefore remove the to column and rename from to commit_hash.

Original issue text Excerpt from `quality_report_repos.csv`: url,to,from https://github.com/reduxjs/redux,902484ed735d38aec06683c847810a7218d8dba2,b307091af4d7e846d9e5f080fb81caf4a8b4aab1 In that repo, one of the files we make predictions for (and actually get errors from) is `examples/todos-flow/src/__tests__/reducers/visibilityFilter.test.js`. It is present in the `from` revision: ``` $ git show b307091af4d7e846d9e5f080fb81caf4a8b4aab1:examples/todos-flow/src/__tests__/reducers/visibilityFilter.test.js // @flow import { setVisibilityFilter } from '../../actions/visibilityFilter'; import visibilityFilter from '../../reducers/visibilityFilter'; describe('visibilityFilter', () => { it('should handle initial state', () => { expect(visibilityFilter(undefined, { type: '@@INIT' })).toEqual('SHOW_ALL'); }); it('should handle SET_VISIBILITY_FILTER', () => { expect( visibilityFilter(undefined, setVisibilityFilter('SHOW_ACTIVE')) ).toEqual('SHOW_ACTIVE'); }); }); ``` but not in the `to` revision: ``` $ git show 902484ed735d38aec06683c847810a7218d8dba2:examples/todos-flow/src/__tests__/reducers/visibilityFilter.test.js fatal: Path 'examples/todos-flow/src/__tests__/reducers/visibilityFilter.test.js' does not exist in '902484ed735d38aec06683c847810a7218d8dba2' ``` Therefore we should not make predictions for it.
zurk commented 5 years ago

Did you get it from make report-quality?

We do not use to column from the related csv file. Only to make lookout work. So It uses from column for train and analyze by design.

m09 commented 5 years ago

Thanks for the precisions :)

To quote the comment I made on slack:

We should document that we don't use the to column (and probably remove it from csv if possible)

zurk commented 5 years ago

Good point. We should try. Last time I think about removal it was tricky because it is required to feed two unequal commits to lookout-sdk review call.

m09 commented 5 years ago

The from commit and HEAD should work if I remember correctly. from should also probably be renamed to something like rev since it'll be the only one we'll be using.