src-d / style-analyzer

Lookout Style Analyzer: fixing code formatting and typos during code reviews
GNU Affero General Public License v3.0
32 stars 21 forks source link

Add annotations to FeatureExtractor #761

Closed zurk closed 5 years ago

zurk commented 5 years ago

This PR adds Annotation tool to use instead of VirtualNode class (issue https://github.com/src-d/style-analyzer/issues/521). PR contains:

I also run a regression test to see if there any difference in old vnodes and new (restored) vnodes for quality report. I did not include the test to the PR. It can be found here: https://github.com/zurk/style-analyzer/commit/3aed7d1a7edf7e36e3155d372ceeae7928fcc85c I see some improvements like there were a couple of bugs related to column position or mixed intention and the rest is the same. Also, the quality was almost the same for the first 15 repos from the report.

The comparison with v0.2.0 report

# Train Report comparison

|               repo |      precision |         recall |    full_recall |             f1 |        full_f1 |           ppcr |          support |     full_support |   Rules Number |   Average Rule Len |
|-------------------:|---------------:|---------------:|---------------:|---------------:|---------------:|---------------:|-----------------:|-----------------:|---------------:|-------------------:|
|          telescope | 1.000 (+0.000) | 1.000 (+0.000) | 0.210 (+0.000) | 1.000 (+0.000) | 0.348 (+0.000) | 0.210 (+0.000) |    203 (     +0) |    965 (     +0) |       2 (  +0) |         2.0 (+0.0) |
|              carlo | 0.934 (-0.043) | 0.934 (-0.043) | 0.623 (+0.053) | 0.934 (-0.043) | 0.747 (+0.027) | 0.667 (+0.083) |   5398 (   +670) |   8095 (     -3) |      34 ( +25) |         5.9 (+0.2) |
|          reveal.js | 0.965 (-0.010) | 0.965 (-0.010) | 0.655 (-0.014) | 0.965 (-0.010) | 0.781 (-0.012) | 0.679 (-0.006) |  10068 (    -97) |  14829 (     +0) |      19 (  -2) |         9.1 (-0.7) |
|   create-react-app | 0.977 (+0.000) | 0.977 (+0.000) | 0.474 (-0.001) | 0.977 (+0.000) | 0.639 (+0.000) | 0.485 (-0.002) |  11230 (    -28) |  23140 (     +0) |      14 (  +0) |         6.4 (+0.0) |
|              axios | 0.980 (+0.000) | 0.980 (+0.000) | 0.760 (+0.000) | 0.980 (+0.000) | 0.856 (+0.000) | 0.776 (+0.000) |  21335 (     +0) |  27482 (     +0) |      14 (  +0) |         7.4 (+0.0) |
|              redux | 0.972 (+0.000) | 0.972 (+0.000) | 0.571 (-0.003) | 0.972 (+0.000) | 0.719 (-0.003) | 0.588 (-0.002) |  23450 (   -104) |  39893 (     +0) |      17 (  +0) |         6.6 (-0.1) |
|              citgm | 0.962 (+0.000) | 0.962 (+0.000) | 0.825 (+0.000) | 0.962 (+0.000) | 0.888 (+0.000) | 0.857 (+0.000) |  20523 (     +0) |  23944 (     +0) |      59 (  +0) |         7.0 (+0.0) |
|          evergreen | 0.974 (+0.001) | 0.974 (+0.001) | 0.827 (+0.000) | 0.974 (+0.001) | 0.894 (+0.000) | 0.849 (-0.001) |  44342 (    -53) |  52233 (     +0) |     197 ( -20) |        11.0 (+0.0) |
| 30-seconds-of-code | 0.948 (+0.000) | 0.948 (+0.000) | 0.889 (+0.000) | 0.948 (+0.000) | 0.918 (+0.000) | 0.938 (+0.000) |  64537 (    -23) |  68832 (     +1) |      48 (  -3) |         7.9 (-0.2) |
|            express | 0.977 (+0.000) | 0.977 (+0.000) | 0.854 (+0.000) | 0.977 (+0.000) | 0.911 (+0.000) | 0.874 (+0.000) |  75137 (     +0) |  85947 (     +0) |      27 (  +0) |         7.8 (+0.0) |
|       freeCodeCamp | 0.960 (+0.000) | 0.960 (+0.000) | 0.828 (+0.000) | 0.960 (+0.000) | 0.889 (+0.000) | 0.863 (+0.000) | 105318 (     +0) | 122044 (     +0) |     153 (  -3) |        12.5 (+0.0) |
|          storybook | 0.973 (+0.000) | 0.973 (+0.000) | 0.832 (+0.000) | 0.973 (+0.000) | 0.897 (+0.000) | 0.855 (+0.000) | 163452 (     +4) | 191183 (     +0) |      57 (  +0) |         9.5 (-0.1) |
|             jquery | 0.977 (-0.002) | 0.977 (-0.002) | 0.885 (+0.006) | 0.977 (-0.002) | 0.929 (+0.003) | 0.907 (+0.009) | 202535 (  +1987) | 223414 (    +30) |     175 (+142) |        12.2 (+2.5) |
|            core-js | 0.985 (+0.000) | 0.985 (+0.000) | 0.959 (-0.001) | 0.985 (+0.000) | 0.972 (+0.000) | 0.974 (-0.001) | 302419 (   -305) | 310633 (     +0) |      66 (  -1) |         8.8 (+0.0) |
|               atom | 0.988 (+0.017) | 0.988 (+0.017) | 0.765 (-0.149) | 0.988 (+0.017) | 0.863 (-0.079) | 0.775 (-0.166) | 460543 ( +34428) | 594570 (+141858) |      62 (-326) |         9.2 (-1.7) | 

# Test Report comparison
|               repo |      precision |         recall |    full_recall |             f1 |        full_f1 |           ppcr |         support |    full_support |   Rules Number |   Average Rule Len |
|-------------------:|---------------:|---------------:|---------------:|---------------:|---------------:|---------------:|----------------:|----------------:|---------------:|-------------------:|
|          telescope | 0.826 (+0.000) | 0.826 (+0.000) | 0.457 (+0.000) | 0.826 (+0.000) | 0.588 (+0.000) | 0.553 (+0.000) |   172 (     +0) |   311 (     +0) |       2 (  +0) |         2.0 (+0.0) |
|              carlo | 0.883 (-0.064) | 0.883 (-0.064) | 0.756 (-0.005) | 0.883 (-0.064) | 0.814 (-0.030) | 0.856 (+0.052) |  1705 (   +104) |  1992 (     +0) |      34 ( +25) |         5.9 (+0.2) |
|          reveal.js | 0.949 (-0.005) | 0.949 (-0.005) | 0.774 (-0.004) | 0.949 (-0.005) | 0.853 (-0.004) | 0.816 (+0.001) |  8813 (    +10) | 10799 (     +0) |      19 (  -2) |         9.1 (-0.7) |
|   create-react-app | 0.912 (-0.002) | 0.912 (-0.002) | 0.746 (+0.002) | 0.912 (-0.002) | 0.821 (+0.001) | 0.818 (+0.003) |  3519 (    +14) |  4303 (     +0) |      14 (  +0) |         6.4 (+0.0) |
|              axios | 0.968 (+0.000) | 0.968 (+0.000) | 0.876 (+0.000) | 0.968 (+0.000) | 0.920 (+0.000) | 0.905 (+0.000) |  5166 (     +0) |  5707 (     +0) |      14 (  +0) |         7.4 (+0.0) |
|              redux | 0.924 (-0.001) | 0.924 (-0.001) | 0.828 (-0.012) | 0.924 (-0.001) | 0.874 (-0.006) | 0.896 (-0.012) |  4892 (    -64) |  5461 (     +0) |      17 (  +0) |         6.6 (-0.1) |
|              citgm | 0.921 (+0.000) | 0.921 (+0.000) | 0.910 (+0.000) | 0.921 (+0.000) | 0.915 (+0.000) | 0.988 (+0.000) |  5047 (     +0) |  5107 (     +0) |      59 (  +0) |         7.0 (+0.0) |
|          evergreen | 0.926 (+0.001) | 0.926 (+0.001) | 0.864 (+0.000) | 0.926 (+0.001) | 0.894 (+0.001) | 0.932 (-0.002) | 19934 (    -44) | 21381 (     +0) |     197 ( -20) |        11.0 (+0.0) |
| 30-seconds-of-code | 0.973 (+0.000) | 0.973 (+0.000) | 0.973 (+0.000) | 0.973 (+0.000) | 0.973 (+0.000) | 1.000 (+0.000) | 11493 (     +0) | 11493 (     +0) |      48 (  -3) |         7.9 (-0.2) |
|            express | 0.958 (+0.000) | 0.958 (+0.000) | 0.890 (+0.000) | 0.958 (+0.000) | 0.922 (+0.000) | 0.929 (+0.000) | 14453 (     +0) | 15557 (     +0) |      27 (  +0) |         7.8 (+0.0) |
|       freeCodeCamp | 0.929 (+0.000) | 0.929 (+0.000) | 0.894 (+0.000) | 0.929 (+0.000) | 0.911 (+0.000) | 0.962 (+0.000) | 23738 (     +0) | 24670 (     +0) |     153 (  -3) |        12.5 (+0.0) |
|          storybook | 0.957 (+0.000) | 0.957 (+0.000) | 0.899 (+0.000) | 0.957 (+0.000) | 0.927 (+0.000) | 0.939 (+0.000) | 33969 (     -1) | 36185 (     +0) |      57 (  +0) |         9.5 (-0.1) |
|             jquery | 0.969 (-0.008) | 0.969 (-0.008) | 0.933 (+0.002) | 0.969 (-0.008) | 0.950 (-0.003) | 0.963 (+0.010) | 46390 (   +478) | 48173 (     +0) |     175 (+142) |        12.2 (+2.5) |
|            core-js | 0.982 (+0.000) | 0.982 (+0.000) | 0.974 (-0.001) | 0.982 (+0.000) | 0.978 (-0.001) | 0.992 (-0.001) | 65767 (    -69) | 66293 (     +0) |      66 (  -1) |         8.8 (+0.0) |
|               atom | 0.961 (+0.012) | 0.961 (+0.012) | 0.937 (-0.007) | 0.961 (+0.012) | 0.949 (+0.002) | 0.975 (-0.019) | 88392 ( +70634) | 90636 ( +72776) |      62 (-326) |         9.2 (-1.7) |

I am going to run a full report for the final version after we merge.

zurk commented 5 years ago

@vmarkovtsev ping

zurk commented 5 years ago

Travis failed because of modelforge issues that should be fixed in https://github.com/src-d/modelforge/pull/98

zurk commented 5 years ago

@vmarkovtsev all good now?

zurk commented 5 years ago

About https://github.com/src-d/style-analyzer/pull/761#discussion_r286489833 I add a comment. if you read both docstrings you should see the difference.

https://github.com/src-d/style-analyzer/pull/761/commits/d8793b49593a0640e1335070a9b265ff2e1cdbe0#diff-39b3b19d4276f053126096271b399f7fR862

vmarkovtsev commented 5 years ago

@zurk

The docstring should explain how it is different from file_to_old_parse_file_format: the input and output types are exactly the same.

I do not see the magic keywords: "This function is different from file_to_old_parse_file_format because...". I do not want to diff two docstrings, that's too hard for me :)

zurk commented 5 years ago

ok, I add it. tell me if there is something more.

If not, let me squash all review commits and rebase on current master before the merge.

zurk commented 5 years ago

Finally! 🎉

zurk commented 5 years ago

Now I need to build a new report for the current master.