errata-ai / Microsoft

A Vale-compatible implementation of the Microsoft Writing Style Guide.
https://github.com/errata-ai/vale
MIT License
85 stars 46 forks source link

Some rules don't seem to work as expected #10

Closed felicitymay closed 5 years ago

felicitymay commented 5 years ago

My test file has some content that I'm expecting to trigger the following rules:

I'm not sure if I've simply misunderstood the behavior of the rules, or if there are some bugs in the implementation. You can see my test file and results in this gist. The gist also lists the additional results that I expected to see.

If you're able to give me any help working out why my test file isn't triggering these rules that would be great.

jdkato commented 5 years ago

Here's what I get using your test file:

 test.md
 3:4    suggestion  'Bad Acronyms Heading'          Microsoft.Headings        
                    should use sentence-style                                 
                    capitalization.                                           
 5:23   suggestion  'SEO' has no definition.        Microsoft.Acronyms        
 5:67   error       Use "they're" instead of "they  Microsoft.Contractions    
                    are".                                                     
 6:27   suggestion  'SEO' has no definition.        Microsoft.Acronyms        
 8:4    suggestion  'another bad heading for        Microsoft.Headings        
                    back-end tests' should use                                
                    sentence-style capitalization.                            
 10:24  warning     Use 'back-end' instead of       Microsoft.Backend         
                    'back end'                                                
 10:50  warning     Use 'back-end' instead of       Microsoft.Backend         
                    'back end'                                                
 11:30  warning     Use 'back end' instead of       Microsoft.Backend         
                    'back-end'                                                
 15:55  suggestion  'ISO' has no definition.        Microsoft.Acronyms        
 15:67  suggestion  'YYYY' has no definition.       Microsoft.Acronyms        
 16:1   error       Use "don't" instead of "Do      Microsoft.Contractions    
                    not".                                                     
 16:14  error       Always spell out the name of    Microsoft.DateOrder       
                    the month.                                                
 18:4   warning     Avoid using acronyms in a       Microsoft.HeadingAcronyms 
                    title or heading.                                         
 21:14  suggestion  'SSG' has no definition.        Microsoft.Acronyms        
 22:1   error       Punctuation should be inside    Microsoft.Quotes          
                    the quotes.                                               
 22:59  suggestion  'being quoted' looks like       Microsoft.Passive         
                    passive voice.                                            
 33:50  warning     For a general audience, use     Microsoft.GeneralURL      
                    'address' rather than 'URL'.                              
 34:14  suggestion  'COBOL' has no definition.      Microsoft.Acronyms        

✖ 4 errors, 5 warnings and 9 suggestions in 1 file.

Do you have MinAlertLevel = suggestion in your .vale.ini file?

felicitymay commented 5 years ago

Ah - what a rookie error. I had it set to warning... Many thanks. That's almost exactly what I was expecting now.

The only thing still doesn't seem quite right is I think that line 10:24 is correct - "the back end" and so it shouldn't get a warning (but the other two warnings are exactly what I expected to see).

Thanks for your quick response to my misconfiguration.

jdkato commented 5 years ago

The Backend.yml rule is actually working correctly; the problem is that both instances of back end at line 10 are being tagged as back/JJ end/NN. Here's an example:

Screen Shot 2019-04-19 at 10 13 17 PM

I'd say that this most likely is due to the sentence being somewhat of a contrived edge case.

Ultimately, though, the underlying tagging implementation is "only" around 90% accurate. So, the usefulness of this rule really depends on how you view false positives: I'm personally fine with suggestion- and warning-level alerts needing to be manually reviewed, but I know some people feel otherwise.

felicitymay commented 5 years ago

Thanks for the explanation and your test file. It looks as if it works really well in practice and that my test file is just not realistic for the text that we want to test.

I really appreciate your quick responses and patience. I think we should close this issue as clearly the only problems were with my understanding.