This changes the t-074 test to look for two consecutive single characters surrounded by dashes. As detailed in our discussion on the other PR, matching a single character between dashes produces more false positives than the two-character version misses (e.g. this change lets us get rid of five ignore files that had to be created today).
I also got rid of the length limit, on the presumption that a match containing at least two such characters is pretty likely to be a valid hit wherever it's found. (A run on the corpus turned up no false positives.)
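For illustration only, here is a minimal sketch of the difference between the two approaches. The patterns, names and sample strings below are assumptions made up for this example; they are not the actual t-074 regex and do not include its exclusions.

```python
import re

# Hypothetical patterns for illustration; NOT the real t-074 regex.
ONE_CHAR = re.compile(r"-\w-")      # one single character between dashes
TWO_CHARS = re.compile(r"-\w-\w-")  # two consecutive single characters between dashes

# Made-up sample strings, not taken from the corpus.
samples = ["w-o-r-d", "O-w-w-w-w", "vis-à-vis", "mother-in-law"]


def hits(pattern, texts):
    """Return the strings in which the pattern is found at least once."""
    return [t for t in texts if pattern.search(t)]


print(hits(ONE_CHAR, samples))   # also flags "vis-à-vis"
print(hits(TWO_CHARS, samples))  # only the spelled-out strings
```

In this sketch the one-character pattern flags an ordinary hyphenated phrase, while the two-character pattern only flags the spelled-out strings. The cost is that a short sequence such as `x-y-z` is no longer caught.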
(I had already done all this before I tried to pull the PR and saw you had made further changes to the test. Since I had covered more exclusions and had valid entries as well, I went ahead and left my test as it was. It covers everything yours did and more.)
I think a search for a "word" consisting entirely of single characters and dashes is still something lint can try to catch, but I need to do some more testing before I propose anything. One of the problems is that a hit could be either an unitalicized sound (O-w-w-w-w) or a spelled-out word (w-o-r-d) without grapheme/phoneme tags on the letters, so the message will need to be worded loosely enough to cover both cases.
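As a rough starting point for that testing, such a search might look something like the sketch below. The pattern and the function name are assumptions for illustration, not a proposal for the final check, and it operates on plain text rather than on markup.

```python
import re

# Hypothetical pattern: a token made up only of single characters joined by dashes.
# The lookarounds keep it from matching inside a longer hyphenated word.
SPELLED_OUT = re.compile(r"(?<![\w-])\w(?:-\w)+(?![\w-])")


def find_spelled_out(text):
    """Return every token that consists entirely of single characters and dashes."""
    return SPELLED_OUT.findall(text)


text = "He yelled O-w-w-w-w and spelled w-o-r-d on an x-ray of his mother-in-law."
print(find_spelled_out(text))
```

Note that the pattern alone cannot tell a sound from a spelled-out word, which is why the message would have to cover both. It would also flag short tokens such as `x-y`, so a minimum number of dashes may be worth testing.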