Open eernstg opened 1 year ago
Some examples of tests with specialized characters are given here: https://github.com/dart-lang/co19/issues/2242#issuecomment-1700621605.
@athomas, WDYT? Do we already have a mechanism which would ensure that certain (encoding sensitive) tests aren't modified significantly by accident? (Some editor or some revision management system might well try to "clean up" whitespace and similar elements of a source file.) Otherwise, would it be worth the trouble to do it?
If the file is binary
in .gitattributes
, git shouldn't touch it at least. I don't know about IDEs. As long as the test fails if cleaned up that should provide enough signal?
But it wouldn't necessarily fail. For instance, if we wish to ensure that the parser will not crash when it encounters a library whose file representation uses non-standard line endings (for example, some lines ending in \r\n
and some lines ending in \n
) and it is changed to a standard \n
encoding then it surely won't fail after the "normalization".
Maybe the easiest would be to have a second test that reads the file and checks that the bytes match an expected pattern? It wouldn't work on analyzer/cfe configurations but the VM would run the second test and fail if the bytes are mismatching.
I'm a bit reluctant to add a special case to the infrastructure for something we'll rarely need.
The syntax of Dart is specified in such a way that lines can end in
\n
,\r
, or\r\n
, which implies that any sequence of\n
and\r
in any order is accepted as whitespace and henceforth ignored (except that the existence of whitespace is used in one situation to disambiguate an otherwise ambiguous term; but even that doesn't depend on which kind of whitespace it is).We have a few tests whose purpose is to verify that the tools do indeed accept line endings in source code as specified.
In addition, we also have some tests whose purpose is to test the treatment of string literals (containing UTF-8 text).
With respect to line endings, the tests can be destroyed by running
dos2unix
or a utility like that, because they won't test anything of interest after being converted. A manual inspection may very well fail to see the difference, because line endings (of any kind) are normally not shown. The situation is similar for other tests containing specialized characters.Would it be worth the effort to take a
sha1sum
of such tests, store those sha1 values somewhere (perhapsfoo_A01_t01.dart
could have its sha1 stored infoo_A01_t01.sha1
), and run a presubmit script that checks the sha1 if the corresponding library has been modified?The approach could also make sense for non-source-code artifacts (images etc.) which are part of a test.