Relax whitespace sensitivity in Filecheck

bytecodealliance / cranelift

Cranelift code generator

https://cranelift.readthedocs.io/

2.49k stars 201 forks source link

Relax whitespace sensitivity in Filecheck #54

Closed angusholder closed 6 years ago

angusholder commented 7 years ago

Filecheck is currently very stringent in matching whitespace of tests exactly, which leads to undesirable situations of having to add a regex match to pickup excess spaces between identifiers.

For example return v0,v1 isn't valid and must be return v0, v1 which is subtle and goes against intuition.

Another situation is that the code that stringifies IR can insert many spaces in unexpected places, such that when round-tripped through the a parse and restringify, the following:

[-,-] v2 = iadd v0, v1

becomes

[-]                     v2 = iadd v0, v1

meaning that a test to match it must look like the following:

; regex: WS=[ \t]*
; nextln: [-]$WS $v2 = iadd $v0, $v1

We should examine where (if anywhere) whitespace sensitivity is suitable, and remove it every else so that writing these common test cases has less friction.

stoklund commented 7 years ago

Thanks.

I think it would be ok to let filecheck treat all white space as the same, so essentially translate a sequence of spaces and tabs in an input pattern into the regex [ \t]+ when matching.

I think it is harder to treat v1,v2 and v1, v2 as equivalent since filecheck is just a text matcher that doesn't understand the syntax of the language it is matching on. For example, x++y and x+ +y are different things in C.

Also, we shouldn't allow random vertical white space in matches. This would confuse the sameln and nextln directives. Stick to spaces and tabs.

zummenix commented 7 years ago

If we're speaking about https://github.com/stoklund/cretonne/blob/22ad3c0bf8e8f0f64470e471c19f1b8c0ec2c799/filetests/parser/instruction_encoding.cton we can rewrite it without whitespace regex:

; sameln: function foo(i32, i32) {
; nextln: $ebb1($v0: i32, $v1: i32):
; nextln:     [-,-]
; sameln:                           $v2 = iadd $v0, $v1
; nextln:     [-]
; sameln:                           trap
; nextln:     [R#1234,%x5,%x11]
; sameln:                           $v6, $v7 = iadd_cout $v2, $v0
; nextln:     [Rshamt#beef,%x25]
; sameln:                           $v8 = ishl_imm $v6, 2
; nextln:     [-,-]
; sameln:                           $v9 = iadd $v8, $v7
; nextln:     [Iret#05]
; sameln:                           return $v0, $v8
; nextln: }

In my opinion it looks better and in general shows what is going on with formatting.

sunfishcode commented 6 years ago

Filecheck lives in a separate repo now, so I opened https://github.com/Cretonne/filecheck/issues/2 to continue to track this issue.