Open lazy opened 11 months ago
Oh my goodness, that's awful. Thanks for the report.
This is a display issue: the diffing logic finds changes, but since Python uses indentation for blocks it's not finding anything to display in this case.
$ DFT_LOG=trace difft a.py b.py
2023-10-25T15:51:54.861Z INFO difft::options > CLI arguments: ["a.py", "b.py"]
2023-10-25T15:51:54.875Z INFO difft::diff::dijkstra > LHS nodes: 16 (1 toplevel), RHS nodes: 16 (1 toplevel)
2023-10-25T15:51:54.875Z INFO difft::diff::dijkstra > Saw 58 vertices (a Vertex is 80 bytes), arena consumed 7424 bytes, with 33 vertices left on heap.
2023-10-25T15:51:54.875Z DEBUG difft::diff::dijkstra > Found a path of 9 with cost 709.
2023-10-25T15:51:54.875Z DEBUG difft::diff::dijkstra > Initial 5 items on path: [
"line:2 ... line:2 ... --- 100 EnterUnchangedDelimiter { depth_difference: 0 }",
"line:2 if line:2 ... --- 300 EnterNovelDelimiterRHS",
The display logic is assuming that you can't have a structural change without having some novel tokens, which isn't true here.
Every token has a matching token on the other side, which is interpreted as no changes. See hunks::sorted_novel_positions
.
E.g. here's how delimiters are handled in the analogous JS.
Should probably mark the first node in the changed subtree as novel when this occurs.
The Python standard library tokenizer module (which you can test with python3 -mtokenize program.py
) does produce INDENT
and DEDENT
tokens; however the dedent tokens it produces have always zero-width lexed content, yes, but the tokenizer marks them found at the new indent depth; however the lexed text for indent token contains the total whitespace from column 1 to the beginning of indented block.
Would you like a separate issue filed for the corresponding error in F#, which is also whitespace-sensitive? I couldn't find one from a quick search, but would like to note the following test case (translated from the Python original).
let test a b =
if a then
printfn "%b" a
if b then
printfn "%b" b
and
let test a b =
if a then
printfn "%b" a
if b then
printfn "%b" b
displays the same "No syntactic changes" (Difftastic 0.59.0 (built with rustc 1.78.0)
).
Thanks for the sample code 🙂
No, fixing this will affect all languages that have significant whitespace, it doesn't need a separate issue.
I verified via
--dump-syntax
that ASTs are significantly different. Can compare them usingpr -m -t -w 160 <(difft --dump-syntax a.py) <(difft --dump-syntax b.py)
Issue duplicated as text:
Using the latest x64 build from releases page in github.