Wilfred / difftastic

a structural diff that understands syntax 🟥🟩
https://difftastic.wilfred.me.uk/
MIT License
21.18k stars 347 forks

Language Detection based only on the second file causes inconvenience in certain workflows #783

Closed (zzh1996 closed 1 week ago)

zzh1996 commented 1 week ago

Description

Currently, Difftastic uses only the second file to determine the programming language for the diff (the Language Detection feature). This is inconvenient when the second file's contents are piped in dynamically or the file has no extension. Here's a typical example:

$ difft 1.c <(pbpaste)

In the above case, since the second file comes from pbpaste (through process substitution), language detection doesn't recognize it as C code, even though both files contain the same content. This causes the diff to fall back to a plain text comparison, completely bypassing the syntax tree-based diff.

For example, when comparing the same content between 1.c (a C file) and a file 2 with no extension, the language detection leads to inconsistent behavior depending on the order of the files:

$ difft 1.c 2
2 --- Text
No changes.

$ difft 2 1.c
1.c --- C
No changes.

The same occurs with arbitrary file names or content piped dynamically.

Steps to Reproduce

  1. Create a C file (1.c) with any C code content.
  2. Compare it with another file (2) that contains the same content but lacks an extension, or compare it with dynamic input, e.g., via process substitution (<(pbpaste)).

Expected behavior: The language should be inferred from the file having a recognizable extension.
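
To make the expected behavior concrete, here is a minimal sketch in Python. It is not Difftastic's actual implementation (Difftastic is written in Rust, and its real detection is not shown in this issue). The extension table and the function names `language_from_extension`, `detect_second_only`, and `detect_with_fallback` are hypothetical, chosen only to contrast the two behaviors.

```python
from pathlib import PurePath
from typing import Optional

# Hypothetical, deliberately tiny extension table (an assumption for this
# sketch); a real tool would know far more languages.
EXTENSION_TO_LANGUAGE = {".c": "C", ".h": "C", ".py": "Python", ".rs": "Rust"}


def language_from_extension(path: str) -> Optional[str]:
    """Return a language name if the path has a known extension, else None."""
    return EXTENSION_TO_LANGUAGE.get(PurePath(path).suffix.lower())


def detect_second_only(lhs: str, rhs: str) -> str:
    """Behavior reported in this issue: only the second path is consulted."""
    return language_from_extension(rhs) or "Text"


def detect_with_fallback(lhs: str, rhs: str) -> str:
    """Requested behavior: try the second path, then fall back to the first."""
    return (
        language_from_extension(rhs)
        or language_from_extension(lhs)
        or "Text"
    )


if __name__ == "__main__":
    # /dev/fd/63 stands in for the kind of path process substitution produces.
    for lhs, rhs in [("1.c", "2"), ("2", "1.c"), ("1.c", "/dev/fd/63")]:
        print(lhs, rhs, detect_second_only(lhs, rhs), detect_with_fallback(lhs, rhs))
```

With the fallback, the result no longer depends on argument order: a pair is treated as plain text only when neither path has a recognizable extension.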

Additional Information

$ difft --version
Difftastic 0.61.0
Toolchain: 1.82.0
System:    macos aarch64

Wilfred commented 1 week ago

Thanks for the report :)