afnanenayet / diffsitter

A tree-sitter based AST difftool to get meaningful semantic diffs
MIT License
1.65k stars 28 forks source link

Options to only show functional changes in diffs #349

Open dgunay opened 2 years ago

dgunay commented 2 years ago

Is your feature request related to a problem? Please describe.

I noticed that diffsitter returns 0 when the two files have the same AST, which is awesome. However, it still produces output if there are nonfunctional changes. Example programs:

package main

import "fmt"

func main() {
    a := "string"
    fmt.Println(a)
}
package main

import "fmt"

// comments in the file
func main() {
    b := "string"
    fmt.Println(b)
}

Produces:

diffsitter a/main.go b/main.go                                                                                                                      main  
a/main.go -> b/main.go
======================

4:
--
+ // comments in the file

6 - 7:
------
+       b := "string"
+       fmt.Println(b)

5 - 6:
------
-       a := "string"
-       fmt.Println(a)

In the event that you have something like a mass refactor which you expect not to have functional changes, if one ends up showing up you are not easily able to find it.

Describe the solution you'd like

I would like it if there were a setting which, when enabled, would make diffsitter print only the changes which meaningfully alter the AST.

Describe alternatives you've considered

I thought about writing my own but it would only support Go. This project seems like a great base to bring this to many languages.

Additional context

I am not sure but I would suspect the various different semantics between languages may make this a somewhat complicated request (I don't know off the top of my head but I suspect, software being the way it is, that there may exist a language or framework where merely changing a variable name has functional impacts).

dgunay commented 2 years ago

err actually I guess the 101 status code I got was actually because I managed to make diffsitter panic:

package main

import "fmt"

func main() {
    a := "string"
    fmt.Println(a)
}
package main

import "fmt"

// comments in the file
func main() {
    b := "string"
    fmt.Println(b)
    c := "hi"
    fmt.Println(c)
}

caused:

thread 'main' panicked at 'byte index 16 is out of bounds of `  fmt.Println(b)`', src/formatting.rs:403:43
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
afnanenayet commented 2 years ago

Thanks for the bug report! And I'll think about more intelligent modes for diffsitter - we could possibly do something like only checking for diffs between tokens of a certain type. It's not super straightforward to me right now because sometimes changing an identifier can actually change the program.

As an example:

package main

import "fmt"

// comments in the file
func main() {
    b := "string"
    fmt.Println(b)
    c := "hi"
    fmt.Println(c)
}

to

package main

import "fmt"

// comments in the file
func main() {
    c := "string"
    fmt.Println(b)
    c := "hi"
    fmt.Println(c)
}

is a meaningful change, so simply filtering based on a token type wouldn't work here.

afnanenayet commented 2 years ago

@dgunay which version of diffsitter are you using and what platform are you on? I tried replicating your crash with this test: and it seems like it's getting a different result and not panicking: https://github.com/afnanenayet/diffsitter/pull/351

dgunay commented 2 years ago
$ diffsitter -h
diffsitter 0.7.0
Afnan Enayet <afnan@afnan.io>
An AST based difftool for meaningful diffs
OS: Arch Linux x86_64 
Kernel: 5.17.8-arch1-1 
Uptime: 4 hours, 13 mins 
Packages: 1868 (pacman), 10 (flatpak) 
Shell: fish 3.4.1 
Resolution: 1920x1080 
DE: GNOME 42.1 
WM: Mutter 
WM Theme: Nordic 
Theme: Nordic [GTK2/3] 
Icons: Adwaita [GTK2/3] 
Terminal: vscode 
CPU: AMD Ryzen 9 3900X (24) @ 3.800GHz 
GPU: AMD ATI Radeon RX 6800/6800 XT / 6900 XT 
Memory: 10483MiB / 32069MiB

I installed it from diffsitter-bin on the AUR using yay:

$ yay -Q diffsitter-bin
diffsitter-bin 0.7.1-1
afnanenayet commented 2 years ago

Hmm it seems like the binary is showing 0.7.0 but the package is for 0.7.1? I don't think it should make a difference in any case.

Could you do me a favor and run that with the backtrace environment variable turned on and with the --debug flag so I could see the full log