Closed tstromberg closed 3 days ago
I think this comes down to the added/removed logic. The first file will be stored in the map with the key ls
and we'll use that to do the lookup for ls.x86_64
and it won't match so it will be treated as an add/remove rather than a modify.
Using the diff example above:
relPath: ls
fromMap: map[ls.x86_64:0xc0001a4140]
Here's where we store the relative path in the map: https://github.com/chainguard-dev/malcontent/blob/29fb875cbc017a6955a5001f8acb4d48141110f7/pkg/action/diff.go#L60
The lookup in processSrc
:
https://github.com/chainguard-dev/malcontent/blob/624befeed38fe9981ac3fbd8d9b0bb2bcaa13cae/pkg/action/diff.go#L128-L132
The lookup in processDest
:
https://github.com/chainguard-dev/malcontent/blob/29fb875cbc017a6955a5001f8acb4d48141110f7/pkg/action/diff.go#L126-L130
If I store the value of rel
with any extension trimmed, this works as described above:
$ go run cmd/mal/mal.go diff ../malcontent-samples/linux/clean/ls.x86_64 ../malcontent-samples/macOS/clean/ls
├─ 🔵 Changed: ../malcontent-samples/macOS/clean/ls [MEDIUM → LOW]
│ X data [LOW → NONE]
--- 🔵 compression/lzma — works with lzma files
│ X discovery [LOW → NONE]
--- 🔵 system/hostname — get computer host name
│ ≡ execution [LOW]
│ 🔵 shell/TERM — Look up or override terminal settings: TERM
│ ≡ filesystem [LOW]
│ 🔵 link_read — read value of a symbolic link: readlink
+++ 🔵 directory/traverse — traverse filesystem hierarchy: _fts_children, _fts_close, _fts_open, _fts_read, _fts_set
│ X networking [LOW → NONE]
--- 🔵 url/embedded — contains embedded HTTPS URLs
│ X process [MEDIUM → NONE]
--- 🟡 name_set — get or set the current process name
│
Do we want to go this route? Two files of different types with the same name (sans extension) would show odd behavior drift but maybe we want that? Conversely, two files with different names may be the same underlying file and the diff would show as deleted/added when it would be more appropriate to show as changed.
FWIW, like diff(1), the file name shouldn't matter at all if two filenames are being passed into it. Filename logic only happens when directories are passed in.
I mentioned relative paths in the initial bug report, but this happens for absolute paths too:
go run ./cmd/mal/ diff ../malcontent-samples/linux/clean/ls.x86_64 /bin/ls
FWIW, like diff(1), the file name shouldn't matter at all if two filenames are being passed into it. Filename logic only happens when directories are passed in.
Ah okay, this clarifies things. I'll hack on that tomorrow.
Addressed via #628.
I think we still have our logic a bit off if you are passing two filenames as arguments:
similar to the
diff(1)
command, this should show up as a single changed file.