Open RobHoman opened 2 years ago
My intuition is that the bug is in the example, not the code. WDYT?
Perhaps I'm missing something, but as it stands I think I stand by my initial intuition.
I'll attempt to elaborate in hopes that this clarifies my interpretation of functionality.
Here is an example program:
package main
import (
"fmt"
"log"
"path/filepath"
"strings"
"github.com/peterbourgon/diskv/v3"
)
func main() {
const (
bbdiskvExtension = ".bbdiskv"
)
kvs := diskv.New(
diskv.Options{
BasePath: "base",
AdvancedTransform: func(key string) *diskv.PathKey {
path := strings.Split(key, "/")
last := len(path) - 1
retval := &diskv.PathKey{
Path: path[:last],
FileName: path[last] + bbdiskvExtension,
}
fmt.Printf("DEBG Converting Key '%s'\n", key)
fmt.Printf("DEBG Returning PathKey '%#v'\n", retval)
return retval
},
In package fmt m: func(pathKey *diskv.PathKey) (key string) {
fmt.Printf("DEBG Inversing PathKey %#v\n", pathKey)
extension := filepath.Ext(pathKey.FileName)
if extension != bbdiskvExtension {
fmt.Printf("DEBG - does not have extension: %#v\n", pathKey)
return ""
}
retval := strings.Join(pathKey.Path, "/") + "/" + pathKey.FileName[:len(pathKey.FileName)-len(bbdiskvExtension)]
fmt.Printf("DEBG Returning string: %#v\n", retval)
return retval
},
CacheSizeMax: 1024 * 1024,
},
)
type writeArgs struct {
Key string
Value string
}
writes := []writeArgs{
{"a/b", "val-for-a/b"},
}
fmt.Println("Writing keys: ")
for _, write := range writes {
fmt.Printf("- write key '%s' with value '%s'\n", write.Key, write.Value)
err := kvs.Write(write.Key, []byte(write.Value))
if err != nil {
log.Fatalf("failed to write: %v", err)
}
}
fmt.Println("Listing keys: ")
keyC := kvs.Keys(nil)
for key := range keyC {
fmt.Printf("- key '%s'\n", key)
}
}
Here is the output:
Writing keys:
- write key 'a/b' with value 'val-for-a/b'
DEBG Converting Key 'a/b'
DEBG Returning PathKey '&diskv.PathKey{Path:[]string{"a"}, FileName:"b.bbdiskv", originalKey:""}'
Listing keys:
DEBG Inversing PathKey &diskv.PathKey{Path:[]string{}, FileName:".", originalKey:""}
DEBG - does not have extension: &diskv.PathKey{Path:[]string{}, FileName:".", originalKey:""}
DEBG Inversing PathKey &diskv.PathKey{Path:[]string{}, FileName:"a", originalKey:""}
DEBG - does not have extension: &diskv.PathKey{Path:[]string{}, FileName:"a", originalKey:""}
DEBG Inversing PathKey &diskv.PathKey{Path:[]string{"a"}, FileName:"b.bbdiskv", originalKey:""}
DEBG Returning string: "a/b"
- key 'a/b'
By my understanding, the InverseTransform
is being called back to analyze the root directory "."
and also the directory a/
. But these are not legitimate outputs of the AdvancedTransform
function in the first place, and so there is no well defined answer that InverseTransform
can give. Furthermore, I don't see why the InverseTransform
callback should even be consulted by the walker
function if the decision in the following line is to skip the item because info.IsDir()
is true.
Another way of looking at it is that it seems throughout the code that there is an intended invariant that directories on the underlying disk never represent an actual key in the store, and I think that invariant implies that the InverseTransform
callback should not be asked to ever look at such non-keys.
Okay, I see what you mean.
I think the underlying problem here is an error in the original API design, and that design error's interactions with the code in the repo and the example in the README. One aspect/view on that design error is that anything in diskv uses panic to express error conditions, which is 100% wrong, and 100% my mistake. Another way to describe it would be that the various funcs, including InverseTransform, don't return errors. They should. Or, at least, their return values should be able to express an "invalid" state.
I agree that your proposed change would be an improvement. I would also like to see the example in the README updated to not panic. I'm not sure if that would require a change to the function signature.
Hey @peterbourgon - loving this library and thank you for your stewardship!
Would love to have you take a look at the following lines: https://github.com/peterbourgon/diskv/blob/2566386005f64f58f34e1ff32907800a64537e6a/diskv.go#L597-L601
By my reading, it would be preferable not to call
InverseTransform
for directories. In my case, I'm using theAdvancedTransform
and its inverse in the README, and this is tickling the "panic" line because directories do not carry the expected extension.What about switching the logic to this? In other words, pass on the directory BEFORE calling
InverseTransform
.