Open ValarDragon opened 6 years ago
@ValarDragon thank you for filing this issue! This thought once struck me about 2.5 months ago but I forgot to file an issue. For anyone looking at this issue here is a repro to show that go-amino is still susceptible to cyclic references
package main
import (
"fmt"
"log"
"github.com/tendermint/go-amino"
)
type Node struct {
Name string `json:"name"`
Data []byte `json:"data"`
Left *Node `json:"left"`
Right *Node `json:"right"`
}
func main() {
f := &Node{Name: "here"}
f.Left = f
cdc := amino.NewCodec()
cdc.RegisterConcrete(new(Node), "node", nil)
blob, err := cdc.MarshalJSON(f)
if err != nil {
log.Fatalf("Failed to marshal: %v", err)
}
fmt.Printf("blob: %s\n", blob)
}
this will blow up the stack as it would with the stdlib https://play.golang.org/p/CTd76035y3y or inlined below:
package main
import (
"encoding/json"
"fmt"
"log"
)
type Node struct {
Name string `json:"name"`
Data []byte `json:"data"`
Left *Node `json:"left"`
Right *Node `json:"right"`
}
func main() {
f := &Node{Name: "here"}
f.Left = f
blob, err := json.Marshal(f)
if err != nil {
log.Fatalf("Failed to marshal: %v", err)
}
fmt.Printf("blob: %s\n", blob)
}
One of the things that we can do is to have say a mode called safe
in which we have a map of already traversed/seen data and if we re-encounter a reference, just look up its serialized form. Perhaps this problem boils down to serializing a graph with cycles in it :) Another mitigation would be to make an analyzer that traverses datastructures looking for infinite cycles before even attempting to serialize anything. There are traditionally well known methods for detecting cycles that we can use.
Obviously this affects all data structures with references to "same datatype" elements e.g. trees, singly linked lists, block chains. In the mean-time, it'd be nice to audit tree usages to ensure that somehow we aren't "accidentally" self referencing.
I once solved the issue for my needs, and I then published the result as a library. Maybe it can be used in this case: https://github.com/go-extras/tahwil
Its a normal problem with JSON that it can't handle cyclic data structures. In the case of golang's MarshalJSON, it will go into an infinite recursion loop. Source https://golang.org/pkg/encoding/json/.
I didn't see any mention of this in our source code. I don't think going into an infinite loop is an acceptable situation for our use case. (Otherwise we would have to track down any potential reference cycles in all downstream usecases, and ensure that an adversary can never produce one)
I think it would be a good idea for us to wrap our current unmarshal JSON, and do one of two things: set a maximum recursion depth and error when that is exceeded, or start using a cycle detection algorithm after a certain recursion depth (say 10), such as https://en.wikipedia.org/wiki/Cycle_detection#Floyd's_Tortoise_and_Hare. (I imagine maximum recursion depth would be easier / more light computation wise, though I think the latter is cooler)