celestiaorg / rsmt2d

Go implementation of two dimensional Reed-Solomon merkle tree data availability scheme.
Apache License 2.0
162 stars 81 forks source link

Decode returns an error when failed to reconstruct #200

Open rootulp opened 1 year ago

rootulp commented 1 year ago

Context

If I understand correctly the Codec interface is meant to serve as a boundary between rsmt2d logic and different erasure coding libraries.

Problem

The leopard codec returns an error when it fails to reconstruct all shards

https://github.com/celestiaorg/rsmt2d/blob/25576208f90d3d171e5cbf7407704c1d61ae6bf1/leopard.go#L52-L53

This leads to janky error handling: https://github.com/celestiaorg/rsmt2d/blob/25576208f90d3d171e5cbf7407704c1d61ae6bf1/extendeddatacrossword.go#L252-L257

because reconstruction is expected to fail while solving the extended data crossword iteratively. Since the error that is propagated is defined in klauspost/reedsolomon, in order to check if the error is of type ErrTooFewShards, the extendeddatacrossword.go has to import an error type directly from klauspost/reedsolomon. Such an import would leak implementation details that should ideally be contained to the codec interface in codecs.go and the implementation of that interface in leopard.go

Proposal

Option A

Define a new error in codecs.go like rsmt2d.ErrTooFewShards. In leopard.go if a reedsolomon.ErrTooFewShards is observed, return a new rsmt2d.ErrTooFewShards instead.

    // Decode attempts to reconstruct the missing shards in data. The `data`
    // parameter should contain all original + parity shards where missing shards
    // should be `nil`. If reconstruction is successful, the original + parity
    // shards are returned. If reconstruction is unsuccessful, an error is returned.
    Decode(data [][]byte) ([][]byte, error)

Option B

In leopard.go if a reedsolomon.ErrTooFewShards is observed don't propagate it. Instead, return all the shards with missing shards still set to nil.

    // Decode attempts to reconstruct the missing shards in data. The `data`
    // parameter should contain all original + parity shards where missing shards
    // should be `nil`. If reconstruction is successful, the original + parity
    // shards are returned. If reconstruction is unsuccessful, the parameter data 
        // is returned unmodified.
    Decode(data [][]byte) ([][]byte, error)
rootulp commented 1 year ago

Leaving this issue open b/c I think the error returned by Decode leaks implementation details from klauspost/reedsolomon.