dgraph-io / dgraph

The high-performance database for modern applications
https://dgraph.io

[FEATURE]: Introduce Reed-Solomon erasure code algorithm in Dgraph #8784

Closed MichelDiz closed 1 month ago

MichelDiz commented 1 year ago

Use case

Reed-Solomon erasure coding is a powerful error-correction technique that can improve data reliability and recovery in distributed systems. By introducing it in Dgraph, we can recover data that would otherwise be lost or corrupted by node failures, network issues, bit flips, burst errors, disk errors, or power loss, as long as the damage stays within the configured redundancy.

The trade-off is a higher disk space requirement, because parity data has to be stored alongside the original data.
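To make the recovery property concrete, here is a minimal sketch of a Reed-Solomon style erasure code in Go. It is an illustration only, under loud assumptions: it works on small integers in the prime field GF(257) and uses Lagrange interpolation, whereas production implementations typically operate on bytes in GF(2^8) with matrix arithmetic. The function names (`encode`, `reconstruct`, `interpolate`) are hypothetical and are not part of Dgraph or Badger.

```go
package main

import (
	"errors"
	"fmt"
)

// prime is the field size. Assumption for illustration only:
// real implementations usually work in GF(2^8), not GF(257).
const prime = 257

// norm maps any int into the range [0, prime).
func norm(a int) int { return ((a % prime) + prime) % prime }

// modPow computes base^exp mod prime by square-and-multiply.
func modPow(base, exp int) int {
	result := 1
	base = norm(base)
	for exp > 0 {
		if exp&1 == 1 {
			result = result * base % prime
		}
		base = base * base % prime
		exp >>= 1
	}
	return result
}

// modInv returns the multiplicative inverse via Fermat's little theorem.
func modInv(a int) int { return modPow(a, prime-2) }

// interpolate evaluates, at x, the unique polynomial of degree < len(xs)
// that passes through the points (xs[i], ys[i]) modulo prime.
func interpolate(xs, ys []int, x int) int {
	sum := 0
	for i := range xs {
		num, den := 1, 1
		for j := range xs {
			if j == i {
				continue
			}
			num = num * norm(x-xs[j]) % prime
			den = den * norm(xs[i]-xs[j]) % prime
		}
		sum = (sum + ys[i]*num%prime*modInv(den)) % prime
	}
	return sum
}

// encode treats data[i] as the polynomial's value at x = i and appends
// `parity` extra evaluations at x = k, k+1, ... as parity shards.
func encode(data []int, parity int) []int {
	k := len(data)
	xs := make([]int, k)
	for i := range xs {
		xs[i] = i
	}
	shards := append([]int{}, data...)
	for x := k; x < k+parity; x++ {
		shards = append(shards, interpolate(xs, data, x))
	}
	return shards
}

// reconstruct rebuilds the k data values from any k surviving shards.
func reconstruct(shards []int, present []bool, k int) ([]int, error) {
	var xs, ys []int
	for i, ok := range present {
		if ok && len(xs) < k {
			xs = append(xs, i)
			ys = append(ys, shards[i])
		}
	}
	if len(xs) < k {
		return nil, errors.New("too many shards lost to reconstruct")
	}
	data := make([]int, k)
	for i := range data {
		data[i] = interpolate(xs, ys, i)
	}
	return data, nil
}

func main() {
	data := []int{68, 103, 114, 97} // 4 data shards
	shards := encode(data, 2)       // plus 2 parity shards

	// Simulate losing two shards (indices 1 and 3).
	present := []bool{true, false, true, false, true, true}
	recovered, err := reconstruct(shards, present, len(data))
	fmt.Println(recovered, err) // prints: [68 103 114 97] <nil>
}
```

With 4 data shards and 2 parity shards, any 2 shards can be lost and the data is still recoverable, at the cost of storing 6 shards instead of 4. Losing a third shard makes `reconstruct` return an error.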

Links to Discuss, RFC or previous Issues and PRs

No response

Links to examples and research

Several open-source projects use Reed-Solomon erasure coding, such as Ceph and Hadoop. It is also a well-known and widely used algorithm in the data storage and recovery industry.

For Windows: https://github.com/Yutaka-Sawada/MultiPar

More about the algorithm: https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction

A paper on the algorithm: http://web.eecs.utk.edu/~jplank/plank/papers/CS-05-569.pdf

Reed Solomon Encoding - Computerphile: https://www.youtube.com/watch?v=fBRMaEAFLE0

Current state

Dgraph currently does not support Reed-Solomon erasure code, and there is no easy workaround to achieve the same level of reliability and recovery without it.

Solution proposal

Introduce Reed-Solomon erasure coding as a built-in feature in Dgraph. This could be done by implementing the algorithm in the storage layer (Badger) and providing options that let users configure the level of redundancy and recovery they require.
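As a rough sketch of what such user-facing options could look like, the Go snippet below models a redundancy setting as a count of data shards and parity shards. Everything here is hypothetical: `ErasureOptions`, its fields, and its methods do not exist in Badger or Dgraph, and the 256-shard ceiling is an assumption borrowed from typical byte-oriented (GF(2^8)) Reed-Solomon implementations.

```go
package main

import (
	"errors"
	"fmt"
)

// ErasureOptions is a HYPOTHETICAL configuration type for illustration.
// It is not part of Badger or Dgraph.
type ErasureOptions struct {
	DataShards   int // number of shards the original data is split into
	ParityShards int // number of extra shards stored for recovery
}

// Validate rejects settings that cannot form a usable code.
func (o ErasureOptions) Validate() error {
	if o.DataShards < 1 {
		return errors.New("need at least one data shard")
	}
	if o.ParityShards < 1 {
		return errors.New("need at least one parity shard")
	}
	// Assumption: a byte-oriented code supports at most 256 shards in total.
	if o.DataShards+o.ParityShards > 256 {
		return errors.New("data + parity shards must not exceed 256")
	}
	return nil
}

// Tolerates returns how many shards may be lost while data stays recoverable.
func (o ErasureOptions) Tolerates() int { return o.ParityShards }

// Overhead returns the storage factor relative to storing the data once.
func (o ErasureOptions) Overhead() float64 {
	return float64(o.DataShards+o.ParityShards) / float64(o.DataShards)
}

func main() {
	opts := ErasureOptions{DataShards: 10, ParityShards: 4}
	if err := opts.Validate(); err != nil {
		fmt.Println("invalid options:", err)
		return
	}
	fmt.Printf("tolerates %d lost shards, storage overhead %.2fx\n",
		opts.Tolerates(), opts.Overhead())
	// prints: tolerates 4 lost shards, storage overhead 1.40x
}
```

Expressing redundancy as two integers keeps the trade-off visible: raising `ParityShards` increases the number of losses that can be survived and raises the disk space factor in step.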

Benefits:

Additional Information

No response

github-actions[bot] commented 1 month ago

This issue has been stale for 60 days and will be closed automatically in 7 days. Comment to keep it open.