GenSync is a framework for efficiently synchronizing similar data across multiple devices.
The framework provides a shared library for benchmarking and optimizing a variety of state-of-the-art data synchronization protocols, either offline or directly embedded within application code. In one typical use case, an application uses GenSync for its core data synchronization needs, and its developers compare and tune different synchronization protocols to suit those needs. Alternatively, users can use the GenSync library to profile synchronization usage in their application, and then experiment with synchronization protocols offline to improve performance.
The current implementation of the framework includes four families of data synchronization protocols (and their variants).
Suppose that laptop A has a list of contacts:
Alice, Jim, Jane, Rick, Bob
and cellphone B has the contacts:
Alice, Jim, Suzie, Jane, Rick
Then an efficient synchronization protocol might quickly identify the differences (Bob on the laptop, and Suzie on the cellphone) and exchange only these contacts, rather than sending the entire contact list from one device to another.
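To make the goal of the example concrete, the following minimal Python sketch computes which contacts each device is missing. It is an illustration only and does not use the GenSync API: the function name `naive_difference` is hypothetical, and the sketch assumes both lists are available in one place, which is exactly the full transfer that an efficient synchronization protocol is designed to avoid.

```python
def naive_difference(local, remote):
    """Return (only_local, only_remote) for two collections of items.

    Hypothetical helper for illustration; it needs both full collections,
    unlike a bandwidth-efficient synchronization protocol.
    """
    local_set, remote_set = set(local), set(remote)
    return local_set - remote_set, remote_set - local_set


laptop_a = ["Alice", "Jim", "Jane", "Rick", "Bob"]
cellphone_b = ["Alice", "Jim", "Suzie", "Jane", "Rick"]

only_on_laptop, only_on_cellphone = naive_difference(laptop_a, cellphone_b)
print(sorted(only_on_laptop))     # ['Bob']
print(sorted(only_on_cellphone))  # ['Suzie']

# Exchanging just these two contacts leaves both devices with the same set.
synced_a = set(laptop_a) | only_on_cellphone
synced_b = set(cellphone_b) | only_on_laptop
assert synced_a == synced_b
```

Only two contacts need to cross the network here, even though each device stores five. The protocols in the framework aim to find that difference with communication closer to the size of the difference than to the size of the lists.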
The source code for this library is divided among several repositories.
Each of these protocols is implemented as a peer-to-peer protocol. For purposes of explanation, one peer is called a client and the other a server.
If you use this software, please cite at least the following paper:
@article{bovskov2022gensync,
title={Gensync: A new framework for benchmarking and optimizing reconciliation of data},
author={Bo{\v{s}}kov, Novak and Trachtenberg, Ari and Starobinski, David},
journal={IEEE Transactions on Network and Service Management},
volume={19},
number={4},
pages={4408--4423},
year={2022},
publisher={IEEE}
}
The following works are also significant to this software implementation:
[MTZ03] Y. Minsky, A. Trachtenberg, and R. Zippel, "Set Reconciliation with Nearly Optimal Communication Complexity", IEEE Transactions on Information Theory, 49:9. http://ipsit.bu.edu/documents/ieee-it3-web.pdf
[MT02] Y. Minsky and A. Trachtenberg, "Scalable set reconciliation", 40th Annual Allerton Conference on Communication, Control, and Computing, 2002. http://ipsit.bu.edu/documents/BUTR2002-01.pdf
[DTA03] D. Starobinski, A. Trachtenberg and S. Agarwal, "Efficient PDA synchronization", IEEE Transactions on Mobile Computing 2:1, pp. 40-51 (2003). http://ipsit.bu.edu/documents/efficient_pda_web.pdf
[SCT06] S. Agarwal, V. Chauhan and A. Trachtenberg, "Bandwidth efficient string reconciliation using puzzles", IEEE Transactions on Parallel and Distributed Systems 17:11, pp. 1217-1225 (2006). http://ipsit.bu.edu/documents/puzzles_journal.pdf
[KLT03] M.G. Karpovsky, L.B. Levitin, and A. Trachtenberg, "Data verification and reconciliation with generalized error-control codes", IEEE Transactions on Information Theory 49:7, pp. 1788-1793 (2003).
[EGUV11] D. Eppstein, M.T. Goodrich, F. Uyeda, and G. Varghese. "What's the difference?: efficient set reconciliation without prior context." ACM SIGCOMM Computer Communication Review 41.4 (2011): 218-229.
[GM11] M.T. Goodrich and M. Mitzenmacher. "Invertible bloom lookup tables." 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2011.
[MM18] M. Mitzenmacher and T. Morgan. "Reconciling graphs and sets of sets." Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. ACM, 2018.
More at https://people.bu.edu/trachten.
Elements of the GenSync project code have been worked on, at various points, by: