systemd / casync

Content-Addressable Data Synchronization Tool

Next step for Content Defined Chunking #49

Open erthink opened 7 years ago

erthink commented 7 years ago

Hi. In 2012 I also developed a segmentation method. The basic idea is to combine multiple predicate functions over the moving window in order to obtain better-sized segments. There is some math behind all this, but it is rather simple and boring.
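
For readers unfamiliar with the idea, here is a minimal Python sketch of what combining more than one predicate over a moving window could look like. It is not the method from the slides. It is closer to the well-known two-divisor scheme (a strict cut test plus a looser fallback test), and the gear-style rolling hash, the names `find_cut` and `chunk`, and all sizes and bit counts are assumptions chosen only for illustration.

```python
import random

# Hypothetical gear table: one pseudo-random 32-bit value per byte value.
# The fixed seed only makes the sketch reproducible.
_rng = random.Random(2012)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def find_cut(data, start, min_size, max_size, bits):
    """Return the end offset of the chunk that begins at `start`."""
    end = min(start + max_size, len(data))
    h = 0
    fallback = None
    for i in range(start, end):
        # Gear-style rolling hash: after 32 steps a byte has been shifted
        # out entirely, so h depends only on the last 32 bytes.
        h = ((h << 1) + GEAR[data[i]]) & 0xFFFFFFFF
        if i + 1 - start < min_size:
            continue
        # Predicate 1 (strict): the top `bits` bits of h are all zero.
        if h >> (32 - bits) == 0:
            return i + 1
        # Predicate 2 (lax): only the top `bits - 2` bits are zero.
        # Remember the latest such position as a fallback cut point.
        if h >> (32 - (bits - 2)) == 0:
            fallback = i + 1
    # Use the fallback only when the size limit was actually reached;
    # otherwise we simply ran out of data and return the tail as is.
    if end - start == max_size and fallback is not None:
        return fallback
    return end


def chunk(data, min_size=256, max_size=4096, bits=10):
    """Split `data` into chunks and return the list of end offsets."""
    assert 1 <= min_size <= max_size and 3 <= bits <= 32
    cuts = []
    start = 0
    while start < len(data):
        start = find_cut(data, start, min_size, max_size, bits)
        cuts.append(start)
    return cuts
```

Calling `chunk(some_bytes)` returns the chunk end offsets. The lax predicate only matters when the strict one finds nothing before `max_size`, which replaces a forced cut at an arbitrary position with a content-defined one and so tends to narrow the spread of chunk sizes.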

Later, the intention was to patent some applications of this approach, but I refused, so it is now completely free of patents.

Below are links to the slides, which are unfortunately only in Russian. However, there are enough graphs and formulas to understand the idea.

Let me know if you are interested, and I will translate the materials into English.

The result: https://image.slidesharecdn.com/random-161106145116/95/-18-638.jpg

The entire presentation: https://www.slideshare.net/leoyuriev/ss-68259503

UPDATE: A partial translation by Google: https://translate.google.com/translate?act=url&ie=UTF8&sl=ru&tl=en&u=http://www.highload.ru/2016/abstracts/2263.html

Regards.

aep commented 7 years ago

@leo-yuriev Interesting. Is there an abstract or other summary of the method in English?

ivanbaldo commented 6 years ago

But how does it compare to the method currently implemented by casync? How much better is it? Thanks, by the way!