moinakg / pcompress

A Parallelized Data Deduplication and Compression utility
http://moinakg.github.com/pcompress/
GNU Lesser General Public License v3.0

Deadlock in Deduplication code due to missed semaphore signaling. #15

Closed moinakg closed 10 years ago

moinakg commented 10 years ago

First reported by Matt Mahoney on the Encode.ru forum. See:

http://encode.ru/threads/1639-pcompress-a-deduplication-compression-utility?p=36167&viewfull=1#post36167
http://encode.ru/threads/1639-pcompress-a-deduplication-compression-utility?p=36179&viewfull=1#post36179

The following code has a problem:

    if (*size < ctx->rabin_poly_avg_block_size)
        return (0);

This is an early exit from deduplication processing when the buffer holds less than one dedup block's worth of data. With Global Dedupe, index access is serialized by sequential semaphore signaling: each thread has to signal the next thread's semaphore to allow it to access the index. If that signal is missed because of an early exit, as in the above code, the next thread waits indefinitely and the other threads subsequently get blocked at various points.

So the fix is:

    if (*size < ctx->rabin_poly_avg_block_size) {
        /*
         * Must ensure that we are signaling the index semaphores before skipping
         * in order to maintain proper sequencing and avoid deadlocks.
         */
        if (ctx->arc) {
            sem_wait(ctx->index_sem);
            sem_post(ctx->index_sem_next);
        }
        return (0);
    }
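To illustrate the sequencing in isolation, here is a minimal standalone sketch of the same hand-off pattern using POSIX threads and unnamed semaphores. It is not pcompress code: `demo_ctx_t`, `demo_worker`, `demo_run`, and the buffer sizes are hypothetical names and values chosen for the example, and it assumes Global Dedupe is always on (so there is no `ctx->arc` check). Each worker waits on its own semaphore, touches a shared "index" log, then posts the next worker's semaphore. A worker with an undersized buffer skips the index but still takes and passes on its turn.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

#define DEMO_MAX_THREADS 16

/* Hypothetical stand-in for the per-thread dedupe context (not pcompress's struct). */
typedef struct {
    sem_t *index_sem;      /* posted when it is this thread's turn */
    sem_t *index_sem_next; /* the next thread's turn semaphore */
    size_t size;           /* size of this thread's buffer */
    size_t min_size;       /* plays the role of rabin_poly_avg_block_size */
    int id;
    int *order;            /* shared log of which threads touched the "index" */
    int *order_len;
} demo_ctx_t;

static void *demo_worker(void *arg)
{
    demo_ctx_t *ctx = arg;

    if (ctx->size < ctx->min_size) {
        /* Early exit: still take our turn and hand it on, or the chain stalls. */
        sem_wait(ctx->index_sem);
        sem_post(ctx->index_sem_next);
        return NULL;
    }

    sem_wait(ctx->index_sem);                  /* wait for our turn */
    ctx->order[(*ctx->order_len)++] = ctx->id; /* serialized "index access" */
    sem_post(ctx->index_sem_next);             /* let the next thread in */
    return NULL;
}

/*
 * Run nthreads workers chained in a ring. The worker whose id equals short_id
 * (use -1 for none) gets an undersized buffer. Returns the number of index
 * accesses and records the accessing thread ids in order[].
 */
int demo_run(int nthreads, int short_id, int *order)
{
    sem_t sems[DEMO_MAX_THREADS];
    pthread_t tids[DEMO_MAX_THREADS];
    demo_ctx_t ctxs[DEMO_MAX_THREADS];
    int order_len = 0;
    int i, rc = 0;

    if (nthreads < 1 || nthreads > DEMO_MAX_THREADS)
        return -1;

    for (i = 0; i < nthreads; i++) {
        rc = sem_init(&sems[i], 0, 0); /* every semaphore starts locked */
        assert(rc == 0);
    }
    for (i = 0; i < nthreads; i++) {
        ctxs[i].index_sem = &sems[i];
        ctxs[i].index_sem_next = &sems[(i + 1) % nthreads];
        ctxs[i].size = (i == short_id) ? 16 : 8192; /* arbitrary demo sizes */
        ctxs[i].min_size = 4096;
        ctxs[i].id = i;
        ctxs[i].order = order;
        ctxs[i].order_len = &order_len;
        rc = pthread_create(&tids[i], NULL, demo_worker, &ctxs[i]);
        assert(rc == 0);
    }

    sem_post(&sems[0]); /* give the first worker its turn to start the chain */

    for (i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    for (i = 0; i < nthreads; i++)
        sem_destroy(&sems[i]);

    (void)rc; /* silence unused-variable warnings when NDEBUG is defined */
    return order_len;
}
```

Calling `demo_run(4, 2, order)` lets workers 0, 1, and 3 access the index in that order while worker 2 only forwards its turn. If the `sem_wait`/`sem_post` pair were removed from the early-exit branch, worker 3 would block forever on its semaphore and `pthread_join` would never return, which is the deadlock described above. Build with `-pthread`; unnamed semaphores via `sem_init` are assumed to be available (as on Linux).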