fogfish / esq

simple persistent queues for Erlang
Apache License 2.0

Why is there a need to store a hash for each message? #6

Closed silviucpp closed 6 years ago

silviucpp commented 7 years ago

Hello @fogfish,

Looking at the code, I see that you store a hash for each message when writing to the file and check that it matches when reading.

I understand that you are checking data integrity, but what concern justifies this overhead?

Silviu

fogfish commented 7 years ago

Hello,

Yes, this is an integrity check in case the file is corrupted outside of the BEAM process (e.g. by the file system). It simply protects the queue from incorrect messages.
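
To illustrate the kind of check described here, the following is a minimal sketch in Python. It is not esq's Erlang code and does not reproduce its on-disk format. It assumes a hypothetical record layout (4-byte length, 4-byte CRC32, then the payload), and the names `encode_record` and `decode_record` are made up for this example.

```python
import struct
import zlib

# Hypothetical record layout (an assumption, not esq's real format):
#   4-byte big-endian payload length | 4-byte big-endian CRC32 | payload
HEADER = struct.Struct(">II")


def encode_record(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC32 checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def decode_record(record: bytes) -> bytes:
    """Return the payload, or raise ValueError if the record is damaged."""
    if len(record) < HEADER.size:
        raise ValueError("truncated header")
    length, checksum = HEADER.unpack_from(record)
    payload = record[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    # Recompute the checksum on read and compare it with the stored one.
    if zlib.crc32(payload) != checksum:
        raise ValueError("checksum mismatch")
    return payload
```

In this sketch the per-message cost is 8 extra bytes on disk plus one checksum pass on write and one on read. A payload byte altered after the record was written makes `decode_record` raise instead of returning a wrong message.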

silviucpp commented 6 years ago

Hello, can we make this optional? For most use cases this is a pretty big overhead.

Silviu

fogfish commented 6 years ago

Can you please elaborate on this? Are you referring to a "theoretical" overhead, or are you talking about a real use case?

I am using (and have used) this queue in systems that handle large volumes of messages (millions per hour). Indeed, you need to provide additional disk space, but this has not been an overhead.

Please advise!

fogfish commented 6 years ago

@silviucpp Do you have any feedback on this matter?

silviucpp commented 6 years ago

I was talking about "theoretical" overhead only.

fogfish commented 6 years ago

I've benchmarked the queue implementation once again because of this issue. It is able to perform about 70K writes per second on a laptop with an SSD. The I/O number is lower on AWS EBS, at around 20-30K writes per second, but the uid is not a bottleneck whatsoever. As I've explained to you, it is required for the in-flight queue feature and I do not want to fragment the API.
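
One way to turn a "theoretical" overhead into a number is to time the checksum on its own. The sketch below is a Python illustration, not a benchmark of esq. It assumes CRC32 as the hash and a 1 KiB message, both chosen arbitrarily, and `checksums_per_second` is a hypothetical helper.

```python
import time
import zlib


def checksums_per_second(message_size: int = 1024, rounds: int = 100_000) -> float:
    """Time `rounds` CRC32 computations over a message of `message_size` bytes."""
    message = bytes(message_size)  # zero-filled payload of the assumed size
    start = time.perf_counter()
    for _ in range(rounds):
        zlib.crc32(message)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return rounds / elapsed


if __name__ == "__main__":
    print(f"{checksums_per_second():,.0f} checksums per second")
```

If the measured rate is well above the write rates quoted above, the checksum is unlikely to be the limiting factor, and disk I/O is the more probable bottleneck.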