square / tape

A lightning fast, transactional, file-based FIFO for Android and Java.
https://square.github.io/tape/
Apache License 2.0

Batched remove() - Using remove() less frequently than peek() #189

Open roytmana opened 6 years ago

roytmana commented 6 years ago

Hi,

Sorry for posting a question here, but I am not sure where else to ask. Is it safe to call remove() less frequently than peek() (a batched remove for every n peek operations)? I am using your queue to implement a persistent blocking queue (my take() operation blocks until the queue has data in it), and I do not need to remove() every taken (peeked) element immediately, because I can tolerate replaying some processed-but-unremoved items during recovery from a failure. So for performance reasons I would prefer to remove items not immediately but every, say, 100 items. My queue processing happens as soon as a new element shows up, so it will be very rare to have more than one or two elements in the queue unless the consumer cannot process them in a timely fashion or at all. In other words, I currently end up with one remove() for practically every peek().

I have not measured what impact batched removes would have, but given the "rwd" file mode, every operation is costly. On my fast machine with an SSD drive I was only able to push about 1,500 add() operations per second with tiny byte[20] payloads.

Your answer, and your opinion on the usefulness of batched removes, would be very much appreciated. Thank you, Alex

roytmana commented 6 years ago

I retract my question. I did not realize that peek()/iteration does not keep its own position in the queue; it always starts from the head, which is only advanced by remove(). So it looks like in a multi-threaded environment there is no way to read from the queue, process the data, and remove it on success except by holding the lock for the entire period from the start of iteration through the end of processing and removal. If the consumer gets stuck for some reason, producers will not be able to put anything on the queue until the consumer releases the lock.

I guess in the case of a single consumer and multiple producers I could still lock, peek, unlock, process, then lock and remove, assuming that the one consumer is the only thread performing any remove operations.

JakeWharton commented 6 years ago

I wouldn't hold the lock during processing. I would lock to retrieve items, process the items, and then lock to remove the items. So long as you have only a single consumer this won't cause concurrency issues as producers are only able to append.
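The discipline described above can be sketched as follows. A plain ArrayDeque and ReentrantLock stand in for Tape's QueueFile and its synchronization, so the class and method names here are illustrative assumptions, not Tape's API; the point is that the lock is held only to snapshot a batch and later to advance the head, never during processing:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Lock-to-peek, process unlocked, lock-to-remove pattern for a
// single consumer and multiple producers.
public class SingleConsumerQueue {
  private final ArrayDeque<String> queue = new ArrayDeque<>();
  private final ReentrantLock lock = new ReentrantLock();

  // Called by any number of producer threads.
  public void add(String item) {
    lock.lock();
    try { queue.addLast(item); } finally { lock.unlock(); }
  }

  // Called by the single consumer: copy up to max items from the head.
  public List<String> peekBatch(int max) {
    lock.lock();
    try {
      List<String> batch = new ArrayList<>();
      for (String s : queue) {
        if (batch.size() == max) break;
        batch.add(s);
      }
      return batch;
    } finally { lock.unlock(); }
  }

  // Called by the consumer after processing succeeds; advances the head.
  public void removeBatch(int n) {
    lock.lock();
    try { for (int i = 0; i < n; i++) queue.pollFirst(); } finally { lock.unlock(); }
  }

  public static void main(String[] args) {
    SingleConsumerQueue q = new SingleConsumerQueue();
    q.add("a"); q.add("b"); q.add("c");
    List<String> batch = q.peekBatch(2);  // lock, snapshot, unlock
    // ... process batch here with no lock held; producers may add() freely ...
    q.removeBatch(batch.size());          // lock again only to advance the head
    System.out.println(batch);
  }
}
```

This only stays safe because producers append at the tail while the single consumer is the only thread that ever touches the head.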


roytmana commented 6 years ago

Thanks @JakeWharton. I have switched to a single consumer for the queue, with multiple parallel fork/join processors and removal from the queue once all processors are done. I am curious what performance to expect when adding small items (20 bytes). For me it is pretty slow: about 1,500 items/sec on a fast machine with an SSD. I can push data to a remote database faster than that. I tried turning the antivirus off, but it does not seem to make much difference. What is your typical insertion rate with small items?

swankjesse commented 6 years ago

Tape wasn't designed for very high throughput; it was designed to be simple and safe.

If you want to ramp up throughput, encode items in batches. It makes for a more complicated API, but you'll save fsync calls, which dominate performance.
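One way to batch, assuming each queue entry is an opaque byte[] as in Tape's QueueFile: pack several small payloads into a single length-prefixed entry, so one add() (and one fsync) covers the whole batch. The BatchCodec class below is a hypothetical helper, not part of Tape's API:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Packs many small payloads into one entry: a count header followed by
// a 4-byte length prefix before each payload.
public final class BatchCodec {
  public static byte[] encode(List<byte[]> items) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeInt(items.size());        // item count header
    for (byte[] item : items) {
      out.writeInt(item.length);       // length prefix per item
      out.write(item);
    }
    return bytes.toByteArray();
  }

  public static List<byte[]> decode(byte[] entry) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(entry));
    int count = in.readInt();
    List<byte[]> items = new ArrayList<>(count);
    for (int i = 0; i < count; i++) {
      byte[] item = new byte[in.readInt()];
      in.readFully(item);
      items.add(item);
    }
    return items;
  }
}
```

With 20-byte payloads, batching 100 items turns 100 fsyncs into one at the cost of losing or replaying up to one batch on a crash.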

roytmana commented 6 years ago

Thank you @swankjesse. I was thinking about that, but adding does not lend itself to batching in my case: the whole point is to put each item on a persistent queue immediately for reliability, not to accumulate items in memory. As for reading, I already drain the whole queue and then remove all drained items at once to avoid extra fsyncs, but iteration, I believe, still reads records one by one. I do not know if there is anything else I can do beyond bulk removal.

I really liked Tape when I found it and would prefer not to roll my own solution, although my case is simpler: I just need the ability to recover the queue on crash/restart and to dump overflow data if the consumer is unavailable. I could get by with an in-memory queue backed by an append-only, fixed-record-size file with a head index that is only moved on every Nth peek, because I can tolerate replaying already-processed transactions. Given all that, I think I could make it a good deal faster, but it would still have many corner cases to worry about and test...
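The alternative described above (an append-only log plus a head index checkpointed only every Nth removal) might look roughly like this. The file names, the line-per-record layout, and the CheckpointedLog class are all assumptions for illustration; the trade-off it demonstrates is that up to N-1 already-processed records come back on recovery:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Append-only record log with a lazily checkpointed head index.
// Records removed after the last checkpoint are replayed on recovery,
// which this design explicitly tolerates.
public class CheckpointedLog {
  private final Path log;        // append-only log, one record per line
  private final Path head;       // checkpoint: index of first live record
  private final int syncEvery;   // persist the head only every N removals
  private int headIndex;
  private int removedSinceSync = 0;

  public CheckpointedLog(Path dir, int syncEvery) throws IOException {
    this.log = dir.resolve("records.log");
    this.head = dir.resolve("head.idx");
    this.syncEvery = syncEvery;
    if (!Files.exists(log)) Files.createFile(log);
    headIndex = Files.exists(head)
        ? Integer.parseInt(Files.readString(head).trim()) : 0;
  }

  // Each item is appended durably as soon as it arrives.
  public void add(String record) throws IOException {
    Files.writeString(log, record + "\n", StandardOpenOption.APPEND);
  }

  // Advance the in-memory head; write the checkpoint only every Nth call.
  public void remove() throws IOException {
    headIndex++;
    if (++removedSinceSync >= syncEvery) {
      Files.writeString(head, Integer.toString(headIndex));
      removedSinceSync = 0;
    }
  }

  // Recovery replays everything at or after the *checkpointed* head,
  // including up to syncEvery - 1 records that were already processed.
  public List<String> recover() throws IOException {
    int checkpoint = Files.exists(head)
        ? Integer.parseInt(Files.readString(head).trim()) : 0;
    List<String> all = Files.readAllLines(log);
    return all.subList(checkpoint, all.size());
  }
}
```

A real version would also need log compaction once the head advances far enough, which is exactly the kind of corner case mentioned above.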

roytmana commented 6 years ago

Sorry, one last question: when is the 2.0 release expected? I am using the Maven snapshot but would prefer a release if possible.