Open ssimon opened 10 years ago
This happens because the cache is essentially an in-memory database that contains a copy of each document. A stage may write changes back to a document, modifying it in memory, while that same document is being serialized out to a second stage. That causes the serialization code to throw a ConcurrentModificationException, and simply trying again resolves it in most cases.
Essentially, this behaviour provides some transactional safety to document retrieval: it guarantees that a stage always gets a consistent view of the document it is fetching, rather than seeing some but not all of the changes made by a concurrent update. In a traditional pipeline, where a document is only eligible to be processed by one stage at a time, this condition can never occur.
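To make the failure mode concrete, here is a minimal, self-contained Java sketch (not Hydra code; the field map and class names are hypothetical) of how serializing a document's fields in one thread while another thread writes them back can throw a ConcurrentModificationException, and how a simple retry masks it:

```java
import java.util.HashMap;
import java.util.Map;

public class CmeDuringSerialization {

    // Hypothetical stand-in for a cached document: a plain, unsynchronized field map.
    static final Map<String, Object> document = new HashMap<>();

    // Naive "serialization": iterates the field map, much like a JSON writer would.
    static String serialize(Map<String, Object> doc) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, Object> e : doc.entrySet()) {   // fail-fast iteration
            sb.append(e.getKey()).append('=').append(e.getValue()).append(',');
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 100; i++) {
            document.put("field" + i, i);
        }

        // Writer thread: a "stage" writing changes back into the cached copy.
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) {
                document.put("extra" + (i % 50), i);
                document.remove("extra" + ((i + 25) % 50));
            }
        });
        writer.start();

        // Reader: serialize the same document, retrying on CME, which is
        // roughly what the "Trying again!" warning in the log describes.
        while (writer.isAlive()) {
            try {
                serialize(document);
            } catch (java.util.ConcurrentModificationException e) {
                System.out.println("ConcurrentModificationException caught, trying again");
            }
        }
        writer.join();
    }
}
```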
It might make sense to lower the log level for this condition, or to implement a shared read-write lock on the document being serialized or updated.
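As a rough sketch of the read-write-lock idea (assuming a per-document lock; the wrapper class and method names below are illustrative, not Hydra's actual API), serialization would take the shared read lock while stage write-backs take the exclusive write lock:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative wrapper: each cached document carries its own read-write lock.
class LockedDocument {
    private final Map<String, Object> fields = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // A stage writing back changes takes the exclusive write lock.
    void put(String key, Object value) {
        lock.writeLock().lock();
        try {
            fields.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Serialization takes the shared read lock, so it sees either all or none
    // of a concurrent update and can never fail mid-iteration.
    String serialize() {
        lock.readLock().lock();
        try {
            StringBuilder sb = new StringBuilder("{");
            for (Map.Entry<String, Object> e : fields.entrySet()) {
                sb.append(e.getKey()).append('=').append(e.getValue()).append(',');
            }
            return sb.append('}').toString();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The trade-off is that write-backs would block while a document is being serialized out to another stage, whereas the current catch-and-retry keeps writes non-blocking at the cost of the occasional warning.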
I don't really know when this happens, but I see this error in the logs from time to time:
2014-01-15 17:21:25,040 [I/O dispatcher 1] WARN com.findwise.hydra.SerializationUtils - A ConcurrentModificationException was caught during serialization. Trying again!
It seems to resolve itself, but warnings tend to be bad anyway...