objectbox / objectbox-java

Android Database - first and fast, lightweight on-device vector database
https://objectbox.io
Apache License 2.0
4.41k stars 302 forks source link

Multiple emissions from a data subscription when one is expected, after quickly recreating box store #695

Open jsoberg opened 5 years ago

jsoberg commented 5 years ago

Issue Basics

Reproducing the bug

Description

While running multiple JUnit tests for an ObjectBox store (using base code copied from https://docs.objectbox.io/android/android-local-unit-tests#create-a-local-unit-test-class), I noticed a strange issue where if I was putting an initial object into the box before every test, I would sometimes be receiving more emissions than I had expected. It turns out that after the first test is run and the base test code (referenced above) is run to close/delete the BoxStore and then create it again (in the @After and @Before annotated methods), subsequent tests would be receiving two emissions (even though only one is expected).

After doing some digging, I thought that this might possibly be coming from the BoxStore's internal thread pool (similar to #616). I added code to essentially "complete" the thread pool before moving onto a new test, by submitting an empty runnable and waiting for it to complete, and that appears to have stopped the described multiple emission issue. That being said I'm not quite sure how this is happening, since the store is being completely recreated (and seemingly shutting down its internal thread pool), so I'm not sure how an emission from a previous publisher would be coming through to the next observer.

Code

Example tests can be found/run from this repository: https://github.com/jsoberg/Objectbox-Java-MultipleEmissionBug. Issue is observed when running all tests in the UnexpectedMultipleEmissionBugTest test class. When all tests are run in this class at once, test1() will execute as expected, while test2() (performing the exact same actions) will fail, as it is asserting a single value but ends up getting 2. If each test method is run individually, they will pass as expected.

Misc

Workaround was implemented (https://github.com/jsoberg/Objectbox-Java-MultipleEmissionBug) in test class ExpectedSingleEmissionTest. In this tests testSetup() method, I submitted an empty runnable and waited for it to complete after pushing my initial entity.

jsoberg commented 5 years ago

Updated test repository to reflect latest ObjectBox version (2.3.4).

greenrobot-team commented 5 years ago

Thanks, this is also what I have observed with our internal integration tests (hinted at in #616).

I'm also not sure of the cause due to store.close() technically waiting up to 1 sec for the thread pool to terminate. My solution boils down to not doing any box operations in test setup/tear down methods and only in the test methods themselves.

If anybody has insights regarding to how JUnit or RxJava might cause this, they are welcome.

-Uwe

MrSlateZB commented 5 years ago

I did some debugging on this issue and it appears that this is more of a simple race condition that just happens to be triggered by this order of events.

Essentially whats happening is, when any transaction is committed to ObjectBox, the ObjectClassPublisher's publish method is called with the array of entity types affected. Next, this array of entity types is passed to the internal thread pool where the list of object class observers are retrieved and then notified of the change.

In the first unit test, the query subscription happens to occur AFTER the publish runnable occurs on the threadpool. In the case of the second unit test, however, the query subscrption occurs BEFORE the publish runnable occurs on the threadpool.

This log of events may help to clarify

Start Test1 ObjectClassPublisher.publish([1]) OnThreadPool: Found no observers to notify for type: 1 Subscribe called for type: 1

Start Test2 ObjectClassPublisher.publish([1]) Subscribe called for type: 1 OnThreadPool: Found 1 observers to notify for type: 1

I think the general problem here is that when ObjectClassPublisher.publish is called, the active observers aren't acquired until much later (on the thread pool). I think it makes more sense for the observers to be obtained at the time of publish() being called and push them into the queue to be processed on the thread pool. This will ensure that only observers active at the time of the change will receive the data.

The only potential negative is that this may result in situations where unsubscribed observers receive data, but I think is acceptable as the change did occur prior to the unsubscribe call. If others disagree, its also possible to compare the queued observers against the active list on the thread pool as a last ditch effort.

MrSlateZB commented 4 years ago

It's been a while since this issue has been updated. Should I post a pull request for my fix or is this considered a non-issue?

greenrobot-team commented 4 years ago

Sorry, this got drowned among other things. Thanks for your additional details.

I'd say the most important thing is that once an observer is unsubscribed it should no longer receive changes. E.g. I'm thinking about an Android app that receives an event after the UI is no longer allowed to be modified. The library user should not have to add an additional check if they correctly unsubscribed in e.g. a lifecycle method.

Regarding whether observers should be considered for notification if they were added in between the call to publish() and the actual call to observer.onData I'm not sure. How do other solutions handle this (e.g. Android's LiveData or Rx)?

MrSlateZB commented 4 years ago

I think that an unsubscribe stopping callbacks is a noble goal, but unfortunately even the current implementation doesn't make that guarantee. Its entirely possible for the following events:

1) ObjectClassPublisherChangeThread - Obtain observers for entity type 2) MainThread - Unsubscribe from the observers 3) ObjectClassPublisherChangeThread - Emits through the observer list with the removed observer still in it.

You could use a lock to prevent unsubscription while emitting a change to observers but I'm not sure you want to expose users to that kind of deadlock potential / performance penalty. In the case of Android, you'd be locking up the users main thread if that's where they unsubscribe from.

In RxJava, there really is only a "best effort" done to prevent emissions after a dispose() call. I think in their case, they are primarily concerned with performance and any solutions like what I proposed earlier would not work for them. The reason you don't often find bugs in Android with Rx is because the observeOn operator checks isDisposed on the Scheduler's thread (i.e. main thread) before emitting downstream. So, as long as the dispose() was called on the main thread the emission is stopped within that operator. I believe this is similar to the existing approach in ObjectBox when specifying a scheduler in the SubscriptionBuilder.

The primary reason this is a problem for me is testing. If I put test data into a box and then register a change observer, our tests become very brittle because I sometimes receive data change notifications on subscribers that I registered after inserting the data. I'm currently working around this by ensuring the change queue is empty in the ObjectClassPublisher before each query subscription I make.

I think the proposed change is low impact because

1) New subscriptions will be guaranteed to only receive data that has changed post subscription 2) Subscriptions that don't use schedulers can still possible receive emissions post unsubscribe (as described above this exists currently). If you want to be extra cautious, the algorithm could be a) Obtain observers in publish and then post to thread pool b) Before emitting in thread pool, obtain list of observers again and remove any observers that were unsubscribed. The key tenant here being don't consider observers that were added in the meantime 3) Subscriptions that use schedulers, will still not receive change notifications if they unsubscribe on the same scheduler.

Sorry for the novel, hopefully that gives some insight to my thinking.

abbasnaqdi commented 2 years ago

This is an important problem, is there any plan to solve this problem quickly?