ReactiveCouchbase / ReactiveCouchbase-core

Core library for ReactiveCouchbase
Apache License 2.0

Incomplete requests for BulkGet under heavy load (and maybe Windows) #5

Closed: mathieuancelin closed this issue 9 years ago

mathieuancelin commented 10 years ago

See https://github.com/mathieuancelin/play2-couchbase/issues/33 for background by @kieranbenton

mathieuancelin commented 10 years ago

Ooops my bad, this one is on me.

Can you please try the latest snapshot?

And don't forget to set your couchbase.actorctx.timeout to something like 100000 to avoid failing too soon.

kieranbenton commented 10 years ago

That certainly seems to have changed something! I don't think I can reproduce the lockup anymore. I get completely clean logs and all of my tests finish in a fairly reasonable time.

I'll keep trying to break it tonight and let you know.

Does that point at something specific in the driver to get it fixed properly?

mathieuancelin commented 10 years ago

On my side, unfortunately no. It's just a trick: it double-checks the callback mechanism and falls back to the old way (future monitoring) to unlock promises that are still pending even though the underlying operation has actually completed.

But maybe @daschl found something in the driver.
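
To make the workaround described above concrete, here is a minimal sketch in plain Java of the "double check" idea: a promise that is normally completed by a driver callback, backed by a periodic monitor that polls the underlying future in case the callback never fires. This is not the ReactiveCouchbase implementation (which is written in Scala) and it does not use the Couchbase driver API. The names `DoubleCheck`, `bridge`, and `delayMillis` are hypothetical and exist only for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: illustrates the idea only, not the library's code.
final class DoubleCheck {

    private DoubleCheck() {}

    /**
     * Returns a promise for the result of driverFuture.
     *
     * Normal path (not shown): a driver listener calls promise.complete(value).
     * Fallback path: a monitor task polls driverFuture every delayMillis and
     * completes the promise itself if the future is done but the promise is not.
     */
    static <T> CompletableFuture<T> bridge(Future<T> driverFuture,
                                           ScheduledExecutorService monitor,
                                           long delayMillis) {
        CompletableFuture<T> promise = new CompletableFuture<>();

        ScheduledFuture<?> poll = monitor.scheduleWithFixedDelay(() -> {
            if (driverFuture.isDone() && !promise.isDone()) {
                try {
                    // get() returns immediately because the future is done.
                    promise.complete(driverFuture.get());
                } catch (Exception e) {
                    // Failed or cancelled operation: propagate the error.
                    promise.completeExceptionally(e);
                }
            }
        }, delayMillis, delayMillis, TimeUnit.MILLISECONDS);

        // Stop polling once the promise is completed by either path.
        promise.whenComplete((value, error) -> poll.cancel(false));
        return promise;
    }
}
```

In this sketch a request whose callback is lost still completes, but only after up to one polling interval, which is the latency cost of the fallback. Completing the promise twice is harmless because `CompletableFuture.complete` ignores every call after the first.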

kieranbenton commented 10 years ago

Ah ok - I take it that has a negative perf impact, given it's not actually working async?

kieranbenton commented 10 years ago

I can reliably reproduce the issue with couchbase.driver.doublecheck=false, and then switching to true partway through a 'stalled' run lets the remainder complete perfectly. So I'm fairly confident you're working around it effectively.

mathieuancelin commented 10 years ago

Yes, obviously: the previously blocked requests will be unlocked according to the monitoring delay.

If setting couchbase.driver.doublecheck=false brings the issue back, then it's clearly an issue in the Java driver callback mechanism.

kieranbenton commented 10 years ago

Hi @daschl, we've still got this issue - is there anything I can do to try and help you identify it? From Mathieu's investigations it seems pretty clear which area at least the issue is in.

Cheers, Kieran

daschl commented 10 years ago

@kieranbenton hey, yeah, sorry about that. I'm still a bit clueless because I can't reproduce it. Did you have any luck reproducing it outside of your Windows env (Linux, Mac)?

kieranbenton commented 10 years ago

No problem. No, we can't get it consistently on OS X, but we think we are seeing it very rarely. Shall I get you an EC2 Windows instance stood up so you can see it happen for yourself?

Cheers.

On 17 February 2014 11:37, Michael Nitschinger notifications@github.com wrote:

@kieranbenton hey, yeah, sorry about that. I'm still a bit clueless because I can't reproduce it. Did you have any luck reproducing it outside of your Windows env (Linux, Mac)?


daschl commented 10 years ago

@kieranbenton let me get in touch with you over PM here so we can figure out the details of access.

Oh, looks like they removed the feature...

kieranbenton commented 10 years ago

Lol, ok - are you on Google Hangouts?

daschl commented 10 years ago

I'll get in touch with you through the email address provided on the GitHub page.

daschl commented 10 years ago

@kieranbenton yes I am!

daschl commented 10 years ago

@kieranbenton can you try 1.4? I fixed a sync issue in spy: http://blog.couchbase.com/couchbase-java-sdk-140-new-and-noteworthy