Netflix / Hystrix

Hystrix is a latency and fault tolerance library designed to isolate points of access to remote systems, services and 3rd party libraries, stop cascading failure and enable resilience in complex distributed systems where failure is inevitable.
24.11k stars 4.7k forks source link

Introduce facility for avoiding sun.misc.Unsafe #734

Closed mattrjacobs closed 8 years ago

mattrjacobs commented 9 years ago

Some platform like Google App Engine prohibit usage of sun.misc.Unsafe. See a report from the Google Group: https://groups.google.com/forum/#!topic/hystrixoss/L55x8-NQNQU.

The work here is to find/introduce an alternative for Unsafe and then some branching code to use it if the platform running Hystrix does not support Unsafe.

mattrjacobs commented 9 years ago

Of note, #779 removed some references to sun.misc.Unsafe. The remaining references are all via HsytrixRollingNumber. I'm currently evaluating some changes around that class that would eliminate sun.misc.Unsafe references entirely.

chrisbianca commented 9 years ago

Hi Matt, do you have any ETA on when the remaining Unsafe references are likely to be removed? I'd love to be able to make use of Hystrix on the GAE project I'm currently working on, having made good use of it on AWS projects in the past.

Thanks, Chris

mattrjacobs commented 9 years ago

@chrisbianca : My plan is to evaluate performance (via jmh) when LongAdder and LongMaxUpdaters are replaced with AtomicLongs. If that performs well, then we can switch Hystrix internals to stop use of LongAdder/LongMaxUpdater, which are the classes which reference Unsafe.

Even if that change goes in, LongAdder and LongMaxUpdater are public classes, so they will need to remain in place. As long as they don't get referenced, would that satisfy the GAE case?

chrisbianca commented 9 years ago

@mattrjacobs : Yes, I believe that GAE will only complain about the presence of a restricted class at the point that it is actually used. If/once you have the changes available on a branch, let me know where and I can confirm that there aren't any issues on GAE if you'd like?

mattrjacobs commented 9 years ago

That sounds like a great idea, @chrisbianca . I'll ping this thread when I get that branch in place and then you can confirm functionality while I do performance testing in the background.

mattrjacobs commented 9 years ago

@chrisbianca I just opened a PR (#791) with this change. I won't merge it until I hear back from you that it actually addresses your problem, and that my performance tests look adequate. I'll try and get those perf tests done in the next week.

chrisbianca commented 9 years ago

@mattrjacobs Sadly, whilst this has removed Unsafe as a stumbling block on GAE, it fails at the next hurdle instead which is the spawning of new Threads:

[INFO] SEVERE: Unhandled exception
[INFO] java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "modifyThreadGroup")
[INFO]  at java.security.AccessControlContext.checkPermission(AccessControlContext.java:372)
[INFO]  at java.security.AccessController.checkPermission(AccessController.java:559)
[INFO]  at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
[INFO]  at com.google.appengine.tools.development.DevAppServerFactory$CustomSecurityManager.checkPermission(DevAppServerFactory.java:429)
[INFO]  at com.google.appengine.tools.development.DevAppServerFactory$CustomSecurityManager.checkAccess(DevAppServerFactory.java:454)
[INFO]  at java.lang.ThreadGroup.checkAccess(ThreadGroup.java:315)
[INFO]  at java.lang.Thread.init(Thread.java:391)
[INFO]  at java.lang.Thread.init(Thread.java:349)
[INFO]  at java.lang.Thread.<init>(Thread.java:548)
[INFO]  at com.netflix.hystrix.util.HystrixTimer$ScheduledExecutor$1.newThread(HystrixTimer.java:155)

GAE restricts the spawning of new threads other than through it's ThreadManager (https://cloud.google.com/appengine/docs/java/javadoc/com/google/appengine/api/ThreadManager).

To work around this, the creation of new threads in Hystrix would have to make use of a ThreadFactory instead of creating new Threads directly. Do you think this might be possible?

There's a good example in Guava of how they've worked around the limitation on GAE: http://docs.guava-libraries.googlecode.com/git/javadoc/src-html/com/google/common/util/concurrent/MoreExecutors.html#line.766

chrisbianca commented 9 years ago

@mattrjacobs : I had a bit of a play round with this myself. Unfortunately whilst using the ThreadFactory does alleviate the error, it means that you're not able to name the threads as they are currently: Thread.setName() is also a restricted GAE class.

Is there a reason the threads have to be named hystrix-XXX?

mattrjacobs commented 9 years ago

@chrisbianca I'm going to start a new Github Issue for the thread creation issue

chrisbianca commented 9 years ago

@mattrjacobs Did this part of the GAE fix make it into the latest release?

mattrjacobs commented 9 years ago

No, not yet. This is still on the roadmap, but I haven't had time to get to it yet. If you can put together an implementation, I'm happy to review it.

mattrjacobs commented 8 years ago

I believe this issue should be resolved, as of v1.5.0-rc.1. If someone with GAE experience could verify, that would be helpful.

/cc @chrisbianca