apache / incubator-heron

Apache Heron (Incubating) is a realtime, distributed, fault-tolerant stream processing engine from Twitter
https://heron.apache.org/
Apache License 2.0

Heron Instance should run user code in a separate classloader #1735

Open billonahill opened 7 years ago

billonahill commented 7 years ago

Heron Instance (Java) currently includes the bolt/spout deps in its classpath when starting, and the Heron code and the instance code share the same classloader. We should separate the two. This would reduce the need for shading and provide better isolation between instance deps and Heron deps.

billonahill commented 7 years ago

Sangjin did something similar for Hadoop. Basically, all user classes are loaded from a user classpath, which includes the user jars/dirs. If a class isn't found there, a whitelist of system-level classes is searched. This list should be as short as possible; it typically includes the framework classes (e.g., Heron API) and some very low-level core classes such as javax and logging. This is similar to what servlet containers do for user classpath isolation while still providing access to framework classes.

Note that this classloader logic is the reverse of the default delegation logic, which checks the parent loader first and then the current loader.
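To make the reversed lookup order concrete, here is a minimal sketch of a child-first loader. It is only an illustration of the idea described above, not Hadoop's ApplicationClassLoader and not existing Heron code; the class name `ChildFirstClassLoader` and the prefix-based whitelist are assumptions made for the example.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

/**
 * Hypothetical child-first classloader (illustrative sketch only).
 * Classes whose names start with a whitelisted "system" prefix are always
 * delegated to the parent; everything else is looked up on the user
 * classpath first and only then in the parent.
 */
public class ChildFirstClassLoader extends URLClassLoader {
    // Assumed format: package prefixes such as "java." or "javax."
    private final List<String> systemPrefixes;

    public ChildFirstClassLoader(URL[] userClasspath, ClassLoader parent,
                                 List<String> systemPrefixes) {
        super(userClasspath, parent);
        this.systemPrefixes = systemPrefixes;
    }

    /** Returns true if the class must come from the parent (framework/JDK). */
    public boolean isSystemClass(String name) {
        for (String prefix : systemPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isSystemClass(name)) {
                try {
                    // Reverse of the default: search the user jars/dirs first.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not on the user classpath; fall through to the parent.
                }
            }
            if (c == null) {
                // Standard parent-first delegation for system classes
                // and for anything the user classpath did not provide.
                c = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

With a whitelist such as `java.`, `javax.`, and the Heron API package, a user jar could bundle its own version of a third-party library without clashing with the copy used by the instance itself, which is where the reduced need for shading would come from.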

This is the trunk version of ApplicationClassLoader: https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ApplicationClassLoader.java

The default system classes definition: https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/resources/org.apache.hadoop.application-classloader.properties

billonahill commented 7 years ago

Example usage:
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/RunJar.java
https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java
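For Heron, the general shape of such usage might look like the sketch below: build a loader over the user classpath, install it as the thread context classloader, and load the user class by name through it. This is an assumption-laden illustration, not code from RunJar, YarnChild, or Heron; `IsolatedLauncher` and `instantiate` are hypothetical names, and a plain `URLClassLoader` stands in for whatever isolating loader is eventually used.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical launcher showing how user code could be loaded in its own loader. */
public class IsolatedLauncher {

    /**
     * Loads className through a loader built over userClasspath and
     * instantiates it with its no-arg constructor.
     */
    public static Object instantiate(URL[] userClasspath, String className)
            throws Exception {
        ClassLoader parent = IsolatedLauncher.class.getClassLoader();
        // Stand-in loader; an isolating (child-first) loader would go here.
        ClassLoader userLoader = new URLClassLoader(userClasspath, parent);

        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        // Libraries that consult the context classloader will now see user classes.
        current.setContextClassLoader(userLoader);
        try {
            Class<?> cls = Class.forName(className, true, userLoader);
            return cls.getDeclaredConstructor().newInstance();
        } finally {
            // Restored here to keep the sketch side-effect free; a real instance
            // would likely keep the user loader installed while user code runs.
            current.setContextClassLoader(previous);
        }
    }
}
```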