Open billonahill opened 7 years ago
@billonahill I thought we already can get both jstack/jmap information from heron-shell. Could you please explicitly specify the capability that you need?
For aurora scheduler, it seems we have that port already: https://github.com/twitter/heron/blob/master/heron/config/src/yaml/conf/aurora/heron.aurora#L31
This is very inconsistent behaviour. Why do we have things for one scheduler and not for other?
@srkukarni some schedulers might support different features than others w.r.t. the ability to support profiling, heap/thread dumping, etc. Ideally they would all support these things but we don't require that they all do currently.
That's a separate issue from what I was referring to, though. I want to be able to step through code in a debugger in my IDE, which means being able to insert JVM args like -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005 into the java command, for example.
Here's how I've manually hacked into the heron_executor.py _get_java_instance_cmd method (in my working branch) to make a java instance start with these args:
https://github.com/twitter/heron/pull/1807/files#diff-2a669e084143b86b5766a49245f8e357
                     '-XX:+UseConcMarkSweepGC',
                     '-XX:ParallelGCThreads=4',
-                    '-Xloggc:log-files/gc.%s.log' % instance_id]
+                    '-Xloggc:log-files/gc.%s.log' % instance_id.replace("$", "")]
+    if global_task_id == 3: # Used to enable debugging of a specific instance
+      instance_cmd =\
+        instance_cmd + ["-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005"]
+
     instance_cmd = instance_cmd + self.instance_jvm_opts.split()
     if component_name in self.component_jvm_opts:
       instance_cmd = instance_cmd + self.component_jvm_opts[component_name].split()
Being able to pass an argument or probably to set an environment variable when submitting a topology to have this effect would be really helpful when developing/debugging in localmode.
But don't we already have ways to add more parameters to the jvm? TOPOLOGY_WORKER_CHILDOPTS and TOPOLOGY_COMPONENT_JVMOPTS
@srkukarni I suspect we could inject the params with those settings, but we'd also need to support injecting them only on the specific instance that we want to debug, probably by instance id. We wouldn't want them to all start with debug options since a.) the user would then need to connect to each one to allow them to start; and b.) we'd have port collisions.
Also, this ticket captures adding debug support to the submitter, runtime manager, and the scheduler-as-a-library as well.
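As a rough illustration of the per-instance injection described above, here is a minimal Python sketch. It assumes a hypothetical debug spec of the form `<instance_id>:<port>,...` supplied at submit time; the function names, the spec format, and the example instance ids are assumptions made for illustration and are not part of Heron. Only targeted instances receive the JDWP argument, and a port that is assigned twice is rejected so that two instances never compete for the same debug port.

```python
# Hypothetical helpers; none of these names exist in Heron.
JDWP_TEMPLATE = "-agentlib:jdwp=transport=dt_socket,server=y,suspend={suspend},address={port}"


def parse_debug_targets(spec):
    """Parse 'instance_a:5005,instance_b:5006' into {instance_id: port}."""
    targets = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the last ':' so the port is always the final field.
        instance_id, sep, port = item.rpartition(":")
        if not sep or not instance_id or not port.isdigit():
            raise ValueError("expected <instance_id>:<port>, got %r" % item)
        if int(port) in targets.values():
            raise ValueError("debug port %s is assigned twice" % port)
        targets[instance_id] = int(port)
    return targets


def debug_opts_for(instance_id, targets, suspend=True):
    """Return JDWP args for a targeted instance, or [] for every other instance."""
    port = targets.get(instance_id)
    if port is None:
        return []
    return [JDWP_TEMPLATE.format(suspend="y" if suspend else "n", port=port)]
```

In the manual hack shown earlier in this thread, something like `instance_cmd + debug_opts_for(instance_id, targets)` could stand in for the hard-coded `global_task_id == 3` check, assuming the instance id is available at that point.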
You are right that we don't have any way to pass instance-specific command line params, but by definition the instance id/task id is not something known a priori and thus cannot be part of the API per se. Having said that, isn't the purpose of heron-shell to do these kinds of profiler/debugger attachments? I also don't follow what you mean by 'adding debug support to the submitter, runtime manager, and the scheduler-as-a-library'. A more specific use case would help me understand. Thanks!
To attach a remote debugger (i.e., an IDE) to a java process for debugging, you must start the JVM with specific debug options. Hence the heron shell is not applicable here; we must actually start the instance processes with the arguments injected.
When a topology is submitted or updated, a number of java processes get started: SubmitterMain, RuntimeManagerMain and SchedulerMain to deploy/update the topology and then the actual HeronInstances that make up the topology. While developing or debugging locally, it's often convenient to attach a debugger to any of these java processes.
That's what this ticket is for: making it easy to do so without having to modify source code to inject the arguments. For example, to attach a debugger to SchedulerMain running locally, you currently need to modify the code in schedulerCommand.schedulerCommand(..). It would be great if instead I could set a local env variable that the Heron CLI picks up to make this happen.
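A minimal sketch of the environment-variable idea, in Python. The variable name `HERON_DEBUG_JVM_OPTS` and the helper `with_debug_opts` are hypothetical and are not existing Heron features, and the class path and main class in the test data are placeholders. The helper inserts any extra JVM options directly after the `java` executable and leaves the command untouched when the variable is unset.

```python
import os
import shlex

# Hypothetical variable name, chosen only for this illustration.
DEBUG_ENV_VAR = "HERON_DEBUG_JVM_OPTS"


def with_debug_opts(java_cmd, environ=None):
    """Return a copy of java_cmd with extra JVM options placed after the executable.

    java_cmd is an argument list such as ["java", "-cp", ..., MainClass].
    The options come from DEBUG_ENV_VAR and are split using shell-like rules.
    """
    environ = os.environ if environ is None else environ
    extra = shlex.split(environ.get(DEBUG_ENV_VAR, ""))
    if not extra:
        return list(java_cmd)
    return [java_cmd[0]] + extra + list(java_cmd[1:])
```

With this in place, a developer could export the variable with the JDWP string quoted earlier in the thread before running the CLI, and unset it to return to a normal launch. JVM options have to precede the main class on the command line, which is why they are inserted right after the executable rather than appended.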
There are two kinds of people. 1) Heron Users, that is, topology writers. For them the current TOPOLOGY_COMPONENT_JVMOPTS approach should work, right? 2) Heron Developers, that is, people working to enhance Heron. Adding any kind of debugger for components like SubmitterMain, RuntimeManagerMain/SchedulerMain is aimed at this group. Am I to understand that this ticket is mainly aimed at the second set?
That is correct. This is primarily for heron developers, like us. :)
I am thinking from actual use scenarios:
In most cases, people don't want to re-build and re-deploy a topology just to pick up those debugging jvm options; it would be awesome if we could restart a particular heron-instance with new jvm options at runtime.
So how about allowing heron-shell to accept a request to restart a particular heron-instance with new jvm options at runtime? (It could raise security concerns, though.)
@maosongfu I haven't yet had the need to restart a running instance to add debug params. Instead I typically want to restart my local topology with a debug flag enabled. Restarting an instance of a running topology also means you can't debug what happens the first time it's launched as part of a newly submitted topology.
I suggest we support some sort of passed debug arg or env variable at submit time that can do this.
@billonahill sounds good.
When developing in Heron I often want to connect a java debugger to the process. We should make it easy to enable the debug port for the submitter, the runtime manager, and the instance processes. The current approach is to modify the various shell commands that start these processes to include java debug options.