apache / incubator-heron

Apache Heron (Incubating) is a realtime, distributed, fault-tolerant stream processing engine from Twitter
https://heron.apache.org/
Apache License 2.0

Developing - debugging topologies #844

Closed samek closed 8 years ago

samek commented 8 years ago

Hi, I'm having trouble debugging a topology that I'm writing: nothing is logged and no breakpoints are hit beyond the second bolt.

I'm using my old Storm topology project, in which I changed the pom.xml as described at http://twitter.github.io/heron/docs/upgrade-storm-to-heron/

I'm using simulator mode. If I run the topology in IntelliJ, I can't see anything happening beyond the second bolt. If I compile the project and run it with heron submit local (still in simulator mode), I can see all the output, including output from the other bolts.

My question really is: is there a guide on what needs to be set up in order to develop and debug a new topology? storm-starter, for example, was great in that way.

kramasamy commented 8 years ago

@maosongfu / @nlu90 - could you help on the simulation mode?

samek commented 8 years ago

@kramasamy it's more of an IDE/project setup thing, since if I run the same topology with heron submit (simulator or normal), it works as it should. In IntelliJ, where it was previously a Maven (Storm) project, I only get operations from the spout and the first bolt after it, nothing more. But you can't develop like that. :(

kramasamy commented 8 years ago

@nlu90 - can you please check and get to the bottom of the issue?

nlu90 commented 8 years ago

Hi @samek, what do you mean by "can't see anything happening beyond second bolt"?

Does it mean your bolt logs or prints some information during execution, but you can't find it in IntelliJ's console? Our simple ExclamationTopology example can output logs and other prints to IntelliJ's console as follows:

Jun 02, 2016 5:04:53 PM com.twitter.heron.localmode.executors.StreamExecutor handleInstanceExecutor
SEVERE: Nobody consumes stream: id: "default"
component_name: "exclaim2"
Jun 02, 2016 5:04:53 PM com.twitter.heron.localmode.executors.StreamExecutor handleInstanceExecutor
SEVERE: Nobody consumes stream: id: "default"
component_name: "exclaim2"
Jun 02, 2016 5:04:53 PM com.twitter.heron.localmode.executors.StreamExecutor handleInstanceExecutor
SEVERE: Nobody consumes stream: id: "default"
component_name: "exclaim2"
nathan
nathan
bertels

Or do you just need a guide on how to set up IntelliJ for development?

samek commented 8 years ago

@nlu90 I'm writing a new topology, and as of now it looks like this:

TopologyBuilder builder = new TopologyBuilder();
SpoutConfig config = new SpoutConfig(consumerTopic, bootstrapKafkaServers, "spoutId");
config.scheme = new KeyValueSchemeAsMultiScheme(new ByteArrayKeyValueScheme());

// Get messages from Kafka
builder.setSpout("spout", new KafkaSpout(config), 1);

// Create multiple streams
builder.setBolt("splitter", new kafkaSplitterBolt(), 1).shuffleGrouping("spout");

// Bind bolts to the correct stream
builder.setBolt("articleRollingWindow", new ArticleRollingWindowBolt(), 1).shuffleGrouping("splitter", "articleStream");
builder.setBolt("SectionRollingWindow", new SectionRollingWindow(), 1).shuffleGrouping("splitter", "sectionStream");

Config conf = new Config();
conf.setDebug(true);

conf.setNumStmgrs(1);
conf.setContainerCpuRequested(0.2f);
conf.setContainerRamRequested(1024L * 1024 * 512);
conf.setContainerDiskRequested(1024L * 1024 * 1024);
simulator.submitTopology("Testtopology", conf, builder.createTopology());

So I've got a Kafka spout --> splitter bolt (parses JSON, creates multiple streams), and then two bolts, each consuming its own stream generated by the splitter bolt.

The thing is that if I run it in IntelliJ, in debug mode or as a plain run, I don't see any output from the bolts that should consume the streams from the splitter (I've added output in execute), nor can I hit breakpoints in them.

But if I leave simulator mode on, compile the topology (mvn package), and then run heron submit local, I can see all the output from all the bolts.

So it's basically both: I can't see output from the bolts beyond the splitter bolt, and I guess I need a guide on how to set up IntelliJ for developing topologies :(

nlu90 commented 8 years ago

@samek

For IntelliJ debugging: bolts and spouts run as separate threads in the simulator. So if you want to add breakpoints inside a bolt or spout, you need to set the Suspend Policy of the breakpoint to Thread. Right-click your breakpoint and you can see:

[image: breakpoint Suspend Policy setting]
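To illustrate why the thread matters, here is a minimal sketch in plain Java. It is a toy model, not Heron code: the class name, thread names, and sentinel value are made up for illustration. It shows a "spout" thread handing words to a "bolt" thread, so the line where the bolt does its work only ever executes on the bolt's own thread, never on main.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model only (hypothetical names, no Heron classes involved):
// each "component" runs on its own thread, as the simulator is described to do.
public class ThreadPerComponentSketch {
    // Sentinel that tells the consumer thread to stop.
    private static final String DONE = "__done__";

    public static String run() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        StringBuilder seen = new StringBuilder();

        // "Spout": produces two words, then the stop marker.
        Thread spout = new Thread(() -> {
            queue.add("nathan");
            queue.add("bertels");
            queue.add(DONE);
        }, "spout-thread");

        // "Bolt": consumes words on a different thread.
        Thread bolt = new Thread(() -> {
            try {
                String word;
                while (!(word = queue.take()).equals(DONE)) {
                    // A breakpoint on the next line is reached on "bolt-thread", not on "main".
                    seen.append(Thread.currentThread().getName())
                        .append(':').append(word).append("!\n");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "bolt-thread");

        spout.start();
        bolt.start();
        try {
            spout.join();
            bolt.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.print(run());
    }
}
```

Running main prints each word prefixed with bolt-thread, which is the thread a debugger would need to suspend at that line.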

For the output issue, one thing I can suggest is saving the output to a file and checking that file. To set this up, choose Run -> Edit Configurations... and you can see: [image]
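As an alternative to the IDE setting, output can also be sent to a file from code. The sketch below assumes the bolts log through java.util.logging (the log lines earlier in this thread look like its default format, but that is an assumption); output written with System.out.println would not be captured this way. The class name, method names, and log message are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Sketch only: routes java.util.logging records to a file via the root logger.
public class FileLoggingSketch {

    // Attaches a file handler to the root logger so records from all loggers reach the file.
    public static FileHandler logToFile(Path file) throws IOException {
        FileHandler handler = new FileHandler(file.toString(), true); // true = append
        handler.setFormatter(new SimpleFormatter());
        Logger.getLogger("").addHandler(handler);
        return handler;
    }

    // Logs one record to a temporary file and returns the file's contents.
    public static String demo() {
        try {
            Path file = Files.createTempFile("topology-debug", ".log");
            FileHandler handler = logToFile(file);
            // Stand-in for a log call made inside a bolt's execute().
            Logger.getLogger("SectionRollingWindow").info("execute() called");
            Logger.getLogger("").removeHandler(handler);
            handler.close(); // flushes and releases the file
            String content = new String(Files.readAllBytes(file));
            Files.deleteIfExists(file);
            return content;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

In a topology, a call like logToFile would presumably go in main before the topology is submitted, with a fixed path in place of the temporary file.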

samek commented 8 years ago

Thanks for the detailed explanation. I hope someone else is going to find it useful as well.