QualiMaster / qm-issues


Priority pipeline on TSI cluster #26

Closed eichelbe closed 7 years ago

eichelbe commented 9 years ago

It seems that HDFS-based sources do not emit data (NullPointerException).

ap0n commented 9 years ago

The problem was caused by the libs that we had pinned on the cluster under /var/nfs/libs. Once they were removed from Storm's path, everything ran smoothly.

eichelbe commented 9 years ago

Still unclear: validation of managing a pipeline through the QM infrastructure.

ap0n commented 9 years ago

Failed with exception java.lang.NoSuchMethodError: com.google.common.cache.CacheBuilder.build()Lcom/google/common/cache/Cache.

I noticed the directory qm-libs/libs. Since I wasn't sure whether this directory is also on the Storm path, I moved every jar under libs to qm-libs. Tried again, but it failed again with the same exception. I think it has something to do with Guava(?)

eichelbe commented 9 years ago

Guava... hmm. Did you have other parts of the QM infrastructure than StormCommons, DML and QM.Events in the fat jar before?

ap0n commented 9 years ago

The pom of the priority pipeline only has storm-core, StormCommons, PriorityPipelineInterfaces, DataManagementLayer and QualiMaster.Events; plus the correlation & sentiment analysis jars.

Guava was not the problem after all. I added it to the pom and it still didn't work.

eichelbe commented 9 years ago

Ok, so no higher levels that would include EASy, which indirectly uses Guava. Then I agree to leave the jar as it is for now. Please sync with Cui so that QM-IConf can create the full pipeline jar. We will put the libs of the QM infrastructure into another NFS folder outside the Storm classpath and see whether we can start the pipeline via the infrastructure.

cuiqin commented 9 years ago

The pom generation script for full dependencies is committed.

cuiqin commented 9 years ago

I tested the PriorityPipeline with the full jars; apart from the exception from the hardware connection (the hardware is offline), it works without other exceptions. Although the sources now produce quite a lot of data, not much of it goes through the entire pipeline, and some elements are not really receiving data. Shall we verify whether the Sink receives the expected data in the end?

While testing the pipeline through the QM infrastructure, it seems to me the Java compiler version on the cluster has changed to 1.6. As we agreed before, we require version 1.7, as some of our code relies on that...

ap0n commented 9 years ago

The Java version should be the agreed one, as we didn't change it, but I'll double-check and let you know.

As for the data, we also noticed the problem, but it requires further investigation. We are on it.

ap0n commented 9 years ago

Java version of the cluster is 1.7

user@snf-618466:~$ java -version
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)
user@snf-618466:~$ javac -version
javac 1.7.0_71

@cuiqin what led you to believe otherwise?

cuiqin commented 9 years ago

From the suh user, we get Java 1.6...

suh@snf-618466:~$ java -version
java version "1.6.0_35"
OpenJDK Runtime Environment (IcedTea6 1.13.7) (6b35-1.13.7-1ubuntu0.12.04.2)
OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)

That's why I ran into problems while launching the infrastructure...

ap0n commented 9 years ago

That is strange. I'll look into it.

ap0n commented 9 years ago

It should be ok now. It was a mix-up of the PATH declarations. Can you check it out and let me know?

cuiqin commented 9 years ago

Yes, the Java version is now 1.7.

cuiqin commented 9 years ago

Validation of the pipeline running through the QM infrastructure: start/stop a pipeline --> no problem. Switching algorithms needs to be tested again once the entire pipeline processes the data streams properly; right now, with only a little data going through the pipeline, it's hard to see whether the switching does its job correctly.

npav commented 9 years ago

There seems to be a connection-related problem. More specifically, many nodes have a Netty client reconnect issue. The log files contain info like this:

2015-06-11T15:21:18.208+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703... [89]
2015-06-11T15:21:18.731+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703... [90]
2015-06-11T15:21:19.256+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703... [91]
2015-06-11T15:21:19.783+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703... [92]
2015-06-11T15:21:20.311+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703... [93]
2015-06-11T15:21:20.841+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703... [94]

and those logs keep going... We have been trying to find a solution (involving manipulating cluster directories and cleaning zookeeper stuff up) all day, but with no luck.

The only related topics we found are:

https://issues.apache.org/jira/browse/STORM-404
http://qnalist.com/questions/5058971/storm-workers-not-starting-because-of-netty-reconnect-info-reconnect-started-for-netty-client

Any suggestions?

cuiqin commented 9 years ago

That sounds like a cascading-failure problem. I read something about the storm-bolt-of-death; you can find it at https://github.com/verisign/storm-bolt-of-death, as well as the explanation of the effect: https://github.com/verisign/storm-bolt-of-death/blob/master/README_STORM-0.9.3.md

Maybe you could create a sub-pipeline that simplifies the entire pipeline, to pin down the exact problem?

ap0n commented 9 years ago

It does indeed sound like it, but we weren't able to locate the source of the problem. We isolated the TSI components (SpringClientSimulator, Preprocessor, etc.) and they do not cause any problems when running on the cluster.

We also noticed that when using a RecordingTopologyBuilder (at the generated Topology class) we could not set the task parallelism to any value above 1. Therefore, we tested by using the default TopologyBuilder class.

I suggest that every partner tests their components as well on the cluster in order to close this one.

eichelbe commented 9 years ago

Any suggestion to fix the RecordingTopologyBuilder?

cuiqin commented 9 years ago

Regarding the RecordingTopologyBuilder, where do you want to increase the task parallelism? At the main pipeline level or in your sub-pipelines? RecordingTopologyBuilder extends TopologyBuilder, so in principle it should be able to create Spouts/Bolts with multiple tasks.

ap0n commented 9 years ago

We wanted to increase the task parallelism of the HayashiYoshidaBolt which is part of the software correlation subtopology. We were only able to do that when using the TopologyBuilder class.

cuiqin commented 9 years ago

Is the problem that your bolt can actually call the setNumTasks method, but then does not behave as multiple tasks? I'm just trying to understand the problem ;)

ap0n commented 9 years ago

In the sub-topology I set the task parallelism for the bolt to 10 (through the topologyBuilder.setBolt method), and in the (generated) Topology I set config.setMaxTaskParallelism(10), but when the pipeline is deployed on the cluster, the bolt only runs a single task.
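For reference, roughly what the setup looks like (a simplified sketch of the Storm 0.9.x calls; the spout class and the component ids are placeholders, not the exact generated ones):

import backtype.storm.Config;
import backtype.storm.topology.TopologyBuilder;

// Sketch: wiring the HayashiYoshidaBolt with a parallelism hint and an explicit task count.
TopologyBuilder topologyBuilder = new TopologyBuilder();
topologyBuilder.setSpout("correlationSpout", new CorrelationSpout(), 1);      // placeholder spout
topologyBuilder.setBolt("hayashiYoshidaBolt", new HayashiYoshidaBolt(), 10)   // 10 executors requested
               .setNumTasks(10)                                               // and 10 tasks
               .shuffleGrouping("correlationSpout");

Config config = new Config();
config.setMaxTaskParallelism(10); // upper bound on the parallelism of any single component

With the plain TopologyBuilder this gives the bolt 10 tasks on the cluster; with the RecordingTopologyBuilder the same calls end up as a single task.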

ap0n commented 9 years ago

However, maybe the focus should not be on the RecordingTopologyBuilder (perhaps a new issue should be opened for this). I would prioritize the correct execution of the pipeline...

cuiqin commented 9 years ago

True; anyway, we will keep an eye on the RecordingTopologyBuilder.

eichelbe commented 9 years ago

Ok, let's open a new issue (#40), but also keep in mind (as Cui indicated) that the structure of sub-topologies, in contrast to main topologies, is actually not known because it is not explicitly modeled (#41). Anyway, testing with a replacement also carries the danger that an issue might accidentally disappear, so let's keep an eye on #40, as Cui said...

ekateriniioannou commented 9 years ago

@npav, Apostolos: please write the exception here and a description of what should be tested.

ap0n commented 9 years ago

As @npav explained some days ago:

There seems to be a connection-related problem. More specifically, many nodes have a Netty client reconnect issue. The log files contain info like this:

2015-06-11T15:21:18.208+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703... [89]
2015-06-11T15:21:18.731+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703... [90]
2015-06-11T15:21:19.256+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703... [91]
2015-06-11T15:21:19.783+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703... [92]
2015-06-11T15:21:20.311+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703... [93]
2015-06-11T15:21:20.841+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703... [94]

and those logs keep going... We have been trying to find a solution (involving manipulating cluster directories and cleaning zookeeper stuff up) all day, but with no luck.

The only related topics we found are:

https://issues.apache.org/jira/browse/STORM-404
http://qnalist.com/questions/5058971/storm-workers-not-starting-because-of-netty-reconnect-info-reconnect-started-for-netty-client

Any suggestions?

As I said before, when we ran the TSI components on the cluster these problems didn't come up. So every partner should run their code on the cluster and make sure they don't create such problems.

P.S.: @ekateriniioannou, when mentioning Nick please use @npav, not @ Nick. You mentioned someone outside QualiMaster in your previous comment (but I edited it, so it's ok now)! :)

eichelbe commented 9 years ago

I.e., there is no suspicious bolt for now as Katerina mentioned?

ap0n commented 9 years ago

Not really. It's an assumption that one (or more) bolts die (silently) for some reason, causing something like a cascading failure in the rest of the topology.

eichelbe commented 9 years ago

Ok. We will replace all algorithmic components by dummy ones and see whether this happens on the plain pipeline.

cuiqin commented 9 years ago

In order to test whether the generated pipeline causes the problem, I have implemented dummy algorithms which do nothing but keep the right data structure for each component in the PriorityPip. I let the pipeline run for half an hour. The data passes through all the elements smoothly. So far, I do not see the problem on our side.
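For illustration, the dummies are roughly of this shape (a simplified sketch; the real components declare the PriorityPip data types instead of a single generic field):

import java.util.Map;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

// "Do nothing" algorithm: keeps the data structure of the real component
// but simply forwards the incoming value unchanged.
public class DummyAlgorithmBolt extends BaseRichBolt {

    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple tuple) {
        collector.emit(tuple, new Values(tuple.getValue(0))); // anchor and pass through
        collector.ack(tuple);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("streamItem")); // placeholder field name
    }
}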

cuiqin commented 9 years ago

I have worked a bit more on testing the Priority pipeline and found something interesting. I summarize below what I have experienced.

Besides running the pipeline with dummy algorithms, in order to figure out the exact problematic bolt I have also tested the financial data processing and the twitter data processing branches of the priority pipeline separately. I did this by replacing one branch with the dummy implementation while keeping the other one on the actual implementation.

Running the pipeline with full dummy algorithms, I noticed that after some time there are some failed tuples (not acknowledged successfully) in the twitter source, but the pipeline itself continues running. In the end, I did not see any influence on the data passing through the entire pipeline. Then I thought about the setting topology.message.timeout.secs. As far as I know, this is the maximum time given to the topology to fully process a message emitted by the Spout. In our priority pipeline it has been set to a default value of 100s. As I learned from Storm, this value actually needs to be adjusted to the complexity of the pipeline, so that messages have enough time to be acknowledged, but it must not be so long that data is lost. That seems worth thinking about. For the moment, in order to exclude the impact of the acknowledgement, I have turned the pipeline into an unacknowledged one.

With the unacknowledged pipeline, I then tested both the financial and the twitter branch. During the run, the netty reconnection problem still appears. Apart from the netty reconnection, I have seen another exception:

java.io.FileNotFoundException: File '/app/storm/supervisor/stormdist/PriorityPip-55-1436198538/stormconf.ser' does not exist

That is actually the inconsistency problem where the nimbus tries to reassign the task when the worker dies, but the old task is still there. It seems to me this happens while netty is reconnecting... This issue is fixed in the new Storm version: https://issues.apache.org/jira/browse/STORM-130

Now I am also confused about where the problem lies. But it seems to me that it might make sense to update the Storm version, as they are solving issues/bugs in the new version. Our issue might have something to do with it.
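For reference, the two knobs mentioned above can be set on the pipeline configuration like this (a minimal sketch; the 300s value is only an example, not a recommendation):

import backtype.storm.Config;

Config config = new Config();
// Maximum time a tuple may take to be fully processed before it is replayed;
// needs to be tuned to the depth/latency of the pipeline.
config.setMessageTimeoutSecs(300);
// Running the pipeline unacknowledged, as in the test above: with no ackers,
// tuples are never tracked and never replayed.
config.setNumAckers(0);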

cuiqin commented 9 years ago

I have also seen this exception:

2015-07-06T20:21:13.792+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] HDFS path exists: /user/storm/resultSymbols
2015-07-06T20:21:13.793+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] Try to create iterator
2015-07-06T20:21:13.960+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] HDFS path exists: /user/storm/resultSymbols
2015-07-06T20:21:13.961+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] Iterator created, try to access files
2015-07-06T20:21:13.962+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] Files on hdfs can be accessed, OK
2015-07-06T20:21:13.963+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] try to read from HDFS files
2015-07-06T20:21:19.659+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] HDFS path exists: /user/storm/resultSymbols
2015-07-06T20:21:19.660+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] Try to create iterator
2015-07-06T20:21:19.792+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] HDFS path exists: /user/storm/resultSymbols
2015-07-06T20:21:19.793+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] Iterator created, try to access files
2015-07-06T20:21:19.794+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] Files on hdfs can be accessed, OK
2015-07-06T20:21:19.795+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] try to read from HDFS files
2015-07-06T20:21:22.672+0300 b.s.m.n.Client [INFO] Reconnect started for Netty-Client-snf-626254.vm.okeanos.grnet.gr/83.212.119.8:6703... [71]
2015-07-06T20:21:22.809+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] Can access twitter files on HDFS
2015-07-06T20:21:30.235+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] Can access twitter files on HDFS

@sergejzr you might check if it's an issue.

sergejzr commented 9 years ago

Hi Cui, this is not an error; we just have to change ERROR to INFO, but then it will not print in this case.

cuiqin commented 9 years ago

Ok... I overlooked that. I have just seen a NullPointerException in both the FinancialMapperBolt and the TwitterMapperBolt from the pipeline I submitted yesterday:

java.lang.RuntimeException: java.lang.NullPointerException
at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128)
at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99)
at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80)
at backtype.storm.daemon.executor$fn3441$fn3453$fn3500.invoke(executor.clj:748)
at backtype.storm.util$async_loop$fn464.invoke(util.clj:463)
at clojure.lang.AFn.run(AFn.java:24)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at eu.qualimaster.algorithms.imp.correlation.softwaresubtopology.MapperBolt.ForwardSymbol(MapperBolt.java:161)
at eu.qualimaster.algorithms.imp.correlation.softwaresubtopology.MapperBolt.execute(MapperBolt.java:97)
at backtype.storm.daemon.executor$fn3441$tuple_action_fn3443.invoke(executor.clj:633)
at backtype.storm.daemon.executor$mk_task_receiver$fn3364.invoke(executor.clj:401)
at backtype.storm.disruptor$clojure_handler$reify1447.onEvent(disruptor.clj:58)
at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125)
... 6 more

@npav @ap0n Could you check if this can be an issue?

ap0n commented 9 years ago

This exception is thrown because the input tuple.getValue(0) is not an IFCorrelationTwitter.IIFCorrelationTwitterAnalyzedStreamInput object as it should be. What do you pass as input?

cuiqin commented 9 years ago

I was actually testing the pipeline with all actual implementations. It seems to me this only happens after quite a long time... I submitted the pipeline yesterday and saw the exception this morning. Just to mention: would it be possible to check for this somehow, rather than throwing that exception?

cuiqin commented 9 years ago

Hmmm, now it also happens with the new pipeline I submitted today... I have replaced the implementation with dummy algorithms in the financial branch and left the twitter branch as it was before. Now I get this null exception in the TwitterMapperBolt... The input of the TwitterCorrelation sub-topology should be the output of the SentimentAnalysis sub-topology. It should match...

ap0n commented 9 years ago

Please make sure that the input type is IFCorrelationTwitter.IIFCorrelationTwitterAnalyzedStreamInput. What happens now is that the cast (IFCorrelationTwitter.IIFCorrelationTwitterAnalyzedStreamInput) tuple.get(0) fails and throws the exception. If I recall correctly, the output type of the SentimentAnalysis sub-topology is not the same as the input type of the TwitterCorrelation sub-topology.
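One way to make this failure more visible would be an explicit type check before the cast, roughly like this (only a sketch; logger and collector stand for whatever the MapperBolt already uses):

@Override
public void execute(Tuple tuple) {
    Object value = tuple.getValue(0);
    if (!(value instanceof IFCorrelationTwitter.IIFCorrelationTwitterAnalyzedStreamInput)) {
        // Report the actual runtime type instead of dying later with an obscure exception.
        logger.error("Unexpected input type: " + (value == null ? "null" : value.getClass().getName()));
        collector.fail(tuple);
        return;
    }
    IFCorrelationTwitter.IIFCorrelationTwitterAnalyzedStreamInput input =
        (IFCorrelationTwitter.IIFCorrelationTwitterAnalyzedStreamInput) value;
    // ... existing processing of input ...
    collector.ack(tuple);
}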

cuiqin commented 9 years ago

The output type of the SentimentAnalysis sub-topology should be IIFSentimentAnalysisAnalyzedStreamOutput.

ap0n commented 9 years ago

Yes, and this output passes through one of your bolts where it gets "transformed" to IFCorrelationTwitter.IIFCorrelationTwitterAnalyzedStreamInput. At least that was the case until now.

cuiqin commented 9 years ago

Yes, the generated CorrelationTwitter bolt checks the IIFSentimentAnalysisAnalyzedStreamOutput received from the SentimentAnalysis sub-topology and then casts it to IIFCorrelationTwitterAnalyzedStreamInput in order to fit the input of the CorrelationTwitter sub-topology. In the test, I kept the Twitter branch as it was before, so it definitely passes the IIFCorrelationTwitterAnalyzedStreamInput. I thought you were talking about the generated CorrelationTwitter bolt.

cuiqin commented 9 years ago

Actually, for the netty reconnection issue we got before, a new Storm version might help. What do you think?

ap0n commented 9 years ago

Hm... actually the problem wasn't where I was looking. I had changes in my local copy that hadn't been committed... I disabled the hardware connections so as not to throw exceptions while testing and added a few more checks. In problematic situations I just log an error for the moment...

cuiqin commented 9 years ago

Status of the newly submitted pipeline: after half an hour the pipeline hangs, and I have seen many different exceptions coming from Storm... I post some of them below. As these complaints come from Storm, I highly recommend we think about trying the new version of Storm.

2015-07-07T12:59:26.298+0300 b.s.d.worker [ERROR] Error on initialization of server mk-worker
java.io.FileNotFoundException: File '/app/storm/supervisor/stormdist/PriorityPip-59-1436260937/stormconf.ser' does not exist
at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:299) ~[commons-io-2.4.jar:2.4]
at org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1763) ~[commons-io-2.4.jar:2.4]

2015-07-07T12:59:26.322+0300 b.s.util [ERROR] Halting process: ("Error on initialization")
java.lang.RuntimeException: ("Error on initialization")
at backtype.storm.util$exit_processBANG.doInvoke(util.clj:325) [storm-core-0.9.3.jar:0.9.3]
at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.5.1.jar:na]
at backtype.storm.daemon.worker$fn__3743$mk_worker__3799.doInvoke(worker.clj:354) [storm-core-0.9.3.jar:0.9.3]
at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.5.1.jar:na]
at backtype.storm.daemon.worker$_main.invoke(worker.clj:461) [storm-core-0.9.3.jar:0.9.3]
at clojure.lang.AFn.applyToHelper(AFn.java:172) [clojure-1.5.1.jar:na]
at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]
at backtype.storm.daemon.worker.main(Unknown Source) [storm-core-0.9.3.jar:0.9.3]

2015-07-07T12:58:13.934+0300 b.s.m.n.StormClientErrorHandler [INFO] Connection failed Netty-Client-snf-618463.vm.okeanos.grnet.gr/83.212.119.16:6703
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_71]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[na:1.7.0_71]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.7.0_71]
at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[na:1.7.0_71]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) ~[na:1.7.0_71]
at org.apache.storm.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64) [storm-core-0.9.3.jar:0.9.3]

2015-07-07T12:20:55.073+0300 b.s.m.n.Client [INFO] connection established to a remote host Netty-Client-localhost/127.0.0.1:6703, [id: 0x52cc6459, /127.0.0.1:51253 => localhost/127.0.0.1:6703]
2015-07-07T12:20:55.076+0300 b.s.util [ERROR] Async loop died!
java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more
at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128) ~[storm-core-0.9.3.jar:0.9.3]
at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) ~[storm-core-0.9.3.jar:0.9.3]
at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) ~[storm-core-0.9.3.jar:0.9.3]
at backtype.storm.disruptor$consume_loopSTAR$fn__1460.invoke(disruptor.clj:94) ~[storm-core-0.9.3.jar:0.9.3]
at backtype.storm.util$async_loop$fn__464.invoke(util.clj:463) ~[storm-core-0.9.3.jar:0.9.3]
at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
Caused by: java.lang.RuntimeException: Client is being closed, and does not take requests any more
at backtype.storm.messaging.netty.Client.send(Client.java:185) ~[storm-core-0.9.3.jar:0.9.3]
at backtype.storm.utils.TransferDrainer.send(TransferDrainer.java:54) ~[storm-core-0.9.3.jar:0.9.3]

cuiqin commented 9 years ago

Sometimes the access to the HDFS files fails; please be aware of that as well.

2015-07-07T13:41:07.461+0300 e.q.d.s.f.HDFSFileTweetSourceReader [ERROR] Can not access HDFS files
2015-07-07T13:41:07.465+0300 STDIO [ERROR] java.io.IOException: Filesystem closed
2015-07-07T13:41:07.466+0300 STDIO [ERROR] at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:707)
2015-07-07T13:41:07.466+0300 STDIO [ERROR] at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:776)
2015-07-07T13:41:07.466+0300 STDIO [ERROR] at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:837)
2015-07-07T13:41:07.467+0300 STDIO [ERROR] at java.io.DataInputStream.read(DataInputStream.java:149)
2015-07-07T13:41:07.467+0300 STDIO [ERROR] at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
2015-07-07T13:41:07.467+0300 STDIO [ERROR] at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
2015-07-07T13:41:07.468+0300 STDIO [ERROR] at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
2015-07-07T13:41:07.468+0300 STDIO [ERROR] at java.io.InputStreamReader.read(InputStreamReader.java:184)
2015-07-07T13:41:07.468+0300 STDIO [ERROR] at java.io.BufferedReader.fill(BufferedReader.java:154)
2015-07-07T13:41:07.468+0300 STDIO [ERROR] at java.io.BufferedReader.readLine(BufferedReader.java:317)
2015-07-07T13:41:07.468+0300 STDIO [ERROR] at java.io.BufferedReader.readLine(BufferedReader.java:382)
2015-07-07T13:41:07.469+0300 STDIO [ERROR] at eu.qualimaster.stream.simulation.inf.TweetFilesystemReader.readTweetFile(TweetFilesystemReader.java:143)
2015-07-07T13:41:07.469+0300 STDIO [ERROR] at eu.qualimaster.data.stream.fs.HDFSFileTweetSourceReader.listenToStream(HDFSFileTweetSourceReader.java:139)
2015-07-07T13:41:07.469+0300 STDIO [ERROR] at eu.qualimaster.stream.simulation.inf.TweetFilesystemReader.run(TweetFilesystemReader.java:34)
2015-07-07T13:41:07.469+0300 STDIO [ERROR] at eu.qualimaster.data.stream.fs.HDFSFileTweetSourceReader.run(HDFSFileTweetSourceReader.java)
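For what it's worth, "Filesystem closed" is usually a symptom of sharing Hadoop's cached FileSystem instance between threads or tasks, where a close() issued by one reader invalidates the handle for all of them. A possible workaround is to give each reader its own non-cached instance, roughly like this (only a sketch; the namenode URI and the path are placeholders):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Private FileSystem instance instead of the shared cached one, so that a
// close() elsewhere cannot tear down this reader's connection.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.newInstance(URI.create("hdfs://namenode:8020"), conf);
try {
    for (FileStatus status : fs.listStatus(new Path("/user/storm/resultSymbols"))) {
        // ... open and read status.getPath() ...
    }
} finally {
    fs.close(); // closes only this private instance
}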

ekateriniioannou commented 9 years ago

TSI will update the version of Storm on the cluster in the next few days.

ekateriniioannou commented 9 years ago

No: SUH updates and tests on their cluster first, and then TSI :)

cuiqin commented 9 years ago

I have updated the Storm version to 0.9.5 on our cluster and also tested the PriorityPip with the dummy algorithms. The pipeline runs smoothly; no conflicts with this version.