Closed raghavi222 closed 11 years ago
Please see the debugging guide in the RHadoop wiki. Unfortunately, if you run your first program in distributed mode and just send me the console output, there isn't a lot for me to work with. A screenshot has the sole advantage of being harder to read; I hope people stop doing that. There's nothing I can do about it: there are multiple logs that are local to the machines executing the different processes. That's the way Hadoop is, and there are probably good reasons for it, but either way we need to work with it. This means you either switch to standalone mode (see the Cloudera documentation for how to do that) or fetch the stderr logs from the web UI. And while in this case there is nothing wrong with your program (it works for me), these are skills you need anyway to debug your own programs, so learning your way around the different modes and logs is a necessary investment.
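For rmr2 specifically, one way to debug without touching the cluster is to switch the backend to local mode, which runs the whole job inside the R process so errors surface directly in the console instead of in remote task logs. A minimal sketch, assuming the `rmr.options` interface from rmr2:

```r
library(rmr2)

# Run jobs in-process instead of submitting to Hadoop; errors
# appear directly in the R console rather than in task logs.
rmr.options(backend = "local")

# A tiny job to confirm the pipeline works end to end:
# each value v in 1:10 becomes the pair (key = v, val = v^2).
out <- from.dfs(mapreduce(input = to.dfs(1:10),
                          map = function(k, v) keyval(v, v^2)))

# Switch back to the cluster once the logic is verified.
rmr.options(backend = "hadoop")
```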
Hey, thanks, I could resolve my error. I have one more doubt: I've installed Flume and entered all my Twitter credentials into the flume.conf file. My flume.conf file is:
```
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = HDFS

TwitterAgent.sources.Twitter.type = com.cloudera.flume.source.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey =

TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://localhost:50070/user/flume/tweets/%Y/%m/%d/%H/
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream
TwitterAgent.sinks.HDFS.hdfs.writeFormat = Text
TwitterAgent.sinks.HDFS.hdfs.batchSize = 1000
TwitterAgent.sinks.HDFS.hdfs.rollSize = 0
TwitterAgent.sinks.HDFS.hdfs.rollCount = 10000

TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
TwitterAgent.channels.MemChannel.transactionCapacity = 100
```
and I have set `FLUME_CLASSPATH="/usr/lib/hadoop/flume-sources-1.0-SNAPSHOT.jar"` in flume-env.sh.

When I try to start Flume with the `/etc/init.d/flume-ng-agent start` command, I'm not receiving any tweets in my HDFS. My flume.log file is:
```
ERROR lifecycleSupervisor-1-6 - Unable to start EventDrivenSourceRunner: { source:com.cloudera.flume.source.TwitterSource{name:Twitter,state:IDLE} } - Exception follows.
java.lang.IllegalStateException: consumer key/secret pair already set.
        at twitter4j.TwitterBaseImpl.setOAuthConsumer(TwitterBaseImpl.java:261)
        at com.cloudera.flume.source.TwitterSource.start(TwitterSource.java:129)
        at org.apache.flume.source.EventDrivenSourceRunner.start(EventDrivenSourceRunner.java:44)
        at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)
13 Jul 2013 08:42:24,127 INFO agent-shutdown-hook - Stopping lifecycle supervisor 9
13 Jul 2013 08:42:24,130 INFO agent-shutdown-hook - Component type: SINK, name: HDFS stopped
13 Jul 2013 08:42:24,130 INFO agent-shutdown-hook - Shutdown Metric for type: SINK, name: HDFS. sink.start.time == 1373729312168
13 Jul 2013 08:42:24,130 INFO agent-shutdown-hook - Shutdown Metric for type: SINK, name: HDFS. sink.stop.time == 1373730144130
13 Jul 2013 08:42:24,130 INFO agent-shutdown-hook - Shutdown Metric for type: SINK, name: HDFS. sink.batch.complete == 0
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Shutdown Metric for type: SINK, name: HDFS. sink.batch.empty == 106
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Shutdown Metric for type: SINK, name: HDFS. sink.batch.underflow == 0
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Shutdown Metric for type: SINK, name: HDFS. sink.connection.closed.count == 0
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Shutdown Metric for type: SINK, name: HDFS. sink.connection.creation.count == 0
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Shutdown Metric for type: SINK, name: HDFS. sink.connection.failed.count == 0
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Shutdown Metric for type: SINK, name: HDFS. sink.event.drain.attempt == 0
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Shutdown Metric for type: SINK, name: HDFS. sink.event.drain.sucess == 0
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Component type: CHANNEL, name: MemChannel stopped
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Shutdown Metric for type: CHANNEL, name: MemChannel. channel.start.time == 1373729312155
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Shutdown Metric for type: CHANNEL, name: MemChannel. channel.stop.time == 1373730144131
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Shutdown Metric for type: CHANNEL, name: MemChannel. channel.capacity == 10000
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Shutdown Metric for type: CHANNEL, name: MemChannel. channel.current.size == 0
13 Jul 2013 08:42:24,131 INFO agent-shutdown-hook - Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.attempt == 0
13 Jul 2013 08:42:24,132 INFO agent-shutdown-hook - Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.success == 0
13 Jul 2013 08:42:24,132 INFO agent-shutdown-hook - Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.attempt == 106
13 Jul 2013 08:42:24,132 INFO agent-shutdown-hook - Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.success == 0
13 Jul 2013 08:42:24,132 INFO agent-shutdown-hook - Configuration provider stopping
13 Jul 2013 08:42:29,038 INFO lifecycleSupervisor-1-0 - Configuration provider starting
13 Jul 2013 08:42:29,049 INFO conf-file-poller-0 - Reloading configuration file:/etc/flume-ng/conf/flume.conf
13 Jul 2013 08:42:29,057 INFO conf-file-poller-0 - Processing:HDFS
13 Jul 2013 08:42:29,058 INFO conf-file-poller-0 - Processing:HDFS
13 Jul 2013 08:42:29,058 INFO conf-file-poller-0 - Processing:HDFS
13 Jul 2013 08:42:29,058 INFO conf-file-poller-0 - Added sinks: HDFS Agent: TwitterAgent
13 Jul 2013 08:42:29,058 INFO conf-file-poller-0 - Processing:HDFS
13 Jul 2013 08:42:29,059 INFO conf-file-poller-0 - Processing:HDFS
13 Jul 2013 08:42:29,059 INFO conf-file-poller-0 - Processing:HDFS
13 Jul 2013 08:42:29,059 INFO conf-file-poller-0 - Processing:HDFS
13 Jul 2013 08:42:29,059 INFO conf-file-poller-0 - Processing:HDFS
13 Jul 2013 08:42:29,080 INFO conf-file-poller-0 - Post-validation flume configuration contains configuration for agents: [TwitterAgent]
13 Jul 2013 08:42:29,080 INFO conf-file-poller-0 - Creating channels
13 Jul 2013 08:42:29,082 ERROR conf-file-poller-0 - Unhandled error
java.lang.NoSuchMethodError: org.apache.flume.ChannelFactory.getClass(Ljava/lang/String;)Ljava/lang/Class;
        at org.apache.flume.node.AbstractConfigurationProvider.getOrCreateChannel(AbstractConfigurationProvider.java:236)
        at org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:199)
        at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:101)
        at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)
```
Please help me. Thank you.
Please direct your question to the appropriate Flume forum or issue tracker.
Hello sir, I'm new to R. I've installed the rmr2 package, but I could not run the following sample program:

```r
Sys.setenv(HADOOP_HOME = "/usr/lib/hadoop")
Sys.setenv(HADOOP_CMD = "/usr/lib/hadoop/bin/hadoop")
library(rmr2)
library(rhdfs)
small.ints = 1:1000
small.int.path = to.dfs(1:1000)
out = mapreduce(input = small.int.path,
                map = function(k, v) keyval(v, v^2))
df = as.data.frame(from.dfs(out, structured = T))
```

I've posted the screenshot of my error. I'm using JDK 1.6.
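For reference, the map function in the program above emits `keyval(v, v^2)` for each value in 1:1000, so the expected contents of `df` can be computed in plain R, without Hadoop, for comparison:

```r
# Plain-R equivalent of what the mapreduce job emits:
# each value v in 1:1000 becomes the pair (key = v, val = v^2).
small.ints <- 1:1000
expected <- data.frame(key = small.ints, val = small.ints^2)
head(expected)
```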
I've set the following environment variables in Renviron and .bashrc:

```
HADOOP_HOME=/usr/lib/hadoop
HADOOP_CMD=/usr/lib/hadoop/bin/hadoop
HADOOP_STREAMING=/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.2.0.jar
JAVA_HOME=/usr/java/jdk1.7.0_21
```

I'm using CDH4 Hadoop.
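As a sanity check, these variables can be inspected from inside R before loading rmr2; an empty string means the Renviron/.bashrc setting is not reaching the R session. A minimal sketch:

```r
# Print the Hadoop-related environment variables as R sees them.
# An empty string means the shell/Renviron setting did not propagate.
for (var in c("HADOOP_HOME", "HADOOP_CMD", "HADOOP_STREAMING", "JAVA_HOME")) {
  cat(var, "=", Sys.getenv(var), "\n")
}

# They can also be set for the current session directly, e.g.:
Sys.setenv(HADOOP_STREAMING = "/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.2.0.jar")
```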