spring-attic / spring-hadoop-samples

Spring Hadoop Samples
Apache License 2.0

Windows Client Support? #7

Open pooleja opened 10 years ago

pooleja commented 10 years ago

Does Spring Hadoop support running from a Windows client? I assume it does, since I see Windows-specific batch files for executing the MapReduce example.

When I build and run the sample on a Windows client connected to my cluster, it fails. First it reports that it can't load the native libs; then it submits the job, which fails shortly afterwards.

11:40:41,919  INFO t.support.ClassPathXmlApplicationContext: 510 - Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@659297ab: startup date [Tue Feb 11 11:40:41 EST 2014]; root of context hierarchy
11:40:42,176  INFO eans.factory.xml.XmlBeanDefinitionReader: 315 - Loading XML bean definitions from class path resource [META-INF/spring/application-context.xml]
11:40:42,895  INFO ort.PropertySourcesPlaceholderConfigurer: 172 - Loading properties file from class path resource [hadoop.properties]
11:40:42,922  INFO ctory.support.DefaultListableBeanFactory: 596 - Pre-instantiating singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@74ab6b5: defining beans [org.springframework.context.support.PropertySourcesPlaceholderConfigurer#0,hadoopConfiguration,wordcountJob,runner]; root of factory hierarchy
11:40:43,166  INFO he.hadoop.conf.Configuration.deprecation: 840 - fs.default.name is deprecated. Instead, use fs.defaultFS
11:40:44,706 ERROR             org.apache.hadoop.util.Shell: 303 - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.conf.Configuration.getTrimmedStrings(Configuration.java:1546)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:519)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2433)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:166)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:351)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:287)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.addInputPath(FileInputFormat.java:466)
    at org.springframework.data.hadoop.mapreduce.JobFactoryBean.afterPropertiesSet(JobFactoryBean.java:208)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1547)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1485)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:524)
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:461)
    at org.springframework.beans.factory.support.AbstractBeanFactory$1.getObject(AbstractBeanFactory.java:295)
    at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:223)
    at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:292)
    at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:194)
    at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:608)
    at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:932)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:479)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:197)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:172)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:158)
    at org.springframework.samples.hadoop.mapreduce.Wordcount.main(Wordcount.java:28)
11:40:45,142  INFO    org.apache.hadoop.yarn.client.RMProxy:  56 - Connecting to ResourceManager at hd-dn-01.grcrtp.local/10.6.64.232:8050
11:40:45,245  INFO ramework.data.hadoop.mapreduce.JobRunner: 192 - Starting job [wordcountJob]
11:40:45,302  INFO    org.apache.hadoop.yarn.client.RMProxy:  56 - Connecting to ResourceManager at hd-dn-01.grcrtp.local/10.6.64.232:8050
11:40:45,971  WARN org.apache.hadoop.mapreduce.JobSubmitter: 258 - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
11:40:46,080  INFO doop.mapreduce.lib.input.FileInputFormat: 287 - Total input paths to process : 1
11:40:46,422  INFO org.apache.hadoop.mapreduce.JobSubmitter: 394 - number of splits:1
11:40:46,441  INFO he.hadoop.conf.Configuration.deprecation: 840 - user.name is deprecated. Instead, use mapreduce.job.user.name
11:40:46,442  INFO he.hadoop.conf.Configuration.deprecation: 840 - fs.default.name is deprecated. Instead, use fs.defaultFS
11:40:46,444  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
11:40:46,444  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
11:40:46,450  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
11:40:46,450  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.job.name is deprecated. Instead, use mapreduce.job.name
11:40:46,450  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
11:40:46,451  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
11:40:46,451  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
11:40:46,452  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
11:40:46,452  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
11:40:46,454  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
11:40:46,454  INFO he.hadoop.conf.Configuration.deprecation: 840 - mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
11:40:46,820  INFO org.apache.hadoop.mapreduce.JobSubmitter: 477 - Submitting tokens for job: job_1391711633872_0022
11:40:47,127  INFO      org.apache.hadoop.mapred.YARNRunner: 368 - Job jar is not present. Not adding any jar to the list of resources.
11:40:47,225  INFO doop.yarn.client.api.impl.YarnClientImpl: 174 - Submitted application application_1391711633872_0022 to ResourceManager at hd-dn-01.grcrtp.local/10.6.64.232:8050
11:40:47,291  INFO          org.apache.hadoop.mapreduce.Job:1272 - The url to track the job: http://http://hd-dn-01.grcrtp.local:8088/proxy/application_1391711633872_0022/
11:40:47,292  INFO          org.apache.hadoop.mapreduce.Job:1317 - Running job: job_1391711633872_0022
11:40:50,330  INFO          org.apache.hadoop.mapreduce.Job:1338 - Job job_1391711633872_0022 running in uber mode : false
11:40:50,332  INFO          org.apache.hadoop.mapreduce.Job:1345 -  map 0% reduce 0%
11:40:50,356  INFO          org.apache.hadoop.mapreduce.Job:1358 - Job job_1391711633872_0022 failed with state FAILED due to: Application application_1391711633872_0022 failed 2 times due to AM Container for appattempt_1391711633872_0022_000002 exited with  exitCode: 1 due to: Exception from container-launch: 
org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control

    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

.Failing this attempt.. Failing the application.
11:40:50,434  INFO          org.apache.hadoop.mapreduce.Job:1363 - Counters: 0
11:40:50,470  INFO ramework.data.hadoop.mapreduce.JobRunner: 202 - Completed job [wordcountJob]
11:40:50,507  INFO    org.apache.hadoop.yarn.client.RMProxy:  56 - Connecting to ResourceManager at hd-dn-01.grcrtp.local/10.6.64.232:8050
11:40:50,590  INFO ctory.support.DefaultListableBeanFactory: 444 - Destroying singletons in org.springframework.beans.factory.support.DefaultListableBeanFactory@74ab6b5: defining beans [org.springframework.context.support.PropertySourcesPlaceholderConfigurer#0,hadoopConfiguration,wordcountJob,runner]; root of factory hierarchy
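
A note on the first error in this log: the `null\bin\winutils.exe` path suggests the client had no Hadoop home directory configured. As far as I understand Hadoop 2.x, `Shell` reads the `hadoop.home.dir` system property and falls back to the `HADOOP_HOME` environment variable, then appends `\bin\winutils.exe`. The sketch below is a simplified stand-in with hypothetical class and method names (`WinutilsLookup`, `hadoopHome`, `winutilsPath`), not Hadoop's actual code; it only shows how an unset home turns into the literal text `null` in the path.

```java
/** Simplified stand-in for the Hadoop home lookup; names are hypothetical, not Hadoop's API. */
final class WinutilsLookup {

    /** Picks the Hadoop home: the system property wins, the environment variable is the fallback. */
    static String hadoopHome(String systemProperty, String environmentVariable) {
        return systemProperty != null ? systemProperty : environmentVariable;
    }

    /** Builds the winutils path; string concatenation renders a null home as "null". */
    static String winutilsPath(String home) {
        return home + "\\bin\\winutils.exe";
    }

    public static void main(String[] args) {
        // Assumption: these are the two settings the client consults.
        String home = hadoopHome(System.getProperty("hadoop.home.dir"),
                                 System.getenv("HADOOP_HOME"));
        System.out.println("Client would look for: " + winutilsPath(home));
        if (home == null) {
            System.out.println("Neither hadoop.home.dir nor HADOOP_HOME is set.");
        }
    }
}
```

In this log the job was still submitted after the winutils error, so the missing binary does not appear to be what made the job fail.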

trisberg commented 10 years ago

How are you building and running the example? The batch file is generated by the Maven Appassembler plug-in. These generated files don't seem to work right if you have a deep directory structure; I got an error about the command being too long.

I also see another error, "Could not resolve placeholder 'app.home'", so I'm not sure how well these generated batch files actually work. I'm not sure whether anyone has run these examples successfully on Windows.

Is your Hadoop cluster on Windows as well?

trisberg commented 10 years ago

I just updated the samples to use $basedir instead of $app.home since the generated batch file for Windows doesn't set the app.home system property. Ran the wordcount sample successfully on Windows.

pooleja commented 10 years ago

I had edited the batch file so the class path was defined as:

set CLASSPATH="%BASEDIR%"\etc;"%REPO%"\*

This allowed all the files to be on the class path without making the command too long.

I found related open bugs:
https://issues.apache.org/jira/browse/YARN-1298
https://issues.apache.org/jira/browse/MAPREDUCE-4052

pooleja commented 10 years ago

My Hadoop cluster is Linux-based. I am trying to go from a Windows client to a Linux cluster.

trisberg commented 10 years ago

Nice find on that bug. My test ran fine because I was running against a Windows-based Hadoop cluster; I haven't tried going from a Windows client to a Linux cluster.
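
One plausible reading of the `/bin/bash: line 0: fg: no job control` failure, consistent with the MAPREDUCE-4052 report linked above: the Windows client writes environment references into the container launch command in Windows style (`%JAVA_HOME%`), and bash on the Linux node treats a word that starts with `%` as a job specification, which is the same as calling `fg` on it. The sketch below only illustrates that difference in rendering; `LaunchCommandSketch`, `envRef`, and `javaCommand` are hypothetical names, not Hadoop's API.

```java
/** Illustration only: how a client-side OS check can leak into a launch command. */
final class LaunchCommandSketch {

    /** Renders an environment variable reference in the client's native shell syntax. */
    static String envRef(String name, boolean windowsClient) {
        return windowsClient ? "%" + name + "%" : "$" + name;
    }

    /** Builds the start of a java launch command as the submitting client would see it. */
    static String javaCommand(boolean windowsClient) {
        return envRef("JAVA_HOME", windowsClient) + "/bin/java";
    }

    public static void main(String[] args) {
        // A Windows client produces a word that bash reads as a job spec.
        System.out.println(javaCommand(true));   // %JAVA_HOME%/bin/java
        // A Linux client produces a form bash can expand.
        System.out.println(javaCommand(false));  // $JAVA_HOME/bin/java
    }
}
```

I believe Hadoop releases that include the MAPREDUCE-4052 fix expose a client-side `mapreduce.app-submission.cross-platform` setting for this situation, but check the documentation for your version before relying on it.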

sumanth35 commented 6 years ago

Hi, I have installed Hadoop 2.7.4 on Windows 7. I tried to run the Spring Hadoop MapReduce wordcount program, but I could not, because sh ./target/appassembler/bin/wordcount cannot be run on Windows.

When I tried to run the wordcount class as a standalone class I get the following exception:

log4j:WARN No appenders could be found for logger (org.springframework.context.support.ClassPathXmlApplicationContext).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" org.springframework.beans.factory.BeanDefinitionStoreException: Invalid bean definition with name 'wordcountJob' defined in null: Could not resolve placeholder 'app.repo' in string value "file:${app.repo}/hadoop-examples-*.jar"; nested exception is java.lang.IllegalArgumentException: Could not resolve placeholder 'app.repo' in string value "file:${app.repo}/hadoop-examples-*.jar"
    at org.springframework.beans.factory.config.PlaceholderConfigurerSupport.doProcessProperties(PlaceholderConfigurerSupport.java:211)
    at org.springframework.context.support.PropertySourcesPlaceholderConfigurer.processProperties(PropertySourcesPlaceholderConfigurer.java:180)
    at org.springframework.context.support.PropertySourcesPlaceholderConfigurer.postProcessBeanFactory(PropertySourcesPlaceholderConfigurer.java:155)
    at org.springframework.context.support.PostProcessorRegistrationDelegate.invokeBeanFactoryPostProcessors(PostProcessorRegistrationDelegate.java:265)
    at org.springframework.context.support.PostProcessorRegistrationDelegate.invokeBeanFactoryPostProcessors(PostProcessorRegistrationDelegate.java:162)
    at org.springframework.context.support.AbstractApplicationContext.invokeBeanFactoryPostProcessors(AbstractApplicationContext.java:606)
    at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:462)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:197)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:172)
    at org.springframework.context.support.ClassPathXmlApplicationContext.<init>(ClassPathXmlApplicationContext.java:158)
    at org.springframework.samples.hadoop.mapreduce.Wordcount.main(Wordcount.java:28)
Caused by: java.lang.IllegalArgumentException: Could not resolve placeholder 'app.repo' in string value "file:${app.repo}/hadoop-examples-*.jar"
    at org.springframework.util.PropertyPlaceholderHelper.parseStringValue(PropertyPlaceholderHelper.java:174)
    at org.springframework.util.PropertyPlaceholderHelper.replacePlaceholders(PropertyPlaceholderHelper.java:126)
    at org.springframework.core.env.AbstractPropertyResolver.doResolvePlaceholders(AbstractPropertyResolver.java:204)
    at org.springframework.core.env.AbstractPropertyResolver.resolveRequiredPlaceholders(AbstractPropertyResolver.java:178)
    at org.springframework.context.support.PropertySourcesPlaceholderConfigurer$2.resolveStringValue(PropertySourcesPlaceholderConfigurer.java:175)
    at org.springframework.beans.factory.config.BeanDefinitionVisitor.resolveStringValue(BeanDefinitionVisitor.java:282)
    at org.springframework.beans.factory.config.BeanDefinitionVisitor.resolveValue(BeanDefinitionVisitor.java:209)
    at org.springframework.beans.factory.config.BeanDefinitionVisitor.visitList(BeanDefinitionVisitor.java:228)
    at org.springframework.beans.factory.config.BeanDefinitionVisitor.resolveValue(BeanDefinitionVisitor.java:192)
    at org.springframework.beans.factory.config.BeanDefinitionVisitor.visitPropertyValues(BeanDefinitionVisitor.java:141)
    at org.springframework.beans.factory.config.BeanDefinitionVisitor.visitBeanDefinition(BeanDefinitionVisitor.java:82)
    at org.springframework.beans.factory.config.PlaceholderConfigurerSupport.doProcessProperties(PlaceholderConfigurerSupport.java:208)
    ... 10 more

How can I run this program?

Please advise

sumanth35 commented 6 years ago

I figured this out by providing the complete path.
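
The comment above does not say which path was supplied. One reading of the exception is that `app.repo` is normally passed in as a system property by the generated launcher script, so running `Wordcount` directly leaves `${app.repo}` undefined; under that assumption, either a `-Dapp.repo=...` JVM argument or a full path in place of the placeholder would resolve it. The sketch below is a simplified stand-in for placeholder resolution, not Spring's implementation; `PlaceholderSketch` and the `C:/samples/repo` directory are hypothetical.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified stand-in for ${...} placeholder resolution; not Spring's implementation. */
final class PlaceholderSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${key} with its property value, failing loudly when a key is missing. */
    static String resolve(String value, Properties props) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            String replacement = props.getProperty(key);
            if (replacement == null) {
                throw new IllegalArgumentException(
                    "Could not resolve placeholder '" + key + "' in string value \"" + value + "\"");
            }
            // quoteReplacement keeps backslashes and dollar signs in paths literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("app.repo", "C:/samples/repo"); // hypothetical directory
        System.out.println(resolve("file:${app.repo}/hadoop-examples-*.jar", props));
    }
}
```

With the property set, the value resolves to a concrete `file:` location; with it unset, the sketch throws an error of the same shape as the one in the stack trace.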