stephenlienharrell / WeatherPipe

A MapReduce pipeline for the analysis of the NEXRAD data set in S3 - Purdue CS307 Project
MIT License
23 stars 6 forks

WeatherPipe in the latest aws #5

Open aloukianov opened 6 years ago

aloukianov commented 6 years ago

Hi, could you tell me whether you have had a chance to support the project yet? Kind regards, Andrei

PS. I'm trying to write my own analysis in Eclipse Oxygen and have an issue with "hadoop home". Here is the error message:

1 [main] DEBUG org.apache.hadoop.util.Shell - Failed to detect a valid hadoop home directory
java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
    at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:265)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:290)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.conf.Configuration.setStrings(Configuration.java:1751)
    at edu.purdue.eaps.weatherpipe.weatherpipemapreduce.WeatherPipeMapReduce.main(WeatherPipeMapReduce.java:24)
11 [main] ERROR org.apache.hadoop.util.Shell - Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.conf.Configuration.setStrings(Configuration.java:1751)
    at edu.purdue.eaps.weatherpipe.weatherpipemapreduce.WeatherPipeMapReduce.main(WeatherPipeMapReduce.java:24)
673 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, always=false, sampleName=Ops, type=DEFAULT, value=[Rate of successful kerberos logins and latency (milliseconds)], valueName=Time)
686 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, always=false, sampleName=Ops, type=DEFAULT, value=[Rate of failed kerberos logins and latency (milliseconds)], valueName=Time)
686 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(about=, always=false, sampleName=Ops, type=DEFAULT, value=[GetGroups], valueName=Time)
688 [main] DEBUG org.apache.hadoop.metrics2.impl.MetricsSystemImpl - UgiMetrics, User and group related metrics
786 [main] DEBUG org.apache.hadoop.security.authentication.util.KerberosName - Kerberos krb5 configuration not found, setting default realm to empty
790 [main] DEBUG org.apache.hadoop.security.Groups - Creating new Groups object
794 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...
797 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
797 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - java.library.path=C:\Program Files\Java\jre1.8.0_161\bin;C:\WINDOWS\Sun\Java\bin;C:\WINDOWS\system32;C:\WINDOWS;C:/Program Files/Java/jre1.8.0_161/bin/server;C:/Program Files/Java/jre1.8.0_161/bin;C:/Program Files/Java/jre1.8.0_161/lib/amd64;C:\Program Files\Microsoft MPI\Bin\;C:\ProgramData\Oracle\Java\javapath;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;C:\Program Files\Common Files\Autodesk Shared\;C:\Python35\Scripts;C:\Program Files\Microsoft SQL Server\130\Tools\Binn\;C:\Program Files\dotnet\;C:\Program Files\Anaconda3;C:\Program Files\Anaconda3\Scripts;C:\Program Files\Anaconda3\Library\bin;C:\Program Files\Java\jdk1.8.0_121\bin;C:\tools\apache-maven-3.5.2\bin;C:\WINDOWS\system32\config\systemprofile\MawsonKeyStorage;C:\tools\apache-maven-3.5.2\bin;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\130\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\;C:\Program Files\Microsoft SQL Server\140\Tools\Binn\;C:\Program Files\Microsoft SQL Server\140\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\Client SDK\ODBC\130\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\ManagementStudio\;C:\Program Files\Amazon\AWSCLI\;C:\Gradle\gradle-4.6\bin;C:\Users\papa\AppData\Local\Programs\Python\Python36\Scripts\;C:\Users\papa\AppData\Local\Programs\Python\Python36\;C:\Users\papa\AppData\Local\Microsoft\WindowsApps;;C:\Users\papa\Downloads\eclipse-jee-oxygen-3-win32-x86_64\eclipse;;.
797 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
797 [main] DEBUG org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback - Falling back to shell based
797 [main] DEBUG org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
798 [main] DEBUG org.apache.hadoop.security.Groups - Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
805 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - hadoop login
805 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - hadoop login commit
811 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - using local user:NTUserPrincipal: papa
812 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - UGI loginUser:papa (auth:SIMPLE)
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
    at edu.purdue.eaps.weatherpipe.weatherpipemapreduce.WeatherPipeMapReduce.main(WeatherPipeMapReduce.java:39)

stephenlienharrell commented 6 years ago

Hello Andrei, I have never tried to run this on Windows, but it looks like you have not set your Hadoop home. This page has an example of how to set it: https://stackoverflow.com/questions/19620642/failed-to-locate-the-winutils-binary-in-the-hadoop-binary-path

System.setProperty("hadoop.home.dir", "C:\\winutil\\"); reference : stackoverflow.com/a/33610936/3110474 – Himanshu Bhandari Jan 6 '16 at 7:01
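A slightly fuller sketch of the same idea, in case it is useful. The class name HadoopHomeSetup, the helper methods, and the C:\winutil fallback path are all made up for illustration and are not part of WeatherPipe. It assumes only that Hadoop looks for bin\winutils.exe under the directory named by the hadoop.home.dir system property or the HADOOP_HOME environment variable, and that the property is set before the first Hadoop class is loaded:

```java
import java.io.File;

// Illustrative helper, not part of WeatherPipe: decide which directory Hadoop
// should treat as its home and set it before any Hadoop class is loaded.
class HadoopHomeSetup {

    // Hypothetical location; bin\winutils.exe is expected underneath it.
    static final String FALLBACK_HOME = "C:\\winutil";

    /** Prefers an existing system property, then the environment variable, then the fallback. */
    static String resolveHadoopHome(String sysProp, String envVar, String fallback) {
        if (sysProp != null && !sysProp.isEmpty()) {
            return sysProp;
        }
        if (envVar != null && !envVar.isEmpty()) {
            return envVar;
        }
        return fallback;
    }

    /** Returns true if bin/winutils.exe exists under the given home directory. */
    static boolean hasWinutils(String home) {
        return new File(new File(home, "bin"), "winutils.exe").isFile();
    }

    public static void main(String[] args) {
        String home = resolveHadoopHome(
                System.getProperty("hadoop.home.dir"),
                System.getenv("HADOOP_HOME"),
                FALLBACK_HOME);
        System.setProperty("hadoop.home.dir", home);

        if (!hasWinutils(home)) {
            System.err.println("winutils.exe not found under " + home + File.separator + "bin");
        }
        // Hadoop classes such as Configuration would only be used after this point.
    }
}
```

In an Eclipse run configuration the same effect can usually be had without code changes by adding -Dhadoop.home.dir=... to the VM arguments.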

Hope this helps. That page probably has a lot of good information about running Hadoop on Windows.

Good luck! -stephen

aloukianov commented 6 years ago

Hi Stephen,

Thank you for the reference (not resolved yet). I ran into all sorts of things with Win 10, Java 6, Hadoop, Gradle 5, the Maven repository, Java 9, Eclipse Oxygen, Java 8, etc. There is a log4j error that I can't figure out. It seems to be related to an earlier version of log4j.

Could you find a minute to advise?

Kind regards, Andrei

log4j:ERROR Could not read configuration file [/C:/WeatherEclipse/WeatherPipe/WeatherPipeMapReduce/bin/main/log4j.properties].
java.io.FileNotFoundException: C:\WeatherEclipse\WeatherPipe\WeatherPipeMapReduce\bin\main\log4j.properties (The system cannot find the file specified)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(Unknown Source)
    at java.io.FileInputStream.<init>(Unknown Source)
    at java.io.FileInputStream.<init>(Unknown Source)
    at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:372)
    at org.apache.log4j.PropertyConfigurator.configure(PropertyConfigurator.java:403)
    at edu.purdue.eaps.weatherpipe.AWSAnonInterface.(AWSAnonInterface.java:28)
    at edu.purdue.eaps.weatherpipe.WeatherPipe.(WeatherPipe.java:39)
log4j:ERROR Ignoring configuration file [/C:/WeatherEclipse/WeatherPipe/WeatherPipeMapReduce/bin/main/log4j.properties].
log4j:WARN No appenders could be found for logger (com.amazonaws.AmazonWebServiceClient).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
Missing Option Error: org.apache.commons.cli.MissingOptionException: start_time is a required flag or setting in the config file
usage: WeatherPipe [-b ] [-c ] [-e ] [-h] [-i ] [-id ] [-s ] [-st ] [-t ]
 -b,--bucket_name      Bucket name in S3 to place input and output data. Will be auto-generated if not given
 -c,--config_file      Location of config file
 -e,--end_time         End search boundary for NEXRAD data search. Date Format is dd/MM/yyyy HH:mm:ss
 -h,--help             Print this help message
 -i,--instance_count   The amount of instances to run the analysis on. Default is 1.
 -id,--job_id          Name of this particular job, a random one will be generated if not given. This must be unique in reference to other jobs.
 -s,--start_time       Start search boundary for NEXRAD data search. Date Format is dd/MM/yyyy HH:mm:ss
 -st,--station         Radar station abbreviation ex. "KIND"
 -t,--instance_type    Instance type for EMR job. Default is c3.xlarge. See options here: https://aws.amazon.com/elasticmapreduce/pricing/
Please report issues at https://github.com/stephenlienharrell/WeatherPipe/issues
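
The usage text above states that start_time and end_time use the format dd/MM/yyyy HH:mm:ss, and the MissingOptionException indicates that start_time was not supplied at all. The following is a minimal sketch for checking such values before launching a job. The class name TimeFlagCheck and its methods are hypothetical, it assumes Java 8 or later for java.time, and it does not reproduce WeatherPipe's own parsing, which is not shown in this thread:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Illustrative check of the date format quoted in the usage text.
// Not WeatherPipe code; names here are made up for the example.
class TimeFlagCheck {

    // Pattern copied from the usage text: dd/MM/yyyy HH:mm:ss
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("dd/MM/yyyy HH:mm:ss");

    /** Parses a value in the documented format; throws DateTimeParseException if it does not match. */
    static LocalDateTime parse(String value) {
        return LocalDateTime.parse(value, FORMAT);
    }

    /** Returns true only if both values parse and the end is not before the start. */
    static boolean isValidRange(String start, String end) {
        try {
            return !parse(end).isBefore(parse(start));
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Example values only; any real search window is up to the user.
        String start = "01/04/2018 00:00:00";
        String end = "01/04/2018 06:00:00";
        System.out.println("range valid: " + isValidRange(start, end));
    }
}
```

When running from Eclipse, values like these would go in the "Program arguments" box of the run configuration, for example after -s and -e, with quotes around each value because it contains a space.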