alexanderfahlke opened this issue 11 years ago
I just now pushed some improvements to the upload script. Can you try it again and share the output?
The "uninitialized value" error occurred because the script couldn't find the job conf XML corresponding to the log file. It now catches this case instead of failing.
The real issue was that your log file names are in a different format than the script expected. I made the matching more flexible: it now looks for the part of the file name starting with "job". Yours start with "localhost", which confused the script. Anyway, this should work now.
I also added more logging so it's easier to tell what's happening. So if it still doesn't work after this it should be clearer why :)
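The flexible matching described above can be sketched roughly like this (a Python illustration of the idea, not the actual Perl code from statsupload.pl; the file names are taken from this thread):

```python
import re

# Hypothetical sketch: instead of requiring the history file name to start
# with "job", look for a "job_<cluster-ts>_<seq>" portion anywhere in it.
JOB_ID_RE = re.compile(r"(job_\d+_\d+)")

def extract_job_id(filename):
    """Return the job ID embedded in a history file name, or None."""
    match = JOB_ID_RE.search(filename)
    return match.group(1) if match else None

# A name starting with "localhost" (as in this thread) still matches:
print(extract_job_id("localhost_1363694800974_job_201303191306_0008_conf.xml"))
# -> job_201303191306_0008

# A name without a job ID yields None instead of an "uninitialized value" error:
print(extract_job_id("some_unrelated_file.txt"))
# -> None
```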
That did the trick! Now I've got 98 files in HDFS (49 .xml and 49 .log).
But the log output is a bit misleading because it says that the script is uploading .pig and .jar files.
...
Uploading /home/hadoop/bin/hadoop/logs/history/localhost_1363694800974_job_201303191306_0008_hadoop_PigLatin%3Ahdfsdu.pig
-> hdfs://localhost:9000/user/hadoop/history/test/daily/default/2013/0319/job_201303191306_0008.log
command: /home/hadoop/bin/hadoop/bin/hadoop dfs -put /home/hadoop/bin/hadoop/logs/history/localhost_1363694800974_job_201303191306_0008_hadoop_PigLatin%3Ahdfsdu.pig hdfs://localhost:9000/user/hadoop/history/test/daily/default/2013/0319/job_201303191306_0008.log
Uploading /home/hadoop/bin/hadoop/logs/history/localhost_1363694800974_job_201303191306_0008_conf.xml
-> hdfs://localhost:9000/user/hadoop/history/test/daily/default/2013/0319/job_201303191306_0008_conf.xml
command: /home/hadoop/bin/hadoop/bin/hadoop dfs -put /home/hadoop/bin/hadoop/logs/history/localhost_1363694800974_job_201303191306_0008_conf.xml hdfs://localhost:9000/user/hadoop/history/test/daily/default/2013/0319/job_201303191306_0008_conf.xml
Uploading /home/hadoop/bin/hadoop/logs/history/localhost_1363721823970_job_201303192037_0003_hadoop_Job6321920289122417517.jar
-> hdfs://localhost:9000/user/hadoop/history/test/daily/default/2013/0319/job_201303192037_0003.log
command: /home/hadoop/bin/hadoop/bin/hadoop dfs -put /home/hadoop/bin/hadoop/logs/history/localhost_1363721823970_job_201303192037_0003_hadoop_Job6321920289122417517.jar hdfs://localhost:9000/user/hadoop/history/test/daily/default/2013/0319/job_201303192037_0003.log
Uploading /home/hadoop/bin/hadoop/logs/history/localhost_1363721823970_job_201303192037_0003_conf.xml
-> hdfs://localhost:9000/user/hadoop/history/test/daily/default/2013/0319/job_201303192037_0003_conf.xml
command: /home/hadoop/bin/hadoop/bin/hadoop dfs -put /home/hadoop/bin/hadoop/logs/history/localhost_1363721823970_job_201303192037_0003_conf.xml hdfs://localhost:9000/user/hadoop/history/test/daily/default/2013/0319/job_201303192037_0003_conf.xml
...
But there are (as expected) no pigs and jars stored in HDFS:
...
-rw-r--r-- 1 hadoop supergroup 7208 2013-03-24 01:54 /user/hadoop/history/test/daily/default/2013/0319/job_201303191306_0008.log
-rw-r--r-- 1 hadoop supergroup 114619 2013-03-24 01:54 /user/hadoop/history/test/daily/default/2013/0319/job_201303191306_0008_conf.xml
...
-rw-r--r-- 1 hadoop supergroup 4619 2013-03-24 01:54 /user/hadoop/history/test/daily/default/2013/0319/job_201303192037_0003.log
-rw-r--r-- 1 hadoop supergroup 176327 2013-03-24 01:54 /user/hadoop/history/test/daily/default/2013/0319/job_201303192037_0003_conf.xml
...
I'll try to make this clearer. The script renames the log files using a more consistent naming convention. The .jar and .pig files are actually log files, so they should end in .log ;)
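The renaming convention is visible in the upload log above: anything that isn't a `_conf.xml` (including files named `.pig` or `.jar`, which are really job logs) becomes `job_<id>.log`, and the target directory encodes the date from the job ID. A rough Python sketch of that mapping, inferred from the paths in this thread (the HDFS prefix is copied from the output here, not from the script, which reads its configuration from cfg.pm):

```python
import re

# Prefix as seen in the upload log in this thread; an assumption, since the
# real script builds it from its configuration.
HDFS_PREFIX = "hdfs://localhost:9000/user/hadoop/history/test/daily/default"

def hdfs_destination(filename):
    """Map a local history file name to the HDFS path seen in the log above.

    The first digits of the job ID encode YYYYMMDDhhmm, which is where the
    2013/0319 date directory comes from.
    """
    m = re.search(r"(job_(\d{4})(\d{4})\d+_\d+)", filename)
    if m is None:
        return None
    job_id, year, mmdd = m.group(1), m.group(2), m.group(3)
    # _conf.xml files keep their suffix; everything else is a job log.
    suffix = "_conf.xml" if filename.endswith("_conf.xml") else ".log"
    return f"{HDFS_PREFIX}/{year}/{mmdd}/{job_id}{suffix}"
```

For example, the `...PigLatin%3Ahdfsdu.pig` file from the log maps to `.../2013/0319/job_201303191306_0008.log`, matching the output above.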
Ah cool, I didn't know that. Learning something new every day...
So this is almost fixed (except the confusion with the names of the log files).
Hi all,
I have made all the changes in the files for my cluster instances, but I can't figure out the steps for working with White Elephant. Can you please list the steps to execute it on Hadoop?
When I tried, I got the following error:
[xxx@vp21q39ic-hpao101328 ~]$ ls
__MACOSX  white-elephant-master  white-elephant-master.zip
[xxx@clusterinstance ~]$ cd white-elephant-master
[xxx@clusterinstance white-elephant-master]$ cd hadoop/scripts/
[xxx@clusterinstance scripts]$ ls
README.md  cfg.pm  statsupload.pl
[xxx@clusterinstance scripts]$ ./statsupload.pl cfg.pm
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
    LANGUAGE = (unset),
    LC_ALL = (unset),
    LC_CTYPE = "UTF-8",
    LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Can't locate Date/Calc.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at ./statsupload.pl line 11.
BEGIN failed--compilation aborted at ./statsupload.pl line 11.
Please help me with the detailed steps. On http://data.linkedin.com/opensource/white-elephant I see some deployment steps, but when should all of these be done?
Confused :(
Thanks
While uploading the log files, I get a strange error:
My config:
My hadoop logfiles (for testing, 49 in total):
The final output of the script is:
If I check that in HDFS I only see 3 files:
I'm using: