LinkedInAttic / white-elephant

Hadoop log aggregator and dashboard

Not finding log files #3

Closed SwathiMystery closed 11 years ago

SwathiMystery commented 11 years ago

When I run the following command after making changes to cfg.pm:

    ./statsupload.pl --config cfg.pm

I see the following:

    ......
    ....
    Searching /var/log/hadoop/logs/history for logs
    ./statsupload.pl: No such file or directory [/var/log/hadoop/logs/history/done/ec2-XXXXXXXXX.amazonaws.com1363721010255/2013/03/19/000000/job_201303191923_0004_conf.xml]
    ....

However, I observe that the log files are of this format:

    $ cd /var/log/hadoop/logs/history/done/ec2-XXXXXXXXXXXX1363721010255/2013/03/19/000000
    $ ls
    ec2-XXXXXXXXX.amazonaws.com_1363721010255_job_201303191923_0002_conf.xml

Have I missed any configuration? Why is it not searching for ec2-XXXXXXXXX.amazonaws.com_1363721010255_job_201303191923_0002_conf.xml?

Any help is appreciated in this regard.

Thank You.

matthayes commented 11 years ago

I see, so it appears your job conf xml files have a different naming convention than the script expects. I think this is where the script is failing. As a workaround you could just comment out lines 285-297, since the job xml files are not actually used at the moment. Can you let me know if that fixes it for you? I'll fix the script so it finds the conf files correctly. Thanks for your patience :)

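
To illustrate what I mean (this is just a sketch, not the actual code from statsupload.pl): the error above shows the script building a path like job_&lt;id&gt;_conf.xml with no prefix, while your files carry a hostname/timestamp prefix, so an exact-name lookup misses them where a looser glob would not:

    # sketch of the mismatch -- not the actual code in statsupload.pl
    my $dir    = "/var/log/hadoop/logs/history/done/HOST_TIMESTAMP/2013/03/19/000000";  # HOST_TIMESTAMP is a placeholder
    my $job_id = "job_201303191923_0004";

    # what the script effectively assumed: the conf file is named exactly job_<id>_conf.xml
    my $expected = "$dir/${job_id}_conf.xml";
    print "missing: $expected\n" unless -e $expected;

    # a looser match also picks up <hostname>_<timestamp>_job_<id>_conf.xml
    my ($conf) = glob("$dir/*${job_id}_conf.xml");
    print "found: $conf\n" if defined $conf;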

SwathiMystery commented 11 years ago

Sure. I am trying to test the tool on a 1+6 node cluster in the cloud. I just want to make sure everything works and that my understanding is right before deploying to a larger cluster. I am very interested in this tool and would like to contribute going forward.

I commented out lines 285-297 in statsupload.pl as you suggested, and I still see the same issue. AFAIK, the logs are generated under /var/log/hadoop/logs/history/done/&lt;hostname&gt;_&lt;timestamp&gt;/, broken down further by year/month/day and then 000000/, which is where all the conf.xml files live. But each conf.xml has a hostname/timestamp prefix for the job, as follows:

    <hostname>_<timestamp>_job_201303191923_0042_conf.xml

Also, the check for HDFS files returns 0 even though there are files. I'm using CDH4 MRv1, and right now I have White Elephant deployed on the NameNode. I'm afraid I may have missed some configuration or be doing something wrong. Thanks in advance.

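
To spell out the layout I am describing (the wildcards below stand in for the hostname/timestamp and date directories), something like this should list the conf files on the local disk:

    # illustrative sketch only -- wildcards stand in for <host>_<timestamp>/<year>/<month>/<day>
    my $done  = "/var/log/hadoop/logs/history/done";
    # layout: done/<host>_<timestamp>/<year>/<month>/<day>/000000/<host>_<timestamp>_job_<id>_conf.xml
    my @confs = glob("$done/*/*/*/*/000000/*_conf.xml");
    print scalar(@confs), " conf files found locally\n";
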
matthayes commented 11 years ago

I was able to reproduce your problem. I missed a line that you also need to change as part of the workaround. Comment out the "findqueue" line like this:

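    # workaround: skip reading the queue from the job conf xml and hard-code it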
    # $queue = findqueue( $xml );
    $queue = "default";

The script gets the queue name from the job conf xml, and since it can't find that file, it fails.

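
For a bit more context (a simplified sketch, not the actual findqueue implementation): the idea is to pull the queue name, presumably mapred.job.queue.name under MRv1, out of the job conf xml, which obviously can't work if the file was never found:

    # simplified sketch of what a findqueue-style lookup does -- not the real code
    sub find_queue_sketch {
        my ($conf_path) = @_;               # assuming it receives a path to the job conf xml
        open( my $fh, '<', $conf_path ) or die "cannot open $conf_path: $!";
        my $xml = do { local $/; <$fh> };   # slurp the whole file
        close($fh);
        # look for <name>mapred.job.queue.name</name> ... <value>queue</value>
        if ( $xml =~ m{<name>mapred\.job\.queue\.name</name>\s*<value>([^<]*)</value>}s ) {
            return $1;
        }
        return "default";
    }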

SwathiMystery commented 11 years ago

Awesome! Setting it to default lets the files get uploaded. However, when checking HDFS data it says "Found 0 existing files in HDFS". Is there anything I haven't configured?

matthayes commented 11 years ago

I updated the script to be more flexible with the job conf xml names. Can you try it again? Under your scenario it should now upload the job conf xml.

It also now logs the command used to list files in HDFS. You can use this to double-check why it isn't finding anything after the first run.

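
If you want to poke at it by hand, the check boils down to listing the destination directory with hadoop fs -ls and counting the entries (the path below is made up; use whatever your cfg.pm points at):

    # manual HDFS check -- the destination path here is hypothetical
    my $hdfs_dir = "/user/whiteelephant/logs";
    my @listing  = `hadoop fs -ls $hdfs_dir 2>/dev/null`;
    # hadoop fs -ls prints a "Found N items" header followed by one line per entry
    my @files = grep { !/^Found \d+ items/ } @listing;
    printf "Found %d existing files in HDFS\n", scalar @files;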

matthayes commented 11 years ago

Also make sure you've updated "days" in cfg.pm so that it reaches back at least as far as the log files being uploaded, or the script won't search those days in HDFS.

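
For example (illustrative only; the exact cfg.pm syntax may differ, only the "days" setting itself comes from the real config): if the logs you're uploading are from March 19 and it's now several weeks later, days has to reach back that far:

    # illustrative only -- "days" is the setting discussed above,
    # the surrounding syntax is an assumption rather than a copy of the real cfg.pm
    days => 60,   # look back far enough in HDFS to cover the dates of the logs being uploaded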