SWIMProjectUCB / SWIM

Statistical Workload Injector for MapReduce - Project at UC Berkeley AMP Lab
https://github.com/SWIMProjectUCB/SWIM/wiki

parse-hadoop-jobhistory.pl doesn't work #11

Open zhangbbo opened 10 years ago

zhangbbo commented 10 years ago

Hello everyone,

I used SWIM to test my MR cluster. But when the run finished and I tried to use "parse-hadoop-jobhistory.pl" to analyse the job logs, I found that it doesn't work.

I tried it on CentOS 6 and Ubuntu 12.04, and it fails on both. I followed the guide "Step 1. Parse historical Hadoop logs".

I hope you can check it again.

Thanks

ZHANG Bo

yanpeichen commented 10 years ago

Hi Zhang Bo,

Glad to help. What command are you running and what error are you seeing? The script was originally written for MR1 job history logs.

Cheers, Yanpei.

zhangbbo commented 10 years ago

Hello,

I ran "perl parse-hadoop-jobhistory.pl workLog > test.txt". All the logs created by the script "run-jobs-all.ssh" are in the directory "workLog", and "parse-hadoop-jobhistory.pl" sits in the same parent directory as "workLog".

When I run the above command, the terminal shows nothing, and "test.txt" is empty too. The MR framework I use is MR2 (YARN). But I don't think the problem comes from the MR version, because the script only processes the log files.

In fact, I looked into the script a little. It correctly finds each log file in "workLog", but the variable $jobs{$job_id}{"status"} is always empty. I think it may not be extracting the information from the logs correctly. But I don't know Perl very well, so I gave up.
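For readers who see the same empty output, here is a minimal Python sketch for checking whether a log line carries a job status in the MR1 text layout. It assumes MR1 history lines of the form RECORD_TYPE KEY="value" ... terminated by " ." and uses the Hadoop 1.x key names JOBID and JOB_STATUS. The helper names parse_mr1_line and job_statuses and the sample lines are hypothetical, and this is not a port of what the Perl script does.

```python
import re

# Assumed MR1 record shape: RECORD_TYPE KEY="value" KEY="value" ... .
PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_mr1_line(line):
    """Return (record_type, {key: value}) for one MR1-style history line."""
    record_type, _, rest = line.strip().partition(" ")
    return record_type, dict(PAIR.findall(rest))


def job_statuses(lines):
    """Collect the last JOB_STATUS seen for each JOBID in 'Job' records."""
    statuses = {}
    for line in lines:
        record_type, fields = parse_mr1_line(line)
        if record_type == "Job" and "JOBID" in fields and "JOB_STATUS" in fields:
            statuses[fields["JOBID"]] = fields["JOB_STATUS"]
    return statuses


# Made-up sample lines: three in the assumed MR1 layout, one JSON-style line.
sample = [
    'Meta VERSION="1" .',
    'Job JOBID="job_201408081200_0001" JOBNAME="demo" SUBMIT_TIME="1407495600000" .',
    'Job JOBID="job_201408081200_0001" FINISH_TIME="1407495660000" JOB_STATUS="SUCCESS" .',
    '{"type":"JOB_FINISHED","event":{}}',
]
print(job_statuses(sample))
```

If a function like this returns an empty dictionary for every file in "workLog", the logs are probably not in the MR1 text layout, which would be consistent with the status field staying empty.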

I hope this will help you.

Cheers

ZHANG Bo

----- Original message -----

From: "yanpeichen" notifications@github.com
To: "SWIMProjectUCB/SWIM" SWIM@noreply.github.com
Cc: "ZHANG Bo" bo.zhang@inria.fr
Sent: Friday 8 August 2014 21:03:01
Subject: Re: [SWIM] parse-hadoop-jobhistory.pl doesn't work (#11)

Hi Zhang Bo,

Glad to help. What command are you running and what error are you seeing? The script was originally written for MR1 job history logs.

Cheers, Yanpei.


pfxuan commented 10 years ago

Hi Bo,

parse-hadoop-jobhistory.pl should work fine with Hadoop v1.x. Hadoop v1.x and Hadoop v2.x/YARN use different log structures, and MRv1 and MRv2 also differ in execution behaviour and log format. I suggest you try MRv1 + Hadoop 1.x first, and then MRv1 + Hadoop 2.x without starting it through YARN. Once you get a correct result, you can manually compare the differences among your experiments.

Best, Pengfei
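A quick way to see which log format a directory holds is to look at the first line of each file. The sketch below is a heuristic under two assumptions that should be checked against your own logs: MR1 history files start with a text record such as Meta VERSION="1" or Job JOBID="...", and MR2 .jhist files start with an Avro-Json header line. The function names guess_history_format and classify_dir are hypothetical.

```python
from pathlib import Path


def guess_history_format(first_line):
    """Guess the job history format from a file's first line (heuristic)."""
    line = first_line.strip()
    if line.startswith("Avro-Json"):
        # Assumed header of MR2/YARN .jhist files.
        return "mr2-jhist"
    if line.startswith("Meta VERSION=") or line.startswith("Job JOBID="):
        # Assumed opening records of MR1 text history files.
        return "mr1-text"
    return "unknown"


def classify_dir(log_dir):
    """Map each regular file under log_dir to its guessed format."""
    result = {}
    for path in sorted(Path(log_dir).rglob("*")):
        if path.is_file():
            with path.open(errors="replace") as fh:
                result[path.name] = guess_history_format(fh.readline())
    return result
```

Calling classify_dir("workLog") on the directory from the report above would show whether any file looks like MR1 text. Files reported as "mr2-jhist" or "unknown" are the ones the MR1-oriented script is unlikely to parse.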

ghost commented 7 years ago

Hello YanPei,

Do you have results like FacebookTrace.tsv from running parse-hadoop-jobhistory.pl against the Facebook logs?

Looking forward to receiving your reply.

Thanks very much, Yudi