Open omicron8 opened 5 years ago
I commented out the Oozie scheduler in app-conf/SchedulerConf.xml and the error has disappeared. But data is not saved to the DB, so there is still nothing in the web UI. Any ideas?
@omicron8 sorry for the late follow-up, can you let me know if you are still facing this issue?
Yes, the problem is still present.
Are you getting any exceptions in dr_elephant.log, dr.log, or logs/application.log? Is any type of application getting analyzed?
Nothing except the error above. We have only MR jobs. Not a single job has been analyzed.
Are you seeing log messages like `Analysis of MAPREDUCE application application_xxxxxxxx took XYZ ms`?
I grepped all the logs and found nothing.
Is `logs/elephant/dr_elephant.log` getting populated (are logs being written)? If yes, can you confirm whether X is non-zero in the `Job queue size is X` log statement? It would be helpful if you could share your log file, masking all private details such as the RM address.
Yes, it's being populated. Here is line with job queue size: 02-03-2020 16:41:08 INFO [Thread-10] com.linkedin.drelephant.ElephantRunner : Job queue size is 150726
This shows that Dr. Elephant is able to fetch finished applications, but they are not getting processed. You can confirm this by checking whether Y in `Second Retry queue size is Y` is increasing. Also look for log lines matching `Drop the analytic job. Reason: reached the max retries for application id = [XYZ]`.
Also what type of jobs are you analyzing?
Yep, I see messages like that:
dr-elephant-2.1.7/logs/elephant/dr_elephant.log.2020-01-04:01-04-2020 04:58:48 ERROR [dr-el-executor-thread-0] com.linkedin.drelephant.ElephantRunner : Drop the analytic job. Reason: reached the max retries for application id = [application_1575530104012_328247].
We analyze mr2 jobs only.
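For anyone triaging the same symptoms, the log checks requested above can be done in one pass. This is only a rough sketch: the regular expressions are built from the log statements quoted in this thread and may not match other Dr. Elephant versions, and the `summarize` helper and its result keys are hypothetical names, not part of Dr. Elephant.

```python
import re
from collections import Counter

# Patterns derived only from log statements quoted in this thread (assumption:
# the wording is the same in your Dr. Elephant version).
PATTERNS = {
    "job_queue_size": re.compile(r"Job queue size is (\d+)"),
    "second_retry_queue_size": re.compile(r"Second Retry queue size is (\d+)"),
    "analysis_done": re.compile(
        r"Analysis of \w+ application (application_\d+_\d+) took \d+ ms"
    ),
    "dropped": re.compile(
        r"Drop the analytic job\. Reason: reached the max retries "
        r"for application id = \[(application_\d+_\d+)\]"
    ),
}


def summarize(lines):
    """Count matching log statements and keep the latest queue sizes."""
    counts = Counter()
    last_sizes = {}   # most recent value seen for each queue-size statement
    dropped = []      # application ids that hit the max-retries drop
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match is None:
                continue
            counts[name] += 1
            if name.endswith("queue_size"):
                last_sizes[name] = int(match.group(1))
            elif name == "dropped":
                dropped.append(match.group(1))
    return {"counts": dict(counts), "last_sizes": last_sizes, "dropped": dropped}
```

To run it against a real file, something like `with open("logs/elephant/dr_elephant.log") as f: print(summarize(f))` should work. A large job queue size together with many dropped applications and no `analysis_done` matches would correspond to the situation described above.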
Can you provide some logs before and after the log you gave above?
hey, any news?
Are you seeing log messages like `Analysis of MAPREDUCE application application_xxxxxxxx took XYZ ms`?
Again: if I comment out the Oozie scheduler in app-conf/SchedulerConf.xml, I can see messages like this:
02-26-2020 15:19:33 INFO [dr-el-executor-thread-2] com.linkedin.drelephant.ElephantRunner : Analyzing MAPREDUCE application_1581418441444_240970
But the database is still empty.
If I uncomment the Oozie scheduler in app-conf/SchedulerConf.xml, I get the following error:
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Cannot add or update a child row: a foreign key constraint fails (drele_prod.yarn_app_heuristic_result, CONSTRAINT yarn_app_heuristic_result_f1 FOREIGN KEY (yarn_app_result_id) REFERENCES yarn_app_result (id))
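As background on what this class of error means: a row is being written to the child table (yarn_app_heuristic_result) whose yarn_app_result_id has no matching row in the parent table (yarn_app_result). The sketch below reproduces the same kind of failure using SQLite from Python's standard library, not MySQL, and with a heavily simplified schema. Only the table and column names come from the error message; the column types and the `demo_fk_violation` helper are assumptions made for illustration, and this does not explain why Dr. Elephant hits the constraint.

```python
import sqlite3


def demo_fk_violation():
    """Show that a child row is rejected until its parent row exists."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    # Simplified, hypothetical schema: names from the error, types assumed.
    conn.execute("CREATE TABLE yarn_app_result (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE yarn_app_heuristic_result ("
        " id INTEGER PRIMARY KEY,"
        " yarn_app_result_id TEXT NOT NULL REFERENCES yarn_app_result (id))"
    )
    insert_child = (
        "INSERT INTO yarn_app_heuristic_result (yarn_app_result_id) VALUES (?)"
    )
    app_id = "application_0000000000000_000001"  # made-up application id

    try:
        # No parent row yet, so the foreign key check fails.
        conn.execute(insert_child, (app_id,))
        orphan_rejected = False
    except sqlite3.IntegrityError:
        orphan_rejected = True

    # Once the parent row exists, the same child insert succeeds.
    conn.execute("INSERT INTO yarn_app_result (id) VALUES (?)", (app_id,))
    conn.execute(insert_child, (app_id,))
    child_rows = conn.execute(
        "SELECT COUNT(*) FROM yarn_app_heuristic_result"
    ).fetchone()[0]
    conn.close()
    return orphan_rejected, child_rows
```

In other words, the exception suggests that the parent yarn_app_result row was never saved (or its save failed) before the heuristic results were written, so the earlier part of the stack trace is the place to look.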
Keep the Oozie scheduler in app-conf/SchedulerConf.xml commented out, and then attach the respective logs here. If your applications are still not getting processed, there must be some error, and we need to find it in the logs.
Here is the log for the last hour: dr_elephant.log
@omicron8 I can't see any errors in the log provided, but it seems like you have made a few changes to the code. Can you provide the diff of the changes you made?
I have a lot of errors in /logs/elephant/dr_elephant.log, and therefore I have no processed jobs in the web UI.