Get workers from yarn-site.xml page failed. Timedout

ilham9649 commented 7 years ago

I'm having an issue while running the workload. Every time I run the workload, this error pops up: AssertionError: Get workers from yarn-site.xml page failed, reason:( /opt/hadoop/bin/yarn node -list 2> /dev/null | grep RUNNING ) executed timedout for 5 seconds

But I can run the command "yarn node -list 2> /dev/null | grep RUNNING" by hand, and the result is " master:40482 RUNNING master:8042 ".

So I think the problem is that the 5-second timeout is too short for my computer to finish that command. How can I change the timeout? Which file contains the timeout setting? Thanks.
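
One way to check that guess is to time the command directly and compare it with the 5-second limit from the error message. The following is only a rough Python 3 sketch: time_command is a hypothetical helper written for this illustration, not part of HiBench, and the yarn path in the comment is copied from the error above.

```python
import subprocess
import time


def time_command(cmd):
    """Run a shell command and return (elapsed_seconds, stdout_text)."""
    start = time.monotonic()
    result = subprocess.run(
        cmd,
        shell=True,                  # the command uses a pipe, so run it via the shell
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    elapsed = time.monotonic() - start
    return elapsed, result.stdout


# Example call (assumes yarn is installed at the path shown in the error):
# elapsed, out = time_command(
#     "/opt/hadoop/bin/yarn node -list 2> /dev/null | grep RUNNING")
# print("took %.1f seconds, output: %s" % (elapsed, out.strip()))
```

If the measured time is close to or above 5 seconds, the timeout is the likely cause.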

ilham9649 commented 7 years ago

OK, I have a solution to this problem. Just modify load_config.py: search for timeout=5 and replace the number. Thanks.

lexpierce commented 7 years ago

That's not a valid solution. Too fiddly. Can the timeout be extended to 10? Or can that parameter be made a tunable, since some busy and/or smaller clusters, or clusters with HA, are expected to take a touch longer to return?
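
To illustrate what a tunable could look like, here is a minimal Python 3 sketch that reads the timeout from an environment variable and falls back to the current 5 seconds. This is not HiBench's actual code: the variable name WORKER_LIST_TIMEOUT and the helpers get_timeout and run_with_timeout are made up for this example.

```python
import os
import subprocess

DEFAULT_TIMEOUT = 5  # seconds; the value reported in the error message


def get_timeout(env_var="WORKER_LIST_TIMEOUT", default=DEFAULT_TIMEOUT):
    """Return the timeout in seconds from a (hypothetical) env var, else the default."""
    raw = os.environ.get(env_var)
    if raw is None:
        return default
    try:
        value = float(raw)
    except ValueError:
        return default            # ignore values that are not numbers
    return value if value > 0 else default


def run_with_timeout(cmd, timeout=None):
    """Run a shell command and return its stdout; raise RuntimeError on timeout."""
    if timeout is None:
        timeout = get_timeout()
    try:
        result = subprocess.run(
            cmd,
            shell=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        raise RuntimeError("(%s) timed out after %s seconds" % (cmd, timeout))
    return result.stdout
```

With something like this, a slow cluster could set WORKER_LIST_TIMEOUT=30 in the environment instead of editing the source file.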

n3rV3 commented 7 years ago

I am also facing this issue on Azure HDInsight clusters. I fear this might be happening on all cloud-hosted Hadoop clusters.

Can we please get this timeout increased to 10 secs?

ilham9649 commented 7 years ago

I changed the timeout to 30 seconds and it works well. Just change the timeout parameter in load_config.py.

Intel-bigdata / HiBench

Get workers from yarn-site.xml page failed. Timedout #412