Yelp / mrjob

Run MapReduce jobs on Hadoop or Amazon Web Services
http://packages.python.org/mrjob/

progress indicators are wrong when steps run simultaneously #2195

Open coyotemarin opened 4 years ago

coyotemarin commented 4 years ago

_parse_progress_from_resource_manager() assumes that there will be at most one job running on a cluster at the same time, which is wrong now that clusters can run steps concurrently.

If we know a step's StartTime from the ListSteps API, it seems to be only a few seconds off from the Start Time shown in the resource manager UI. So we could possibly use start times to match each step to its progress correctly.

It would be really nice if the EMR API would tell us the mapping between EMR step IDs and YARN application IDs, but so far I haven't found one.

coyotemarin commented 4 years ago

Since we now have code to talk to the resource manager API, we can guess the application ID for the step from the apps API (based on start time) and then get its progress from the app API.
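A minimal sketch of that matching, not mrjob's actual code. It assumes `apps` is the list found under `apps` → `app` in the JSON returned by the resource manager's `/ws/v1/cluster/apps` endpoint, where each entry has `id`, `startedTime` (epoch milliseconds) and `progress`, and that `step_start` is the timezone-aware start time reported for the step. The function names and the 60-second tolerance are hypothetical.

```python
from datetime import datetime, timezone


def guess_app_for_step(step_start, apps, max_skew_secs=60.0):
    """Return the YARN app dict whose startedTime is closest to
    *step_start*, or None if nothing started within *max_skew_secs*.

    step_start: timezone-aware datetime for the EMR step's start
    apps: list of app dicts from the resource manager apps API
    max_skew_secs: hypothetical tolerance; tune to what you observe
    """
    step_start_ms = step_start.timestamp() * 1000.0

    best_app = None
    best_skew = None
    for app in apps:
        started_ms = app.get('startedTime')
        if not started_ms:
            continue  # app has not started yet, or field is missing

        skew = abs(started_ms - step_start_ms) / 1000.0
        if skew > max_skew_secs:
            continue

        if best_skew is None or skew < best_skew:
            best_app, best_skew = app, skew

    return best_app


def progress_for_step(step_start, apps, max_skew_secs=60.0):
    """Return the matched app's progress value, or None if no app
    could be matched to the step."""
    app = guess_app_for_step(step_start, apps, max_skew_secs)
    if app is None:
        return None
    return app.get('progress')
```

This is only a guess based on start time: if two steps start within a few seconds of each other, it can still pick the wrong application, so returning None when nothing falls within the tolerance is safer than reporting another step's progress.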