MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

How to determine limits of the "Number of Containers"? #16074

Closed jpchauhan closed 6 years ago

jpchauhan commented 6 years ago

When we say "the cluster has a fixed limit on the number of containers available", how do we determine that limit?


Alberto-Vega commented 6 years ago

@jpchauhan Thanks for the feedback! We are currently investigating and will get back to you shortly.

mamccrea commented 6 years ago

@jpchauhan - It depends on the available resources. YARN uses a global ResourceManager (RM), per-worker-node NodeManagers (NMs), and per-application ApplicationMasters (AMs). The per-application AM negotiates resources (CPU, memory, disk, network) for running your application with the RM. The RM works with NMs to grant these resources, which are granted as containers.
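
One way to see what the RM has granted at a given moment is its REST API, which exposes a cluster metrics resource at `/ws/v1/cluster/metrics`. The following is a minimal sketch, assuming the JSON has already been fetched (how the endpoint is reached and authenticated on a particular cluster is not covered here). The sample values are made up for illustration, and `summarize_cluster_metrics` is a hypothetical helper name.

```python
import json


def summarize_cluster_metrics(payload: str) -> dict:
    """Pull the container-related fields out of a cluster metrics JSON document."""
    metrics = json.loads(payload)["clusterMetrics"]
    return {
        "total_mb": metrics["totalMB"],
        "available_mb": metrics["availableMB"],
        "total_vcores": metrics["totalVirtualCores"],
        "available_vcores": metrics["availableVirtualCores"],
        "containers_allocated": metrics["containersAllocated"],
        "containers_pending": metrics["containersPending"],
    }


# Illustrative values only; a real response carries more fields than these.
SAMPLE_PAYLOAD = json.dumps({
    "clusterMetrics": {
        "totalMB": 102400,
        "availableMB": 65536,
        "totalVirtualCores": 32,
        "availableVirtualCores": 20,
        "containersAllocated": 12,
        "containersPending": 0,
    }
})

summary = summarize_cluster_metrics(SAMPLE_PAYLOAD)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Comparing `available_mb` and `available_vcores` against the totals gives a rough view of how much room is left for further containers.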

JasonWHowell commented 6 years ago

@jpchauhan One other point: when you create an HDInsight cluster, the virtual machine size is one of the choices you make in the Azure portal. The amount of memory and the number of CPUs available in the cluster depend on which VM size you pick and how many worker nodes you pick. The price varies with the VM size and the number of VMs. More details: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-capacity-planning

For example, in East US I see these choices: [image: VM size options shown in the Azure portal]

When the YARN services accept incoming work, they monitor the resources available on the worker nodes and assign containers there to host the work, dividing the cluster resources among those containers.
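
As a rough illustration of how that division produces a container limit, here is a back-of-the-envelope sketch. It assumes each worker node offers YARN a fixed amount of memory and vcores (the `yarn.nodemanager.resource.memory-mb` and `yarn.nodemanager.resource.cpu-vcores` settings) and that memory requests are rounded up to a multiple of `yarn.scheduler.minimum-allocation-mb`. The function names and all numbers are hypothetical, and real schedulers apply further rules (queues, reservations), so treat the result as an estimate rather than the exact figure the cluster reports.

```python
def containers_per_node(node_memory_mb, node_vcores, container_memory_mb,
                        container_vcores=1, min_allocation_mb=1024,
                        count_vcores=False):
    """Estimate how many equally sized containers fit on one worker node."""
    # Assumption: each memory request is rounded up to a multiple of the
    # scheduler's minimum allocation (ceiling division, then scale back up).
    rounded_mb = -(-container_memory_mb // min_allocation_mb) * min_allocation_mb
    by_memory = node_memory_mb // rounded_mb
    if not count_vcores:
        # Memory-only accounting, as with a scheduler that ignores vcores.
        return by_memory
    # Memory and vcores both limit the count; the scarcer resource wins.
    return min(by_memory, node_vcores // container_vcores)


def cluster_container_limit(worker_nodes, **node_settings):
    """Scale the per-node estimate by the number of worker nodes."""
    return worker_nodes * containers_per_node(**node_settings)


# Hypothetical cluster: 4 worker nodes, each offering 25600 MB and 8 vcores.
memory_only = cluster_container_limit(
    4, node_memory_mb=25600, node_vcores=8, container_memory_mb=1536)
memory_and_cpu = cluster_container_limit(
    4, node_memory_mb=25600, node_vcores=8, container_memory_mb=1536,
    count_vcores=True)

print("memory-only estimate:", memory_only)
print("memory and vcores estimate:", memory_and_cpu)
```

Keep in mind that each running application's AM occupies one of these containers, so the number left for actual tasks is lower than the estimate.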

Depending on the workload, there are a number of settings you can use to fine-tune performance; these influence container sizes and the number of concurrent jobs that can run. See https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-changing-configs-via-ambari

That's a lot to digest, but start by testing with a small cluster and see whether performance is suitable before fine-tuning and increasing the cluster size in additional tests.

Thanks! Jason

please-close