pmodels / mpich

Official MPICH Repository
http://www.mpich.org
Other
535 stars 280 forks source link

Enable SMP aware collectives via PAMI local task detection #1842

Closed mpichbot closed 7 years ago

mpichbot commented 7 years ago

Originally by blocksom on 2013-05-08 09:26:06 -0500


This ticket is to discuss a commit that touches files in the src/mpi/coll directory - in addition to several pamid changes.

mpichbot commented 7 years ago

Originally by blocksom on 2013-05-08 09:26:53 -0500


I pushed the commit to the mpich-ibm/ticket-1842 branch.

mpichbot commented 7 years ago

Originally by balaji on 2013-05-08 10:05:16 -0500


Why are you trying to restrict the maximum message size for the SMP collectives? Can you give some background on this?

mpichbot commented 7 years ago

Originally by archerc on 2013-05-08 11:07:47 -0500


At large message sizes on IB networks bandwidth of the SMP collectives is quite bad, sometimes 2x worse than non-smp. The cutoffs were put in place to obtain the best latency and bandwidth. This cutoff is tunable by environment variable and controlled by params.yml, so the defaults can be adjusted to maximum count if no cutoffs are desired for your underlying network.

mpichbot commented 7 years ago

Originally by balaji on 2013-05-09 21:56:47 -0500


I don't want to change the default value directly that way. But we do want to allow devices/channels customize the values as needed (that's one of the primary goals of MPICH). In fact, we should do this for all MPI-level parameters consistently, not just the subset provided in your patch.

I'd like to propose the following --

  1. Split out the params file into multiple segments, and each device/channel/netmod can define its own subset of parameters. With this change there will be a params.yml in src/mpi/param. There'll be additional params.yml files in src/mpid/ch3/param and src/mpid/pamid/param. Within ch3, we might have an additional params.yml file in src/mpid/ch3/channels/nemesis/param, etc. All these will be listed in autogen.sh and will be combined at autogen.sh time.
  2. In the "default" field, we can allow a new type of the form "device|". This means that autogen.sh will generate code of the form:
if (environment "MPIR_PARAM_FOOBAR" exists)
   MPIR_PARAM_FOOBAR = atoi(ENVIRONMENT VARIABLE);
else {
#if defined MPID_PARAM_FOOBAR
   MPIR_PARAM_FOOBAR = MPID_PARAM_FOOBAR;
#else
   MPIR_PARAM_FOOBAR = <value>;
#endif /* MPID_PARAM_FOOBAR */
}

This will allow the device to override the default value as needed.

  1. We'll use an INFINITE size default value for the SMP collectives in src/mpi/param/params.yml.
  2. pamid can add MPID_PARAM_MAX_SMP_BCAST_SIZE, etc. into your device mpidpre.h. This will override the default value used for your device.

Does this approach make sense?

I'm on travel this week, but if we agree that this approach will work for everyone, I can try to hack this up this weekend.

mpichbot commented 7 years ago

Originally by blocksom on 2013-05-15 12:52:32 -0500


I rebased branch mpich-ibm/ticket-1842 to mpich/master.

mpichbot commented 7 years ago

Originally by balaji on 2013-05-15 18:48:09 -0500


I didn't implement the first item above, since it didn't seem necessary after appropriate naming and segregation of different categories of parameters. The remaining patches have been pushed into mpich-ibm/ticket-1842-alt. Please review and let me know, so I can push it into mpich/master.

mpichbot commented 7 years ago

Originally by archerc on 2013-05-17 12:50:11 -0500


I have reviewed and agree with this change.

mpichbot commented 7 years ago

Originally by balaji on 2013-05-17 16:20:52 -0500


These changes have been pushed into mpich/master.

Relevant commits: [cf31e37a], [cbcde2f4], [e279b1ae], [60a89bbd], [4456bbec], [430514fd], [c6084635], [fa950cb5].

mpichbot commented 7 years ago

Originally by archerc on 2013-05-20 10:50:59 -0500


In build testing of this patch, I discovered a chunk of glue code to the pami extension was not forward ported. Can we get this patched in for pamid?

Thanks

From 5e77b35d298b5136d93b2f4785c4a73379dfb6cc Mon Sep 17 00:00:00 2001
From: Charles Archer <archerc@us.ibm.com>
Date: Mon, 20 May 2013 06:56:58 -0400
Subject: [PATCH] PAMID device code for local task detection

---
 src/mpid/pamid/src/mpidi_util.c | 51 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/src/mpid/pamid/src/mpidi_util.c b/src/mpid/pamid/src/mpidi_util.c
index ee8eda9..2bd89db 100644
--- a/src/mpid/pamid/src/mpidi_util.c
+++ b/src/mpid/pamid/src/mpidi_util.c
@@ -1908,3 +1908,54 @@ void MPIDI_collsel_pami_tune_cleanup()
 }

+#undef FUNCNAME
+#define FUNCNAME MPID_Get_node_id
+#undef FCNAME
+#define FCNAME MPIDI_QUOTE(FUNCNAME)
+static int _g_max_node_id = -1;
+int MPID_Get_node_id(MPID_Comm *comm, int rank, MPID_Node_id_t *id_p)
+{
+  int mpi_errno = MPI_SUCCESS;
+  uint32_t node_id;
+  uint32_t offset;
+  uint32_t max_nodes;
+  if(!PAMIX_Extensions.is_local_task.node_info)
+    MPIU_ERR_SETANDJUMP(mpi_errno, MPI_ERR_OTHER, "**notimpl");
+
+  pami_result_t rc = PAMIX_Extensions.is_local_task.node_info(comm->vcr[rank]->taskid,
+                                                              &node_id,&offset,&max_nodes);
+  if(rc != PAMI_SUCCESS)  MPIU_ERR_SETANDJUMP(mpi_errno, MPI_ERR_OTHER, "**notimpl");
+  *id_p = node_id;
+
+  fn_fail:
+  return mpi_errno;
+}
+
+#undef FUNCNAME
+#define FUNCNAME MPID_Get_max_node_id
+#undef FCNAME
+#define FCNAME MPIDI_QUOTE(FUNCNAME)
+int MPID_Get_max_node_id(MPID_Comm *comm, MPID_Node_id_t *max_id_p)
+{
+  int mpi_errno = MPI_SUCCESS;
+  uint32_t node_id;
+  uint32_t offset;
+  uint32_t max_nodes;
+  if(_g_max_node_id == -1)
+  {
+    if(!PAMIX_Extensions.is_local_task.node_info)
+      MPIU_ERR_SETANDJUMP(mpi_errno, MPI_ERR_OTHER, "**notimpl");
+
+    pami_result_t rc = PAMIX_Extensions.is_local_task.node_info(comm->vcr[0]->taskid,
+                                                                &node_id,&offset,&max_nodes);
+    if(rc != PAMI_SUCCESS)  MPIU_ERR_SETANDJUMP(mpi_errno, MPI_ERR_OTHER, "**notimpl");
+    *max_id_p = max_nodes;
+    _g_max_node_id = max_nodes;
+  }
+  else
+    *max_id_p = _g_max_node_id;
+
+  fn_fail:
+  return mpi_errno;
+}
+#undef FUNCNAME
-- 
1.8.1-rc1
mpichbot commented 7 years ago

Originally by balaji on 2013-05-20 21:59:54 -0500


Pushed to mpich/master in [c3a8a817].

mpichbot commented 7 years ago

Originally by blocksom on 2013-05-21 09:04:35 -0500


I pushed a couple bgq compile fixes to mpich-ibm/ticket-1842-alt2. Without these commits the bgq compile of mpich/master fails.

mpichbot commented 7 years ago

Originally by blocksom on 2013-05-21 09:28:36 -0500


I deleted the branches mpich-ibm/ticket-1842 and mpich-ibm/ticket-1842-alt.

mpichbot commented 7 years ago

Originally by balaji on 2013-05-21 09:48:38 -0500


Pushed to mpich/master.

Relevant commits: [438a49fe], [f44e3d4a], [038aa7cd].

mpichbot commented 7 years ago

Originally by blocksom on 2013-05-21 09:58:20 -0500


Deleted branch mpich-ibm/ticket-1842-alt2.