tmarrinan opened this issue 4 weeks ago
OK - after a bit more reading and testing - I think I have it working!
There were 2 key things I needed to change (one with the mesh and one with the partition options):
Final solution:
// (snippet assumes conduit.hpp, conduit_blueprint_mpi_mesh.hpp, and mpi.h are
// included, MPI has been initialized, and that process_id, local_width,
// local_height, origin_x, origin_y, num_processes, and values are defined)

// Describe this rank's local block as a uniform-grid domain; dims are vertex
// counts, hence the +1, while the field is element-associated.
conduit::Node mesh;
mesh["state/domain_id"] = process_id;
mesh["coordsets/coords/type"] = "uniform";
mesh["coordsets/coords/dims/i"] = local_width + 1;
mesh["coordsets/coords/dims/j"] = local_height + 1;
mesh["coordsets/coords/origin/x"] = origin_x;
mesh["coordsets/coords/origin/y"] = origin_y;
mesh["coordsets/coords/spacing/dx"] = 1;
mesh["coordsets/coords/spacing/dy"] = 1;
mesh["topologies/topo/type"] = "uniform";
mesh["topologies/topo/coordset"] = "coords";
mesh["fields/scalar1/association"] = "element";
mesh["fields/scalar1/topology"] = "topo";
mesh["fields/scalar1/values"].set(values, 16);

// One logical selection per domain (every rank lists all of them), so the
// partitioner knows which region each domain_id contributes.
conduit::Node options, selections, output;
for (int i = 0; i < num_processes; i++)
{
    conduit::Node &selection = selections.append();
    selection["type"] = "logical";
    selection["domain_id"] = i;
    selection["start"] = {0u, 0u, 0u};
    selection["end"] = {local_width, local_height, 1u};
}

// Combine everything into a single target domain.
options["target"] = 1;
options["fields"] = {"scalar1"};
options["selections"] = selections;
options["mapping"] = 0;
conduit::blueprint::mpi::mesh::partition(mesh, options, output, MPI_COMM_WORLD);
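To check the result, I print the output node on the rank that ends up with the combined domain (a quick sketch only; it assumes the single target domain lands on rank 0 and that output is left empty on the other ranks):

int rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (rank == 0 && output.number_of_children() > 0)
{
    // dump the repartitioned mesh to the console
    output.print();
}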
This resulted in the following output (each process filled its local data with floating point values equal to its process id):
state:
domain_id: 0
coordsets:
coords:
type: "uniform"
origin:
x: 0.0
y: 0.0
dims:
i: 17
j: 13
topologies:
topo:
type: "uniform"
coordset: "coords"
fields:
scalar1:
topology: "topo"
association: "element"
values: [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0, 4.0, 4.0, 4.0, 4.0, 5.0, 5.0, 5.0, 5.0, 6.0, 6.0, 6.0, 6.0, 7.0, 7.0, 7.0, 7.0, 4.0, 4.0, 4.0, 4.0, 5.0, 5.0, 5.0, 5.0, 6.0, 6.0, 6.0, 6.0, 7.0, 7.0, 7.0, 7.0, 4.0, 4.0, 4.0, 4.0, 5.0, 5.0, 5.0, 5.0, 6.0, 6.0, 6.0, 6.0, 7.0, 7.0, 7.0, 7.0, 4.0, 4.0, 4.0, 4.0, 5.0, 5.0, 5.0, 5.0, 6.0, 6.0, 6.0, 6.0, 7.0, 7.0, 7.0, 7.0, 8.0, 8.0, 8.0, 8.0, 9.0, 9.0, 9.0, 9.0, 10.0, 10.0, 10.0, 10.0, 11.0, 11.0, 11.0, 11.0, 8.0, 8.0, 8.0, 8.0, 9.0, 9.0, 9.0, 9.0, 10.0, 10.0, 10.0, 10.0, 11.0, 11.0, 11.0, 11.0, 8.0, 8.0, 8.0, 8.0, 9.0, 9.0, 9.0, 9.0, 10.0, 10.0, 10.0, 10.0, 11.0, 11.0, 11.0, 11.0, 8.0, 8.0, 8.0, 8.0, 9.0, 9.0, 9.0, 9.0, 10.0, 10.0, 10.0, 10.0, 11.0, 11.0, 11.0, 11.0]
original_element_ids:
topology: "topo"
association: "element"
values:
domains: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
ids: [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 4, 5, 6, 7, 4, 5, 6, 7, 4, 5, 6, 7, 4, 5, 6, 7, 8, 9, 10, 11, 8, 9, 10, 11, 8, 9, 10, 11, 8, 9, 10, 11, 12, 13, 14, 15, 12, 13, 14, 15, 12, 13, 14, 15, 12, 13, 14, 15, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 4, 5, 6, 7, 4, 5, 6, 7, 4, 5, 6, 7, 4, 5, 6, 7, 8, 9, 10, 11, 8, 9, 10, 11, 8, 9, 10, 11, 8, 9, 10, 11, 12, 13, 14, 15, 12, 13, 14, 15, 12, 13, 14, 15, 12, 13, 14, 15, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 4, 5, 6, 7, 4, 5, 6, 7, 4, 5, 6, 7, 4, 5, 6, 7, 8, 9, 10, 11, 8, 9, 10, 11, 8, 9, 10, 11, 8, 9, 10, 11, 12, 13, 14, 15, 12, 13, 14, 15, 12, 13, 14, 15, 12, 13, 14, 15]
original_vertex_ids:
topology: "topo"
association: "vertex"
values:
domains: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
ids: [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 4, 5, 6, 7, 8, 5, 6, 7, 8, 5, 6, 7, 8, 5, 6, 7, 8, 9, 10, 11, 12, 13, 10, 11, 12, 13, 10, 11, 12, 13, 10, 11, 12, 13, 14, 15, 16, 17, 18, 15, 16, 17, 18, 15, 16, 17, 18, 15, 16, 17, 18, 19, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 4, 5, 6, 7, 8, 5, 6, 7, 8, 5, 6, 7, 8, 5, 6, 7, 8, 9, 10, 11, 12, 13, 10, 11, 12, 13, 10, 11, 12, 13, 10, 11, 12, 13, 14, 15, 16, 17, 18, 15, 16, 17, 18, 15, 16, 17, 18, 15, 16, 17, 18, 19, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, 4, 5, 6, 7, 8, 5, 6, 7, 8, 5, 6, 7, 8, 5, 6, 7, 8, 9, 10, 11, 12, 13, 10, 11, 12, 13, 10, 11, 12, 13, 10, 11, 12, 13, 14, 15, 16, 17, 18, 15, 16, 17, 18, 15, 16, 17, 18, 15, 16, 17, 18, 19, 20, 21, 22, 23, 20, 21, 22, 23, 20, 21, 22, 23, 20, 21, 22, 23, 24]
I'm glad you got this working. Do you have suggestions on how we can improve the documentation?
The tricky part was realizing that the data belonging to each domain_id has to be selected manually with an array of "selections", rather than just specifying the desired region and letting Conduit determine which process owns that data.
There were no examples in the documentation with multiple selections, so it took a bit of trial and error. Having code that matches the M:N redistribution picture (where the target is 10, 4, and 2) might be helpful.
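For example, something along these lines is roughly what I would have expected for the M:N case (just a sketch of my understanding, with placeholder names; it assumes that when no "selections" are given, whole domains are used and the partitioner decides how to split or combine them down to the requested number of target domains):

// repartition the existing domains into 4 target domains, letting the
// partitioner choose the decomposition (sketch / assumption, see above)
conduit::Node mn_options, mn_output;
mn_options["target"] = 4;
conduit::blueprint::mpi::mesh::partition(mesh, mn_options, mn_output, MPI_COMM_WORLD);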
Well, now I'm running into a different issue.
If my data contains "ghost cells" (border cells that hold copies of a neighbor's data), I get the following warning when repartitioning: Unable to combine domains as uniform, using unstructured.
In the example above:
+-----+-----+-----+-----+
| 0 | 1 | 2 | 3 |
| | | | |
+-----+-----+-----+-----+
| 4 | 5 | 6 | 7 |
| | | | |
+-----+-----+-----+-----+
| 8 | 9 | 10 | 11 |
| | | | |
+-----+-----+-----+-----+
I now have each process storing ghost cells for its neighbors, so the actual local data size grows by one cell per neighboring side: with the overall 16x12 grid, the corner domains hold 5x5 cells, the other edge domains 6x5 or 5x6, and the interior domains 6x6.
Accordingly, I updated the "start" and "end" in each selection to account for the desired data sometimes starting one cell to the right or down. I also updated the "origin/{x,y}" of the coordset in the mesh.
Any ideas why the domains can't be combined as a uniform mesh?
Wait, never mind... I just realized that "end" is inclusive. It didn't matter without the ghost cells, since the selection would get cropped to the data size, but with ghost cells present I was grabbing the extra cells to the right / below because I had assumed "end" was exclusive.
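For anyone hitting the same thing, the corrected selection for one interior domain ends up looking roughly like this (a sketch only; the 6x6 local block with a one-cell ghost layer, the (1,1) offset, and the k index of 0 for the 2D mesh are illustrative assumptions, and it reuses the "selections" node from the loop above; the point is that "end" names the last cell included):

// select only the owned 4x4 cell region out of a 6x6 local block that has
// one layer of ghost cells on every side ("end" is inclusive)
conduit::Node &selection = selections.append();
selection["type"] = "logical";
selection["domain_id"] = 5;  // e.g. an interior domain from the diagram above
selection["start"] = {1u, 1u, 0u};
selection["end"] = {4u, 4u, 0u};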
Conduit Blueprint currently has no notion of ghost cells or nodes, but that support will likely be added in the future.
We should enhance the documentation for partitioning and provide more and better examples.
Hello!
I have one more question relating to partitioning. I am now attempting to accomplish the same thing, but using Python instead of C++. I don't see many examples, but when I try output = conduit.blueprint.mpi.mesh.partition(mesh, options, comm), I get an error about conduit.blueprint not having a member named "mpi".
How can I achieve the same thing in Python?
@cyrush can correct me if I'm wrong, but I don't believe MPI is enabled for the Python interface for blueprint. I'm not sure why that's the case. We should add it.
Your read of the situation is correct; we can add that support.
Hello. I have data that is distributed amongst N processes and I want to create a blueprint mesh for it. I thought I was doing it correctly, but when I call the partition function, I am getting unexpected results. I'm not sure if I am creating the mesh wrong or calling the partition function wrong. Any assistance would be appreciated!
Example (12 processes, each owning a 4x4 subregion of an overall 16x12 grid):
Code:
I then want to repartition the mesh to access the whole thing on process 0. So I tried the following:
However, the resulting output mesh still only has size 4x4 and only contains the data from process 0.
As a side note, I am setting "target" to 1 (specifying 1 process), but how do I specify which process (i.e. what if I want it on process 3 instead of process 1)?