Closed: dajuno closed this 4 months ago
I created #28, which fixes the HYMLS tests on multiple cores, in case you're interested in what was still failing on that side. After that PR, nothing fails anymore.
I also created #29 to show how I want to integrate the PETSc tests into the continuation tests. That doesn't have to be done in this PR though.
It looks like things go very wrong when running the tests in parallel (everything becoming NaN). That should be fixed.
I debugged it a bit and it seems like `ghostUpdate` is doing the wrong thing.
I think the LGMap is ignored, and that's what is causing the issue.
where do you see the NaNs? what test are you looking at specifically?
What I tried now was to run the first ldc test on 4 cores. Then I printed the state vector and the indices (real and ghost) and compared. The state just contains increasing numbers, so each index should match its value minus one. They don't, which is what is causing issues. I'm quite sure the reason is that the LGMap is completely ignored, at least for the ghost communication.
The NaN values are from the later test, but I think it's good to focus on `test_ldc` first.
Hm, interesting. Curious that it works with 2 procs.
It's probably because it gets cut in half in such a way that all indices are contiguous. That doesn't happen if you divide it into four quarters.
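To illustrate the point with a toy example (mine, not from the PR): on a 4x4 grid, two horizontal halves own contiguous global index ranges, but four quadrants don't.

```python
import numpy as np

# Toy illustration: global indices of a 4x4 grid, partitioned into
# two horizontal halves vs. four quadrants.
idx = np.arange(16).reshape(4, 4)

halves = [idx[:2, :].ravel(), idx[2:, :].ravel()]
quads = [idx[:2, :2].ravel(), idx[:2, 2:].ravel(),
         idx[2:, :2].ravel(), idx[2:, 2:].ravel()]

print(halves)  # [0..7] and [8..15]: contiguous, matches PETSc's ownership
print(quads)   # e.g. [0, 1, 4, 5]: not contiguous, clashes with PETSc's layout
```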
Hm, not sure how to reproduce.
This seems to work for me:
```python
for iloc, iglob in enumerate(interface.map.indices):
    with state.localForm() as lf:
        print(f"#{interface.comm.rank}", iglob, lf[iloc])
        assert iglob == lf[iloc] - 1
```
Actually, before this, I got an error with the `A.getRow(i)` call in `extract_sorted_row`, which fails on procs that don't own `i`. Had to wrap the function body in `if i in range(*A.getOwnershipRange()): ... else: return [], []`.
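For reference, a minimal sketch of that guard; only the guard itself is from the discussion above, and the assumption here is that `extract_sorted_row` returns the sorted columns and values of one matrix row:

```python
import numpy as np

def extract_sorted_row(A, i):
    # Mat.getRow(i) raises on ranks that don't own row i, so check the
    # ownership range first (the sorting body below is an assumed sketch)
    rstart, rend = A.getOwnershipRange()
    if i not in range(rstart, rend):
        return [], []
    cols, vals = A.getRow(i)
    order = np.argsort(cols)
    return cols[order], vals[order]
```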
If it's the ghosts, then maybe there's a mistake in the slightly awkward index sorting and rearranging for `discretization`. In PETSc, the ghosts are appended to the vector after the real values, which makes this transformation necessary.
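A minimal standalone sketch of that layout (the toy sizes and the choice of ghost entries are mine, not from the PR):

```python
from petsc4py import PETSc

comm = PETSc.COMM_WORLD
rank, size = comm.getRank(), comm.getSize()

n = 4  # owned entries per rank
# each rank ghosts the first entry of the next rank (toy choice)
ghosts = [] if size == 1 else [((rank + 1) % size) * n]
v = PETSc.Vec().createGhost(ghosts, size=(n, None), comm=comm)

owned = list(range(rank * n, (rank + 1) * n))
v.setValues(owned, [float(i) for i in owned])
v.assemblyBegin(); v.assemblyEnd()
v.ghostUpdate(addv=PETSc.InsertMode.INSERT, mode=PETSc.ScatterMode.FORWARD)

with v.localForm() as lf:
    # the local form holds the n owned values first, then the ghosts
    print(f"rank {rank}: owned={lf.array[:n]}, ghosts={lf.array[n:]}")
```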
--
I see `test_ldc` failing on 4 procs when comparing the following entries of the Jacobian (`values_A[j] == values_B[j]`):
```
FAILED tests/test_PETSc.py::test_ldc - AssertionError: rank 1: i = 96, j = 3, 1.6275 == 2.6275
FAILED tests/test_PETSc.py::test_ldc - AssertionError: rank 2: i = 128, j = 0, 1.15875 == 2.15875
FAILED tests/test_PETSc.py::test_ldc - AssertionError: rank 3: i = 224, j = 3, 5.6275 == 6.6275
FAILED tests/test_PETSc.py::test_ldc - AssertionError: rank 0: i = 16, j = 4, -1.28125 == -0.78125
```
Probably not a coincidence that these are all multiples of 16...
Right, but this is because the state vector that gets used to create the Jacobian is wrong in the first place. You can print the state from the `from_array` method using `lf.array` after calling `ghostUpdate`, and you can see that there are duplicate values in there, which should be impossible.
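Something like this inside `from_array`, right after the `ghostUpdate` call (a sketch; `state` and `interface` are the names used earlier in this thread):

```python
import numpy as np

with state.localForm() as lf:
    local = lf.array.copy()
print(f"#{interface.comm.rank}", local)
# the test state is strictly increasing, so any duplicate in the local
# form means the ghost exchange pulled in the wrong entries
assert len(np.unique(local)) == len(local), "duplicate values after ghostUpdate"
```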
I see, you're right. I'll have a look
I think the reason is this: https://petsc.org/main/manual/mat/#matrix-and-vector-layouts-and-storage-locations
> PETSc vectors always have a contiguous range of vector entries stored on each MPI rank. The first rank has entries from 0 to `rend1 - 1`, the next rank has entries from `rend1` to `rend2 - 1`, etc.
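This is easy to see with `getOwnershipRange` (a minimal sketch):

```python
from petsc4py import PETSc

# every rank owns a contiguous block [rstart, rend), regardless of any LGMap
v = PETSc.Vec().createMPI((4, None), comm=PETSc.COMM_WORLD)
rstart, rend = v.getOwnershipRange()
print(f"rank {PETSc.COMM_WORLD.getRank()}: owns [{rstart}, {rend})")
```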
Well, then I guess we have to make our own "map" that maps between domain indices and PETSc indices. Initially, you can just make an array that contains all global indices, obtained by looping over the domains; later this can be optimised so it only contains local and ghost nodes.
PETSc has a tool for this. I'll check it out.
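Presumably that tool is `ISLocalToGlobalMapping` (`PETSc.LGMap` in petsc4py), which came up earlier in this thread. A minimal sketch of how it could tie domain indices to the PETSc layout; the permutation here is a made-up stand-in for the ordering obtained by looping over the domains:

```python
from petsc4py import PETSc

comm = PETSc.COMM_WORLD
rank = comm.getRank()
n = 4
# made-up permuted domain -> PETSc global index map for this rank
global_indices = [rank * n + k for k in (3, 0, 2, 1)]

lgmap = PETSc.LGMap().create(global_indices, comm=comm)
v = PETSc.Vec().createMPI((n, None), comm=comm)
v.setLGMap(lgmap)

# values can now be set with domain-local indices; PETSc translates them
# to its own contiguous layout through the LGMap
v.setValuesLocal(list(range(n)), [10.0, 20.0, 30.0, 40.0])
v.assemblyBegin(); v.assemblyEnd()
```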
A future problem to think about: when using PETSc with complex numbers, all PETSc vectors and matrices will be complex. I think this means that the numpy arrays created by `Discretization` must also be complex.
I imagine we'll just have to handle the conversion in the PETSc interface.
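For example, something as simple as casting at the boundary (a sketch of the idea, not the actual interface code):

```python
import numpy as np
from petsc4py import PETSc

def to_petsc_scalar(arr):
    # PETSc.ScalarType is complex when PETSc is built with complex scalars,
    # so real arrays from Discretization get cast here at the interface
    return np.asarray(arr, dtype=PETSc.ScalarType)
```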
All implemented tests now succeed on multiple cores (tested with 1, 2, 4). The tests include some trivial checks which helped me with the implementation, so I didn't delete them.
Can you rebase on top of current master? I think that will fix the tests.
done, but still not passing
Some were fixed, but not all. Seems like I have to push the same thing to JaDaPy.
btw, it wouldn't be difficult to create an ubuntu docker image with petsc installed, if we want to test petsc in the CI
Yeah, I was gonna do that next, but feel free to give it a go. I didn't do it for Trilinos because it's basically impossible.
I made a first basic version here: https://github.com/dajuno/petsc-docker. All tests pass in the container, and the PETSc tests also pass in parallel.
Parallel implementation of the PETSc interface.
TODO: eigenvalue solver