Closed baagaard-usgs closed 3 years ago
Getting the solution at a point results in NaN values. Valgrind reports invalid reads. The coefficients
array at line 215 in plexproject.c
from DMPlexVecGetClosure()
has the NaN values at some locations in the array. I don't know if this is a PETSc error or a problem setting up the PetscDM and PetscVec objects for output.
The error was the discretization for the domain was used for the boundary. We need to set the dimension of the discretization to the dimension of the boundary. The proper fix is to project from the domain to the desired basis order for the boundary.
make install
cd tests/fullscale/linearelasticity/nofaults-2d
valgrind --trace-children=yes --suppressions=/opt/pylith/src/pylith/share/valgrind-python.supp pylith gravity.cfg gravity_quad.cfg
==8564== Invalid read of size 8
==8564== at 0x87A9733: PetscSpaceEvaluate_Polynomial (spacepoly.c:217)
==8564== by 0x87B149A: PetscSpaceEvaluate (space.c:514)
==8564== by 0x88627DD: PetscFECreateTabulation_Basic (febasic.c:149)
==8564== by 0x8874FF0: PetscFECreateTabulation (fe.c:986)
==8564== by 0x8B62655: DMProjectLocal_Generic_Plex (plexproject.c:659)
==8564== by 0x8B64EFE: DMProjectFieldLocal_Plex (plexproject.c:869)
==8564== by 0x8C23D52: DMProjectFieldLocal (dm.c:8753)
==8564== by 0x8FDE671: DMProjectField (dmproject.c:237)
==8564== by 0xA410C28: pylith::meshio::FieldFilterProject::apply(_p_Vec**, _p_DM*, _p_Vec*) const (FieldFilterProject.cc:110)
==8564== by 0xA45F248: pylith::meshio::OutputSubfield::extract(_p_Vec* const&) (OutputSubfield.cc:124)
==8564== by 0xA477E19: pylith::meshio::OutputPhysics::_writeDataStep(double, int, pylith::topology::Field const&) (OutputPhysics.cc:342)
==8564== by 0xA473BAC: pylith::meshio::OutputPhysics::update(double, int, pylith::topology::Field const&, bool) (OutputPhysics.cc:195)
==8564== Address 0x186dac50 is 0 bytes after a block of size 16 alloc'd
==8564== at 0x483E340: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==8564== by 0x7F93CD0: PetscMallocAlign (mal.c:54)
==8564== by 0x7F94E1A: PetscMallocA (mal.c:423)
==8564== by 0x8B5D52F: PetscDualSpaceGetAllPointsUnion (plexproject.c:445)
==8564== by 0x8B623EE: DMProjectLocal_Generic_Plex (plexproject.c:650)
==8564== by 0x8B64EFE: DMProjectFieldLocal_Plex (plexproject.c:869)
==8564== by 0x8C23D52: DMProjectFieldLocal (dm.c:8753)
==8564== by 0x8FDE671: DMProjectField (dmproject.c:237)
==8564== by 0xA410C28: pylith::meshio::FieldFilterProject::apply(_p_Vec**, _p_DM*, _p_Vec*) const (FieldFilterProject.cc:110)
==8564== by 0xA45F248: pylith::meshio::OutputSubfield::extract(_p_Vec* const&) (OutputSubfield.cc:124)
==8564== by 0xA477E19: pylith::meshio::OutputPhysics::_writeDataStep(double, int, pylith::topology::Field const&) (OutputPhysics.cc:342)
==8564== by 0xA473BAC: pylith::meshio::OutputPhysics::update(double, int, pylith::topology::Field const&, bool) (OutputPhysics.cc:195)
==8564==
==8564== Invalid read of size 8
==8564== at 0x887FFE8: PetscFEEvaluateFieldJets_Internal (fe.c:2051)
==8564== by 0x8B5F6A4: DMProjectPoint_Field_Private (plexproject.c:267)
==8564== by 0x8B5F6A4: DMProjectPoint_Private (plexproject.c:411)
==8564== by 0x8B641CC: DMProjectLocal_Generic_Plex (plexproject.c:794)
==8564== by 0x8B64EFE: DMProjectFieldLocal_Plex (plexproject.c:869)
==8564== by 0x8C23D52: DMProjectFieldLocal (dm.c:8753)
==8564== by 0x8FDE671: DMProjectField (dmproject.c:237)
==8564== by 0xA410C28: pylith::meshio::FieldFilterProject::apply(_p_Vec**, _p_DM*, _p_Vec*) const (FieldFilterProject.cc:110)
==8564== by 0xA45F248: pylith::meshio::OutputSubfield::extract(_p_Vec* const&) (OutputSubfield.cc:124)
==8564== by 0xA477E19: pylith::meshio::OutputPhysics::_writeDataStep(double, int, pylith::topology::Field const&) (OutputPhysics.cc:342)
==8564== by 0xA473BAC: pylith::meshio::OutputPhysics::update(double, int, pylith::topology::Field const&, bool) (OutputPhysics.cc:195)
==8564== by 0xA4BC8EE: pylith::problems::ObserversPhysics::notifyObservers(double, int, pylith::topology::Field const&, bool) (ObserversPhysics.cc:152)
==8564== by 0xA2DA4DB: pylith::feassemble::PhysicsImplementation::notifyObservers(double, int, pylith::topology::Field const&) (PhysicsImplementation.cc:95)
==8564== Address 0x18982270 is 48 bytes inside a block of size 168 alloc'd
==8564== at 0x483E340: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==8564== by 0x7F93CD0: PetscMallocAlign (mal.c:54)
==8564== by 0x8BF3B7A: DMGetWorkArray (dm.c:1484)
==8564== by 0x89B285F: DMPlexGetTransitiveClosure (plex.c:2525)
==8564== by 0x89BF651: DMPlexGetCompressedClosure (plex.c:4792)
==8564== by 0x89C000E: DMPlexVecGetClosure (plex.c:4979)
==8564== by 0x8B5E8F8: DMProjectPoint_Field_Private (plexproject.c:215)
==8564== by 0x8B5E8F8: DMProjectPoint_Private (plexproject.c:411)
==8564== by 0x8B641CC: DMProjectLocal_Generic_Plex (plexproject.c:794)
==8564== by 0x8B64EFE: DMProjectFieldLocal_Plex (plexproject.c:869)
==8564== by 0x8C23D52: DMProjectFieldLocal (dm.c:8753)
==8564== by 0x8FDE671: DMProjectField (dmproject.c:237)
==8564== by 0xA410C28: pylith::meshio::FieldFilterProject::apply(_p_Vec**, _p_DM*, _p_Vec*) const (FieldFilterProject.cc:110)
==8564==
==8564== Invalid read of size 8
==8564== at 0x888001A: PetscFEEvaluateFieldJets_Internal (fe.c:2052)
==8564== by 0x8B5F6A4: DMProjectPoint_Field_Private (plexproject.c:267)
==8564== by 0x8B5F6A4: DMProjectPoint_Private (plexproject.c:411)
==8564== by 0x8B641CC: DMProjectLocal_Generic_Plex (plexproject.c:794)
==8564== by 0x8B64EFE: DMProjectFieldLocal_Plex (plexproject.c:869)
==8564== by 0x8C23D52: DMProjectFieldLocal (dm.c:8753)
==8564== by 0x8FDE671: DMProjectField (dmproject.c:237)
==8564== by 0xA410C28: pylith::meshio::FieldFilterProject::apply(_p_Vec**, _p_DM*, _p_Vec*) const (FieldFilterProject.cc:110)
==8564== by 0xA45F248: pylith::meshio::OutputSubfield::extract(_p_Vec* const&) (OutputSubfield.cc:124)
==8564== by 0xA477E19: pylith::meshio::OutputPhysics::_writeDataStep(double, int, pylith::topology::Field const&) (OutputPhysics.cc:342)
==8564== by 0xA473BAC: pylith::meshio::OutputPhysics::update(double, int, pylith::topology::Field const&, bool) (OutputPhysics.cc:195)
==8564== by 0xA4BC8EE: pylith::problems::ObserversPhysics::notifyObservers(double, int, pylith::topology::Field const&, bool) (ObserversPhysics.cc:152)
==8564== by 0xA2DA4DB: pylith::feassemble::PhysicsImplementation::notifyObservers(double, int, pylith::topology::Field const&) (PhysicsImplementation.cc:95)
==8564== Address 0x18982270 is 48 bytes inside a block of size 168 alloc'd
==8564== at 0x483E340: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==8564== by 0x7F93CD0: PetscMallocAlign (mal.c:54)
==8564== by 0x8BF3B7A: DMGetWorkArray (dm.c:1484)
==8564== by 0x89B285F: DMPlexGetTransitiveClosure (plex.c:2525)
==8564== by 0x89BF651: DMPlexGetCompressedClosure (plex.c:4792)
==8564== by 0x89C000E: DMPlexVecGetClosure (plex.c:4979)
==8564== by 0x8B5E8F8: DMProjectPoint_Field_Private (plexproject.c:215)
==8564== by 0x8B5E8F8: DMProjectPoint_Private (plexproject.c:411)
==8564== by 0x8B641CC: DMProjectLocal_Generic_Plex (plexproject.c:794)
==8564== by 0x8B64EFE: DMProjectFieldLocal_Plex (plexproject.c:869)
==8564== by 0x8C23D52: DMProjectFieldLocal (dm.c:8753)
==8564== by 0x8FDE671: DMProjectField (dmproject.c:237)
==8564== by 0xA410C28: pylith::meshio::FieldFilterProject::apply(_p_Vec**, _p_DM*, _p_Vec*) const (FieldFilterProject.cc:110)
It looks like DMCreateDS()
on the output DM in the case of a fault leads to a double free of the DS for hybrid cells. My guess is that there is a copy of the hybrid DS in the output DM without incrementing the reference counter.
Delete DMCreateDS()
after DMGetOutputDM()
in Field::createOutputVector()
. DS already exists and DMCreateDS()
cannot be called multiple times.
make install -C libsrc
make check -C tests/mmstests/faults
pylith::mmstests::TestFaultKin2D_RigidBlocksStatic_TriP1::testDiscretization [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Corrupt argument: https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: Object already free: Parameter # 1 [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.15.0-487-g55766101d4d GIT Date: 2021-05-13 10:56:00 -0400 [0]PETSC ERROR: .libs/test_faultkin on a arch-clang-3.6.0-py38_debug named igskcicgltgm067 by baagaard Wed May 26 21:17:54 2021 [0]PETSC ERROR: Configure options --PETSC_ARCH=arch-clang-3.6.0-py38_debug --with-debugging=1 --with-clanguage=c --with-mpi-compilers=1 --with-shared-libraries=1 --with-64-bit-points=1 --with-large-file-io=1 --with-lgrind=0 --download-chaco=1 --download-parmetis=1 --download-metis=1 --download-triangle --with-blas-lib=/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Versions/Current/libBLAS.dylib --with-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Versions/Current/libLAPACK.dylib --download-ml=1 --download-superlu=1 --with-fc=0 --with-hdf5=1 --with-hdf5-include=/Users/baagaard/tools/unix/hdf5-1.12.0/clang-3.6.0/include --with-hdf5-lib=/Users/baagaard/tools/unix/hdf5-1.12.0/clang-3.6.0/lib/libhdf5.dylib --with-zlib=1 [0]PETSC ERROR: #1 PetscDSGetHybrid() at /Users/baagaard/tools/unix/petsc-dev/src/dm/dt/interface/dtds.c:732 [0]PETSC ERROR: #2 DMCreateDS() at /Users/baagaard/tools/unix/petsc-dev/src/dm/interface/dm.c:5658 [0]PETSC ERROR: #3 void pylith::topology::Field::createOutputVector()() at /Users/baagaard/src/cig/pylith/libsrc/pylith/topology/Field.cc:601 : error[0]PETSC ERROR: #4 PetscDSDestroy() at /Users/baagaard/tools/unix/petsc-dev/src/dm/dt/interface/dtds.c:540 [0]PETSC ERROR: #5 DMClearDS() at /Users/baagaard/tools/unix/petsc-dev/src/dm/interface/dm.c:5198 [0]PETSC ERROR: #6 DMDestroy() at /Users/baagaard/tools/unix/petsc-dev/src/dm/interface/dm.c:770 [0]PETSC ERROR: #7 DMDestroy() at /Users/baagaard/tools/unix/petsc-dev/src/dm/interface/dm.c:771 [0]PETSC ERROR: #8 void pylith::topology::Field::deallocate()() at /Users/baagaard/src/cig/pylith/libsrc/pylith/topology/Field.cc:149
Projecting the solution from Q2 on the boundary to Q1 on the boundary fails. P2 to P1 works, so we suspect this is specific to projection with Q2 and Q1. Note this involves DMProjectField()
.
PETSc branch: knepley/pylith PyLith branch: baagaard/feature-output-subfield
make install
cd tests/fullscale/linearelasticity/nofaults-2d
nemesis TestGravity.py
test_bcdirichlet_info (__main__.TestQuad) ... ok
test_bcdirichlet_solution (__main__.TestQuad) ... Error in component 1 of field 'displacement' at time step 0.
Total # not okay: 9
Ratio (maskR), # not okay: 8
Expected values (not okay): [-11.03248134 -11.20760009 -10.50712509 -9.63153133 -8.40570007
-6.82963131 -4.90332504 -2.62678127]
Computed values (not okay): [-11.1638204 -10.8135829 -10.1131079 -9.06239539 -7.66144538
-5.91025786 -3.80883284 -1.35717032]
Ratio (not okay): [0.01190476 0.03515625 0.0375 0.05909091 0.08854167 0.13461538
0.22321429 0.48333333]
Ratio Coordinates (not okay): [[-4000. 3000.]
[-4000. 4000.]
[-4000. 2000.]
[-4000. 1000.]
[-4000. 0.]
[-4000. -1000.]
[-4000. -2000.]
[-4000. -3000.]]
Tolerance Absolute Mask: 0.1
Ratio Tolerance: 1e-05
Relative Diff (maskD), # not okay: 1
Expected values (not okay): [0.]
Computed values (not okay): [-11.03248134]
Relative diff (not okay): [1.35483871]
Relative diff Coordinates (not okay): [[-4000. -4000.]]
Diff Tolerance: 0.000010
Scale: 8.1430e+00
FAIL
test_domain_solution (__main__.TestQuad) ... ok
test_material_info (__main__.TestQuad) ... ok
test_material_solution (__main__.TestQuad) ... ok
test_bcdirichlet_info (__main__.TestTri) ... ok
test_bcdirichlet_solution (__main__.TestTri) ... ok
test_domain_solution (__main__.TestTri) ... ok
test_material_info (__main__.TestTri) ... ok
test_material_solution (__main__.TestTri) ... ok
======================================================================
FAIL: test_bcdirichlet_solution (__main__.TestQuad)
----------------------------------------------------------------------
Traceback (most recent call last):
File "TestGravity.py", line 83, in test_bcdirichlet_solution
check_data(filename, self,
File "/Users/baagaard/tools/unix/pylith-develop/clang-3.6.0-py38/lib/python3.8/site-packages/pylith/testing/FullTestApp.py", line 262, in check_data
checker.checkVertexField(field)
File "/Users/baagaard/tools/unix/pylith-develop/clang-3.6.0-py38/lib/python3.8/site-packages/pylith/testing/FullTestApp.py", line 161, in checkVertexField
self._checkField(fieldName, fieldE, field, vertices, mask)
File "/Users/baagaard/tools/unix/pylith-develop/clang-3.6.0-py38/lib/python3.8/site-packages/pylith/testing/FullTestApp.py", line 239, in _checkField
self.testcase.assertEqual(npts, numpy.sum(okay))
AssertionError: 9 != 0
----------------------------------------------------------------------
Ran 10 tests in 8.293s
FAILED (failures=1)
Our current implementation of output (specifically, extracting a subfield from a field with many subfields) results in accumulation of memory usage.
The appropriate solution is to clean up the output implementation.
DMCreateSubDM()
to create aPetscDM
and aPetscIS
for the subfield we want to extract.VecISCopy()
to get a copy of the vector for the subfield. By getting a copy we can dimensionalize it, project it, etc.