JiaweiZhuang closed this issue 5 years ago
Nothing comes to mind for why the memory requirement is so large. Have you tried isolating whether it comes from compiling ESMF or MAPL specifically? If it is MAPL, this is something we could potentially bring up with GMAO.
Looks like it crashes even when compiling ESMF (haven't got to MAPL yet):
mpicxx -c -fPIC -O -DNDEBUG -fPIC -DESMF_LOWERCASE_SINGLEUNDERSCORE -m64 -mcmodel=small -pthread -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure/Mesh/src -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure/Mesh/src/../include -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/build_config/Linux.gfortran.default -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Superstructure -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure/stubs/pthread -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/include -DMPICH_IGNORE_CXX_SEEK -DESMF_NO_INTEGER_1_BYTE -DESMF_NO_INTEGER_2_BYTE -DESMF_MPIIO -DESMF_NO_OPENMP -DSx86_64_small=1 -DESMF_OS_Linux=1 -D__SDIR__='"src/Infrastructure/Mesh/src"' /tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure/Mesh/src/ESMCI_FindPnts.C -o /tutorial/gchp_standard/CodeDir/GCHP/ESMF/obj/objO/Linux.gfortran.64.mpich2.default/ESMCI_FindPnts.o
mpicxx -c -fPIC -O -DNDEBUG -fPIC -DESMF_LOWERCASE_SINGLEUNDERSCORE -m64 -mcmodel=small -pthread -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure/Mesh/src -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure/Mesh/src/../include -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/build_config/Linux.gfortran.default -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Superstructure -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure/stubs/pthread -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/include -DMPICH_IGNORE_CXX_SEEK -DESMF_NO_INTEGER_1_BYTE -DESMF_NO_INTEGER_2_BYTE -DESMF_MPIIO -DESMF_NO_OPENMP -DSx86_64_small=1 -DESMF_OS_Linux=1 -D__SDIR__='"src/Infrastructure/Mesh/src"' /tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure/Mesh/src/ESMCI_ConserveInterp.C -o /tutorial/gchp_standard/CodeDir/GCHP/ESMF/obj/objO/Linux.gfortran.64.mpich2.default/ESMCI_ConserveInterp.o
mpicxx -c -fPIC -O -DNDEBUG -fPIC -DESMF_LOWERCASE_SINGLEUNDERSCORE -m64 -mcmodel=small -pthread -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure/Mesh/src -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure/Mesh/src/../include -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/build_config/Linux.gfortran.default -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Superstructure -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure/stubs/pthread -I/tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/include -DMPICH_IGNORE_CXX_SEEK -DESMF_NO_INTEGER_1_BYTE -DESMF_NO_INTEGER_2_BYTE -DESMF_MPIIO -DESMF_NO_OPENMP -DSx86_64_small=1 -DESMF_OS_Linux=1 -D__SDIR__='"src/Infrastructure/Mesh/src"' /tutorial/gchp_standard/CodeDir/GCHP/ESMF/src/Infrastructure/Mesh/src/ESMCI_MeshCXX.C -o /tutorial/gchp_standard/CodeDir/GCHP/ESMF/obj/objO/Linux.gfortran.64.mpich2.default/ESMCI_MeshCXX.o
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
gmake[14]: *** [/tutorial/gchp_standard/CodeDir/GCHP/ESMF/obj/objO/Linux.gfortran.64.mpich2.default/ESMCI_HAdapt.o] Error 4
gmake[14]: *** Waiting for unfinished jobs....
g++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
virtual memory exhausted: Cannot allocate memory
virtual memory exhausted: Cannot allocate memory
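For a long build log like the one above, a quick scan for out-of-memory signatures can help confirm the failure is a RAM problem rather than a real compiler bug. The following is a minimal sketch, assuming the two messages seen in this log ("internal compiler error: Killed" and "virtual memory exhausted") are the ones worth looking for. The names `OOM_PATTERNS` and `find_oom_lines` are made up for illustration and are not part of GCHP or ESMF.

```python
import re

# Signatures taken from the log above. A compiler reported as "Killed"
# usually (not always) means the kernel ran out of memory and killed it.
OOM_PATTERNS = [
    re.compile(r"internal compiler error: Killed \(program \w+\)"),
    re.compile(r"virtual memory exhausted"),
]


def find_oom_lines(log_text):
    """Return (line_number, line) pairs that look like out-of-memory failures."""
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in OOM_PATTERNS):
            hits.append((lineno, line.strip()))
    return hits
```

Calling `find_oom_lines(open("build.log").read())` on a saved log (the file name is hypothetical) would list the matching lines with their line numbers.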
We eventually want to use a pre-built ESMF, which would solve this problem if the large RAM requirement comes only from building ESMF. You can isolate the RAM needed to build everything except ESMF by first compiling ESMF successfully (so that you have the file esmf.install) and then running make compile_mapl.
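One rough way to put a number on each build step is to run it as a child process and read back the peak resident memory afterwards. This is only a sketch, assuming a Unix system where Python's `resource` module is available. The helper name `peak_child_rss_mib` is made up for illustration. Note that the value reported is the peak of the largest single child process (for example one `cc1plus`), not the sum over a parallel build, so it is a lower bound on what the build needs.

```python
import resource
import subprocess
import sys


def peak_child_rss_mib(cmd):
    """Run cmd and return (exit_code, peak RSS in MiB of the largest child).

    RUSAGE_CHILDREN covers every child this Python process has waited for,
    so run one build step per Python invocation for a clean number.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return proc.returncode, usage.ru_maxrss / divisor


# Example (not executed here): measure the MAPL step on its own.
# code, mib = peak_child_rss_mib(["make", "compile_mapl"])
# print(f"exit code {code}, largest child peaked at {mib:.0f} MiB")
```

Running this once for the ESMF build and once for `make compile_mapl` would give a per-step comparison without needing a separate instance size for each test.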
Ha, it turns out that make compile_mapl only needs 2GB RAM. Compiling ESMF needs 8GB RAM.
Full log files, tested on EC2 t2.micro, t2.small, t2.large:
So this problem should be solved by using pre-built ESMF.
Great. I will close this issue since we have a path forward, although it might take some time.
I am able to build the GCHP Docker image on a large EC2 instance (>10 GB RAM), but the automated build on Docker Hub fails because of Docker Hub's 2 GB RAM limit.
Here's the full build log: https://hub.docker.com/r/zhuangjw/gchp_model/builds/b4bvaupogcmwvy5dcc9nzdw/
Any idea why GCHP needs so much memory at compile time?
The workaround is to build Docker images locally (e.g. on AWS) and upload them to Docker Hub.
Alternatively, I can try building Docker images on TravisCI. Travis has 7.5 GB RAM, so it should probably work.