yang5891 commented 6 months ago

Hello, authors. I need to run this software on CentOS, but I'm having some problems with it and would like to ask for your advice.

Here I made some changes to the makefile:

```
ifeq ($(shell cat /etc/os-release | grep -qEi 'centos'; echo $$?),0) # Centos
ifeq ($(UNAME_R),18.7.0)
include makeopts.centos
$(info System identified as centos)
SYSTEM_IDENTIFIED = 1
endif
endif
```

The corresponding makeopts.centos is:

```
include compileropts.gnu

FC = mpif90

# pgplot
LIBS += -L/usr/lib/ -lpgplot

# netcdf
NETCDF_DIR = /public/home/zhangzl/software/netcdf4-needed/lib\
LIBS += $(NETCDF_DIR)/libnetcdff.a -L $(NETCDF_DIR) -lnetcdf
INCLUDE_DIRS += -I /public/home/zhangzl/software/netcdf4-needed/include

# scalapack
LIBS += -L /public/home/zhangzl/software/scalapack-2.2.0 \
    /public/home/zhangzl/software/BLAS-3.12.0/lib/libblas.a \
    /public/home/zhangzl/software/BLACS/LIB/blacs.a \
    /public/home/zhangzl/software/BLACS/LIB/blacsF77init_MPI-LINUX-0.a \
    /public/home/zhangzl/software/BLACS/LIB/blacsCinit_MPI-LINUX-0.a \
    /public/home/zhangzl/software/BLACS/LIB/blacsF77.a \
    /public/home/zhangzl/software/BLACS/LIB/blacs_MPI-LINUX-0.a
```

The compileropts.centos is:

```
COMMON_OPTION = -save -r8 #-i8
COMMON_OPTION2 = -r8 #-i8
COMMON_OPTION3 =
COMMON_OPTION4 = -r8 #-i4
MOD_DIR_FLAG = -module $(MOD_DIR)
```

Now when I run make, I get the following errors.

①

```
(.text+0x20): undefined reference to `main'
obj/ql_myra.o: In function `__ql_myra_mod_MOD_ql_myra_write':
ql_myra.f:(.text+0x11744): undefined reference to `blacs_barrier_'
obj/wdot_test.o: In function `__wdot_mod_MOD_wdot_new_maxwellian':
wdot_test.f90:(.text+0xa6c0): undefined reference to `blacs_barrier_'
obj/current.o: In function `current_orbit_':
current.f:(.text+0x17c4f): undefined reference to `fftn2_'
current.f:(.text+0x245c2): undefined reference to `blacs_barrier_'
obj/current.o: In function `ntilda':
current.f:(.text+0x2b22e): undefined reference to `blacs_barrier_'
obj/current.o: In function `current_':
current.f:(.text+0x33792): undefined reference to `blacs_barrier_'
obj/current.o: In function `current1':
current.f:(.text+0x3d416): undefined reference to `blacs_barrier_'
obj/current.o: In function `current2':
current.f:(.text+0x459f2): undefined reference to `blacs_barrier_'
obj/current.o:current.f:(.text+0x49f11): more undefined references to `blacs_barrier_' follow
obj/setupblacs.o: In function `setupblacs_':
setupblacs.f:(.text+0x5f): undefined reference to `blacs_pinfo_'
setupblacs.f:(.text+0x8a): undefined reference to `blacs_setup_'
setupblacs.f:(.text+0xa5): undefined reference to `blacs_get_'
setupblacs.f:(.text+0xf9): undefined reference to `blacs_gridinit_'
setupblacs.f:(.text+0x38d): undefined reference to `blacs_gridexit_'
setupblacs.f:(.text+0x4fd): undefined reference to `blacs_get_'
setupblacs.f:(.text+0x521): undefined reference to `blacs_gridinit_'
setupblacs.f:(.text+0x545): undefined reference to `blacs_gridinfo_'
obj/rf2x_setup2.o: In function `run_rf2x_':
rf2x_setup2.f:(.text+0x3f25): undefined reference to `rhograte_'
rf2x_setup2.f:(.text+0x3f6a): undefined reference to `rhograte_'
rf2x_setup2.f:(.text+0x3faf): undefined reference to `rhograte_'
rf2x_setup2.f:(.text+0x3ff4): undefined reference to `rhograte_'
rf2x_setup2.f:(.text+0x4039): undefined reference to `rhograte_'
obj/rf2x_setup2.o:rf2x_setup2.f:(.text+0x407e): more undefined references to `rhograte_' follow
obj/read_cql3d.o: In function `__read_cql3d_MOD_netcdfr3d':
read_cql3d.f90:(.text+0x205): undefined reference to `__netcdf_MOD_nf90_open'
read_cql3d.f90:(.text+0x347): undefined reference to `__netcdf_MOD_nf90_inq_dimid'
read_cql3d.f90:(.text+0x411): undefined reference to `__netcdf_MOD_nf90_inq_dimid'
read_cql3d.f90:(.text+0x4db): undefined reference to `__netcdf_MOD_nf90_inq_dimid'
read_cql3d.f90:(.text+0x5a5): undefined reference to `__netcdf_MOD_nf90_inq_dimid'
read_cql3d.f90:(.text+0x66f): undefined reference to `__netcdf_MOD_nf90_inq_dimid'
read_cql3d.f90:(.text+0x73f): undefined reference to `__netcdf_MOD_nf90_inquire_dimension'
read_cql3d.f90:(.text+0x81a): undefined reference to `ncdinq_'
read_cql3d.f90:(.text+0x844): undefined reference to `ncdinq_'
read_cql3d.f90:(.text+0x86e): undefined reference to `ncdinq_'
read_cql3d.f90:(.text+0x898): undefined reference to `ncdinq_'
read_cql3d.f90:(.text+0x19cc): undefined reference to `__netcdf_MOD_nf90_inq_varid'
read_cql3d.f90:(.text+0x19eb): undefined reference to `__netcdf_MOD_nf90_get_var_eightbytereal'
read_cql3d.f90:(.text+0x1a83): undefined reference to `__netcdf_MOD_nf90_inq_varid'
read_cql3d.f90:(.text+0x1ab6): undefined reference to `__netcdf_MOD_nf90_get_var_1d_eightbytereal'
read_cql3d.f90:(.text+0x1ad5): undefined reference to `__netcdf_MOD_nf90_inq_varid'
read_cql3d.f90:(.text+0x1b08): undefined reference to `__netcdf_MOD_nf90_get_var_1d_fourbyteint'
read_cql3d.f90:(.text+0x1b3f): undefined reference to `__netcdf_MOD_nf90_inq_varid'
read_cql3d.f90:(.text+0x1b72): undefined reference to `__netcdf_MOD_nf90_get_var_2d_eightbytereal'
read_cql3d.f90:(.text+0x1b91): undefined reference to `__netcdf_MOD_nf90_inq_varid'
read_cql3d.f90:(.text+0x1bc4): undefined reference to `__netcdf_MOD_nf90_get_var_1d_eightbytereal'
read_cql3d.f90:(.text+0x1be3): undefined reference to `__netcdf_MOD_nf90_inq_varid'
read_cql3d.f90:(.text+0x1c16): undefined reference to `__netcdf_MOD_nf90_get_var_3d_eightbytereal'
read_cql3d.f90:(.text+0x1d76): undefined reference to `__netcdf_MOD_nf90_inq_varid'
read_cql3d.f90:(.text+0x1da9): undefined reference to `__netcdf_MOD_nf90_get_var_2d_eightbytereal'
read_cql3d.f90:(.text+0x215b): undefined reference to `__netcdf_MOD_nf90_inq_varid'
read_cql3d.f90:(.text+0x218e): undefined reference to `__netcdf_MOD_nf90_get_var_2d_eightbytereal'
read_cql3d.f90:(.text+0x23f0): undefined reference to `__netcdf_MOD_nf90_close'
collect2: error: ld returned 1 exit status
make: *** [xaorsa2d] Error 1
```

② If I compile with mpifort, the error is reported as:

```
src/CQL3D_SETUP/read_cql3d.f90(35): error #7013: This module file was not generated by any release of this compiler.   [NETCDF]
      use netcdf
------^
src/CQL3D_SETUP/read_cql3d.f90(432): internal error: Please visit 'http://www.intel.com/software/products/support' for assistance.
      if ( iret .ne. NF90_NOERR) then
[ Aborting due to internal error. ]
compilation aborted for src/CQL3D_SETUP/read_cql3d.f90 (code 1)
make: *** [obj/read_cql3d.o] Error 1
```
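One way to narrow down `undefined reference` errors like the ones above is to check whether the archives on the link line actually define the missing symbols. The following is a minimal sketch, assuming GNU `nm` is installed; `defines_symbol` is a hypothetical helper (not part of AORSA) that reads `nm -g` output on standard input.

```shell
#!/bin/sh
# Hypothetical helper (not part of AORSA): succeed if the `nm -g` listing on
# stdin contains a text-section definition (type "T") of the symbol in $1.
# Lines marked "U" are only references, so they do not count.
defines_symbol() {
    awk -v sym="$1" '
        $NF == sym && $(NF-1) == "T" { found = 1 }
        END { exit !found }
    '
}
```

For example, `nm -g /public/home/zhangzl/software/BLACS/LIB/blacs_MPI-LINUX-0.a | defines_symbol blacs_barrier_ && echo defined` would report whether that archive provides `blacs_barrier_`. If none of the listed archives defines the symbol, the library providing it is probably missing from `LIBS`.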

jcwright77 commented 6 months ago

Can you share one of the compile commands from the output? It seems that your libraries are not being found during the link phase.

Using mpifort will require "-mkl=cluster" for ScaLAPACK and BLAS support, as well as an Intel-built netcdff library and module file.
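The module-file error (#7013) is what one would expect when `netcdf.mod` was produced by a different Fortran compiler than the one behind the MPI wrapper. A rough way to compare the two is sketched below. `compiler_family` and `same_family` are hypothetical helpers, and the family names are just labels chosen for this example.

```shell
#!/bin/sh
# Hypothetical helper: map a Fortran compiler command (optionally with a
# path) to a coarse family label.
compiler_family() {
    case "$(basename "$1")" in
        ifort*|ifx*) echo intel ;;
        gfortran*)   echo gnu ;;
        *)           echo unknown ;;
    esac
}

# Succeed only when both commands belong to the same known family.
same_family() {
    a=$(compiler_family "$1")
    b=$(compiler_family "$2")
    [ "$a" != unknown ] && [ "$a" = "$b" ]
}
```

Assuming netCDF-Fortran's `nf-config` is on the path and the wrapper comes from Open MPI, something like `same_family "$(nf-config --fc)" "$(mpifort --showme:command)"` compares the compiler that built netCDF with the one the wrapper invokes. MPICH-derived wrappers print the full command with `mpifort -show` instead, so the first word of that output would be the compiler.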

yang5891 commented 5 months ago

Hello teacher. The problems encountered with make have now been reduced, but there are a few libraries that are not recognized.

```
/public/software/compiler/intel-compiler/2021.3.0/bin/intel64/../../compiler/lib/intel64_lin/for_main.o: In function `main':
for_main.c:(.text+0x2e): undefined reference to `MAIN__'
obj/current.o: In function `current_orbit_':
current.f:(.text+0x1b0a): undefined reference to `fftn2_'
obj/rf2x_setup2.o: In function `run_rf2x_':
rf2x_setup2.f:(.text+0x15c3): undefined reference to `rhograte_'
rf2x_setup2.f:(.text+0x15f0): undefined reference to `rhograte_'
rf2x_setup2.f:(.text+0x161d): undefined reference to `rhograte_'
rf2x_setup2.f:(.text+0x164a): undefined reference to `rhograte_'
rf2x_setup2.f:(.text+0x1677): undefined reference to `rhograte_'
obj/rf2x_setup2.o:rf2x_setup2.f:(.text+0x16a4): more undefined references to `rhograte_' follow
make: *** [xaorsa2d] Error 1
```

jcwright77 commented 5 months ago

Those are all routines in the main program. Try

make clean ; make

and post the compile command for aorsa2dmain.o
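A simple way to capture that compile command is to save the build output and search it for the object file. This is only a sketch: `compile_command_for` is a hypothetical helper and `build.log` is an assumed file name.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a saved build log ($1) that
# mention the object file ($2), i.e. the command make echoed for it.
compile_command_for() {
    grep -F -- "$2" "$1"
}
```

Typical use would be `make clean; make 2>&1 | tee build.log` followed by `compile_command_for build.log aorsa2dmain.o`. If nothing is printed, the main program was never compiled, which would be consistent with the undefined `MAIN__` reference.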

-- -john Principal Research Scientist John Wright Office 617-253-9612 zoom: https://mit.zoom.us/my/jcwright

yang5891 commented 5 months ago

Thank you, teacher. I can now compile and generate the xaorsa2d executable. Since I don't need pgplot, I commented out the relevant statements. But now I get an error when executing xaorsa2d:

```
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
xaorsa2d           00000000008181DA  for__signal_handl  Unknown     Unknown
libpthread-2.17.s  00002B448DA74630  Unknown            Unknown     Unknown
libmpi.so.20.10.2  00002B448D49EAC5  MPI_Comm_size      Unknown     Unknown
libmkl_blacs_inte  00002B44864E7A39  MKLMPI_Comm_size   Unknown     Unknown
libmkl_blacs_inte  00002B44864E5D31  mkl_blacs_init     Unknown     Unknown
libmkl_blacs_inte  00002B44864D6898  blacs_pinfo        Unknown     Unknown
xaorsa2d           00000000005FB088  Unknown            Unknown     Unknown
xaorsa2d           0000000000412E92  Unknown            Unknown     Unknown
libc-2.17.so       00002B448DCA3555  __libc_start_main  Unknown     Unknown
xaorsa2d           0000000000412DA9  Unknown            Unknown     Unknown

Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.

mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[55252,1],0]
  Exit code:    174
```

jcwright77 commented 5 months ago

Your libraries aren't loading correctly at runtime. Is your LD_LIBRARY_PATH set? -john
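To see whether the dynamic loader can resolve every shared library the executable needs, the output of `ldd` can be filtered for unresolved entries. This is a small sketch and `unresolved_libs` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print the names of
# shared libraries that the dynamic loader reported as "not found".
unresolved_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Running `ldd ./xaorsa2d | unresolved_libs` prints nothing when everything resolves. Any name it does print points to a directory that is missing from `LD_LIBRARY_PATH`, which can be inspected with `echo "$LD_LIBRARY_PATH" | tr ':' '\n'`.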

yang5891 commented 5 months ago

Excuse me, sir. I am using the intel-4.0.3 build of Open MPI as the compiler here. The MKL library used is intel-2021.3.0, as shown in the file. It is possible to compile and generate xaorsa2d, but the execution reports an error.

jcwright77 commented 5 months ago

You will have to share the error message. -john

yang5891 commented 5 months ago

Teacher, this is the problem I'm having when running xaorsa2d.

```
Caught signal 11 (Segmentation fault: address not mapped to object at address 0x440000f8)
==== backtrace (tid: 11614) ====
 0 0x000000000006e600 opal_mutex_unlock()  /tmp/clussoft.20240516134350/openmpi-4.0.3/ompi/mpi/c/profile/../../../../opal/threads/mutex_unix.h:158
 1 0x000000000006e600 PMPI_Comm_size()  /tmp/clussoft.20240516134350/openmpi-4.0.3/ompi/mpi/c/profile/pcomm_size.c:63
 2 0x0000000000029a39 MKLMPI_Comm_size()  ???:0
 3 0x0000000000027d31 mkl_blacs_init()  ???:0
 4 0x0000000000018898 blacs_pinfo_()  ???:0
 5 0x0000000000605248 MAIN__()  ???:0
 6 0x0000000000414b62 main()  ???:0
 7 0x0000000000022555 __libc_start_main()  ???:0
 8 0x0000000000414a69 _start()  ???:0
=================================
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
xaorsa2d           000000000083F74A  for__signal_handl  Unknown     Unknown
libpthread-2.17.s  00007F481CAAE630  Unknown            Unknown     Unknown
libmpi.so.40.20.3  00007F481D02B600  MPI_Comm_size      Unknown     Unknown
libmkl_blacs_inte  00007F482403DA39  MKLMPI_Comm_size   Unknown     Unknown
libmkl_blacs_inte  00007F482403BD31  mkl_blacs_init     Unknown     Unknown
libmkl_blacs_inte  00007F482402C898  blacs_pinfo        Unknown     Unknown
xaorsa2d           0000000000605248  Unknown            Unknown     Unknown
xaorsa2d           0000000000414B62  Unknown            Unknown     Unknown
libc-2.17.so       00007F481C6F3555  __libc_start_main  Unknown     Unknown
xaorsa2d           0000000000414A69  Unknown            Unknown     Unknown

Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.

mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[55098,1],0]
  Exit code:    174
```

jcwright77 commented 5 months ago

What's your mpirun command? What does `mpirun hostname` return? Which test case are you running? They need one core. -john

yang5891 commented 5 months ago

Teacher, I can run the program now with a single core. But when I use multi-core, I get an error. Can this program only be run on a single core?

jcwright77 commented 5 months ago

It seems like you are having issues running with MPI. This error has to do with starting up and executing under mpirun, not with aorsa itself. I suggest you verify that you can compile and run a simple MPI program. Assuming you are using the Intel compiler, grab cpi.c from https://gist.github.com/jcwright77/a5e1d66886bc17b0f7936466739cc287 and run:

    mpiicc cpi.c -o cpi
    mpirun -np 4 ./cpi

Other things:

- Verify you are using the correct mpirun: `which mpirun` should show the mpirun in the Intel distribution.
- Try making from scratch: `make clean; make`

Ask a colleague who is familiar with parallel programs on your system for help. Your issues seem to be outside of aorsa and have to do with basic compilation and execution of parallel programs.
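The traces earlier in this thread show Open MPI's `libmpi` (built from openmpi-4.0.3) being called from an MKL BLACS library, so it may be worth confirming which MPI implementation the `mpirun` on the path belongs to. The sketch below classifies the text printed by `mpirun --version`. `mpi_flavor` is a hypothetical helper, and the patterns are the banner strings these implementations are known to print.

```shell
#!/bin/sh
# Hypothetical helper: classify the MPI implementation from the text that
# `mpirun --version` prints (passed as $1).
mpi_flavor() {
    case "$1" in
        *"Open MPI"*)     echo openmpi ;;
        *"Intel(R) MPI"*) echo intelmpi ;;
        *HYDRA*)          echo mpich ;;
        *)                echo unknown ;;
    esac
}
```

For example, `mpi_flavor "$(mpirun --version 2>&1)"` can be compared with the `libmpi` that `ldd ./xaorsa2d | grep libmpi` reports. MKL ships separate BLACS interface libraries for different MPI implementations, so the BLACS library on the link line should correspond to the same MPI that launches the job.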

yang5891 commented 5 months ago

Thank you, teacher, for your patience these days. A senior colleague in my research group helped me find the relevant parameters for setting the number of cores (nprow x npcol = nproc). But is it enough that the product of the two equals the number of cores needed for the calculation? For example, if I use four cores, is there a difference between 2x2 and 1x4?

jcwright77 commented 5 months ago

It affects the decomposition of the matrix in the code. 2 x 2 is better than 1 x 4. For the test cases you should not need to change anything; they just use 1 x 1. -john
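If the preference for 2 x 2 over 1 x 4 is read as "choose the most nearly square grid", which is an assumption rather than something stated here, a candidate `nprow` and `npcol` for a given core count can be computed as sketched below. `squarest_grid` is a hypothetical helper, and putting the smaller factor first is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: print "nprow npcol" for the most nearly square grid
# with nprow * npcol = $1 and nprow <= npcol.
squarest_grid() {
    nproc=$1
    nprow=1
    i=1
    # Try every candidate up to sqrt(nproc); keep the largest divisor found.
    while [ $((i * i)) -le "$nproc" ]; do
        if [ $((nproc % i)) -eq 0 ]; then
            nprow=$i
        fi
        i=$((i + 1))
    done
    echo "$nprow $((nproc / nprow))"
}
```

With this helper, `squarest_grid 4` prints `2 2` and `squarest_grid 32` prints `4 8`. A prime core count falls back to a 1 x n grid.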

yang5891 commented 5 months ago

Thank you, teacher, but the number of cores in our group is 32, and I need to set more points and consider different ions and concentrations in the future. How should I set these two parameters?

---- Replied Message ---- | From | John C. @.> | | Date | 6/4/2024 19:54 | | To | @.> | | Cc | @.>, @.> | | Subject | Re: [ORNL-Fusion/aorsa] How to run properly on Centos server (Issue #49) | It affects the decomposition of the matrix in the code. 2 x 2 is better than one by four. For the test cases you should not need to change anything. They just use one by one.-johnOn Jun 4, 2024, at 7:52 AM, yang5891 @.***> wrote: Thank you, teacher, for your patience these days. My brother in my research group helped me find the relevant parameters for calculating the number of cores (nprow x npcol = nproc).But is that okay as long as the product of the two is equal to the number of cores needed for the calculation? For example, if I use quad-core computing, is there a difference between 2x2 and 1x4?

---- Replied Message ---- | From | John C. @.> | | Date | 6/4/2024 00:48 | | To | @.> | | Cc | @.>, @.> | | Subject | Re: [ORNL-Fusion/aorsa] How to run properly on Centos server (Issue #49) |

It seems like you are having issues running with mpi. This error has to do with starting up and executing under mpi run, not with aorsa itself. I suggest you verify you can compile and run a simple mpi program. Assuming you are using intel compile, grab cpi.c from https://gist.github.com/jcwright77/a5e1d66886bc17b0f7936466739cc287

mpiicc cpi.c -o cpi mpirun -np 4 ./cpi

other things, verify you are using the correct mpirun, 'which mpirun' should show mpirun in the intel distribution Try making from scratch: make clean; make

Ask a college who is familiar with parallel programs on your system for help. You issues seem to be outside of aorsa and have to do with basic compilation and execution of parallel programs.

jcwright77 commented 5 months ago

First you need to verify that the test case works. So far you have shown me that you get errors for that.
-john

yang5891 commented 5 months ago

Yes, teacher. I can now run every example normally. For example, the final output of DIIID-helion is as follows (here I used nprow=4, npcol=8):

time to do plots = 0.039 min
1
total cpu time used = 0.097 min

jcwright77 commented 5 months ago

You can look at the namelist md file and the tests for guidance. For the problem size, set nprow=npcol=8 or greater. If you only have 32 cores, use 4,4.
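As a rough illustration of how nprow and npcol relate to the process count, the sketch below lists every (nprow, npcol) pair whose product equals a given number of processes and picks the one closest to square. The helper names `grid_candidates` and `squarest_grid` are hypothetical and are not part of aorsa. The preference for a near-square grid is a common rule of thumb for 2D matrix decompositions, assumed here to be the reason 2 x 2 is preferred over 1 x 4.

```python
def grid_candidates(nproc):
    """Return all (nprow, npcol) pairs with nprow * npcol == nproc and nprow <= npcol."""
    if nproc < 1:
        raise ValueError("nproc must be a positive integer")
    pairs = []
    nprow = 1
    # Only scan up to sqrt(nproc); the partner factor is nproc // nprow.
    while nprow * nprow <= nproc:
        if nproc % nprow == 0:
            pairs.append((nprow, nproc // nprow))
        nprow += 1
    return pairs


def squarest_grid(nproc):
    """Pick the candidate whose two dimensions are closest to each other."""
    return min(grid_candidates(nproc), key=lambda rc: rc[1] - rc[0])


if __name__ == "__main__":
    for nproc in (4, 16, 32, 64):
        print(nproc, grid_candidates(nproc), "->", squarest_grid(nproc))
```

For 32 processes this gives 4 x 8, the grid used for the DIIID-helion run reported earlier in the thread. The 4,4 suggestion is the exactly square alternative, which uses 16 of the 32 cores.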

Ideally nmodesx=nmodesy=128, but that requires several nodes and significant memory. You might try 32,32, but the case will be severely under-resolved, depending on the scales involved.
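To get a feel for why nmodesx=nmodesy=128 needs several nodes, here is a back-of-the-envelope estimate. It rests on assumptions that are not stated in this thread: that the dense system has 3 * nmodesx * nmodesy unknowns (three field components per mode), and that each matrix entry is a double-precision complex number (16 bytes). The function name `dense_matrix_gib` is hypothetical. Treat the numbers as an order-of-magnitude guide, not as aorsa's actual memory footprint.

```python
def dense_matrix_gib(nmodesx, nmodesy, components=3, bytes_per_entry=16):
    """Estimate storage in GiB of a dense square matrix.

    Assumptions (not taken from the aorsa source): `components` unknowns
    per mode and 16-byte complex entries.
    """
    n = components * nmodesx * nmodesy  # number of unknowns
    return n * n * bytes_per_entry / 2**30


if __name__ == "__main__":
    for modes in (32, 64, 128):
        total = dense_matrix_gib(modes, modes)
        print(f"nmodesx=nmodesy={modes}: ~{total:.2f} GiB for the matrix alone")
```

Under these assumptions the matrix alone takes about 36 GiB at 128 x 128, compared with about 0.14 GiB at 32 x 32. Solver workspace and other arrays would add to this, and the total is divided among the nprow x npcol processes.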

good luck

-john

yang5891 commented 4 months ago

Hi teacher, I'm having a small problem modifying the example case. I have changed it to a 1:1 ratio of D and T, with Li as a third (impurity) species at a concentration of 0.1%. With TORIC I calculated that the Li absorption can reach about 80%, but with AORSA the electron absorption is negative (-3.5018%) and the Li absorption is only 2.1831%.
