NOAA-GFDL / AM4

AM4 run error #28

Open JINYONGKIM88 opened 3 years ago

JINYONGKIM88 commented 3 years ago

Hello,

I'm having trouble running the model. The error involves land_frac. How can I solve it? The error message is below.

FATAL from PE 35: diag_manager_mod::register_static_field: module/output_field land/land_frac AREA measures field NOT found in diag_table. Contact the model liaison.n
[cli_35]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 1) - process 35
FATAL from PE 26: diag_manager_mod::register_static_field: module/output_field land/land_frac AREA measures field NOT found in diag_table. Contact the model liaison.n
…. (skip)

Thank you.

thomas-robinson commented 3 years ago

@JINYONGKIM88 Which version of AM4 are you using?

JINYONGKIM88 commented 3 years ago

As far as I know, it is AM 4.0. As described in the README.md of this repository, I downloaded the source code using the following command.

git clone --recursive https://github.com/NOAA-GFDL/AM4.git

thomas-robinson commented 3 years ago

Please copy the diag_table file in AM4/run/diag_table into whatever directory you are trying to run the model in. Let me know if that fixes the issue. If it does, I will update the README.
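The copy step suggested above could be sketched as follows. This is only an illustration: `stage_diag_table` is a hypothetical helper name, and both paths are placeholders to be replaced with the actual clone location and run directory.

```shell
# Hypothetical helper: copy the repository's diag_table into the run directory.
# Both arguments are placeholders supplied by the user.
stage_diag_table() {
    src="$1"      # e.g. AM4/run/diag_table inside the cloned repository
    rundir="$2"   # directory the model is launched from
    if [ ! -f "$src" ]; then
        echo "diag_table not found at $src" >&2
        return 1
    fi
    # Create the run directory if needed, then copy the table into it.
    mkdir -p "$rundir" && cp "$src" "$rundir/diag_table"
}
```

For example, `stage_diag_table AM4/run/diag_table /path/to/rundir` would place the table next to the other run-time inputs, assuming the repository was cloned into the current directory.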

JINYONGKIM88 commented 3 years ago

Thank you for your answer.

Copying AM4/run/diag_table solved the problem. However, another error now occurs.

Error messages (skipped) .............................
Phys_driver_term: Radiative g 0.000253 0.011588 0.001840 0.002422 0.000 21 0 35
Phys_driver_term: Radiation: 0.000000 0.000000 0.000000 0.000000 0.000 21 0 35
MPP_STACK high water mark= 0
[0] Failed to dealloc pd (Device or resource busy)
[18] Failed to dealloc pd (Device or resource busy)

We are currently running with a total of 36 CPUs. I used mvapich 2.2.1 and the mpiexec command.

input.nml:
&fv_core_nml
layout = 2,3
io_layout = 1,3

On a single node the model runs fine, but with two nodes the error message above appears. How can I fix it? Thank you.
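As a rough check on the numbers above, the processor count can be derived from the layout. The sketch below assumes that `layout` applies to each tile and that the cubed-sphere grid has 6 tiles, which is consistent with the 36 CPUs mentioned. It also assumes each `io_layout` entry should evenly divide the matching `layout` entry. The function names are hypothetical.

```shell
# Assumption: layout = X,Y is per tile and the grid has 6 tiles.
npes_for_layout() {
    echo $(( $1 * $2 * 6 ))
}

# Assumption: io_layout entries should evenly divide the layout entries.
# Arguments: layout_x layout_y io_layout_x io_layout_y
io_layout_ok() {
    [ $(( $1 % $3 )) -eq 0 ] && [ $(( $2 % $4 )) -eq 0 ]
}
```

With the values from input.nml, `npes_for_layout 2 3` gives 36, and `io_layout_ok 2 3 1 3` succeeds. The resulting count is the number one would expect to pass to `mpiexec -n`.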

thomas-robinson commented 3 years ago

This sounds like a system or job submission problem. I haven't seen this problem when running on multiple nodes. The default setup that I've tested uses 14 nodes.

JINYONGKIM88 commented 3 years ago

Thank you for your kind reply.

I have one last question. The output files are named nc0000, nc0001, ... Where can I get a program to post-process these and combine them into a single file?

uramirez8707 commented 3 years ago

@JINYONGKIM88 You can use mppnccombine from https://github.com/NOAA-GFDL/FRE-NCtools
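A minimal sketch of how the combine step might be scripted, assuming the pieces of one output file share a base name followed by a four-digit suffix (for example `name.nc.0000`, `name.nc.0001`). `combine_cmd` is a hypothetical helper that only prints the command, so it can be inspected before anything is run. It relies on the basic `mppnccombine output.nc input ...` form.

```shell
# Hypothetical helper: print (do not run) an mppnccombine command
# for every piece of one distributed output file.
combine_cmd() {
    base="$1"   # combined file name, e.g. name.nc (placeholder)
    # Collect the pieces; the four-digit suffix pattern is an assumption.
    set -- "$base".[0-9][0-9][0-9][0-9]
    if [ ! -e "$1" ]; then
        echo "no pieces found for $base" >&2
        return 1
    fi
    echo mppnccombine "$base" "$@"
}
```

Running `combine_cmd name.nc` in the output directory prints the full command line. Once it looks right, it can be executed with mppnccombine from FRE-NCtools installed and on the PATH.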

JINYONGKIM88 commented 3 years ago

Thank you. :) Have a nice day!