Closed gklarenberg closed 8 years ago
Output file paths are specified in this section:
!---------------------------------------------------------------------------------------!
! FFILOUT -- Path and prefix for analysis files (all but history/restart). !
! SFILOUT -- Path and prefix for history files. !
!---------------------------------------------------------------------------------------!
NL%FFILOUT = '/mypath/generic-prefix'
NL%SFILOUT = '/mypath/generic-prefix'
I'm not familiar with AFILOUT -- that's probably a typo, and may be the root of your error. File paths should be absolute.
Second, output file types are controlled here:
!---------------------------------------------------------------------------------------!
! ED2 File output. For all the variables 0 means no output and 3 means HDF5 output. !
! !
! IFOUTPUT -- Fast analysis. These are mostly polygon-level averages, and the time !
! interval between files is determined by FRQANL !
! IDOUTPUT -- Daily means (one file per day) !
! IMOUTPUT -- Monthly means (one file per month) !
! IQOUTPUT -- Monthly means of the diurnal cycle (one file per month). The number !
! of points for the diurnal cycle is 86400 / FRQANL !
! IYOUTPUT -- Annual output. !
! ITOUTPUT -- Instantaneous fluxes, mostly polygon-level variables, one file per year. !
! ISOUTPUT -- restart file, for HISTORY runs. The time interval between files is !
! determined by FRQHIS !
!---------------------------------------------------------------------------------------!
NL%IFOUTPUT = 0
NL%IDOUTPUT = 0
NL%IMOUTPUT = 3
NL%IQOUTPUT = 0
NL%IYOUTPUT = 0
NL%ITOUTPUT = 3
NL%ISOUTPUT = 3
At least one of these should be set to 3 -- if they're all 0, ED won't write any output. None should be set to 1 or 2; those are older, deprecated file formats.
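A quick way to check this rule before submitting a run is a small script. This is only a rough sketch, not part of ED2: the function name `check_output_flags` is made up for illustration, and it assumes each flag sits on its own `NL%I?OUTPUT = <integer>` line as in the snippet above.

```python
import re

# Assumption: each flag is written on one line as "NL%IxOUTPUT = <int>".
FLAG_RE = re.compile(
    r"^\s*NL%(I[FDMQYTS]OUTPUT)\s*=\s*(-?\d+)",
    re.MULTILINE | re.IGNORECASE,
)

def check_output_flags(ed2in_text):
    """Return a list of warnings about the output flags in an ED2IN string."""
    flags = {name.upper(): int(value) for name, value in FLAG_RE.findall(ed2in_text)}
    warnings = []
    for name, value in flags.items():
        # 1 and 2 are the older, deprecated formats mentioned above.
        if value in (1, 2):
            warnings.append(f"{name} = {value}: deprecated format, use 0 or 3")
    if flags and all(value == 0 for value in flags.values()):
        warnings.append("all output flags are 0: ED will not write any output")
    return warnings

sample = """
NL%IFOUTPUT = 0
NL%IMOUTPUT = 3
NL%ITOUTPUT = 1
"""
for line in check_output_flags(sample):
    print(line)
```

To use it on a real file, read the ED2IN contents into a string first and pass that string to the function.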
@mdietze Thanks! That worked. And I also thought it might be a typo, but I came across https://github.com/EDmodel/ED2/wiki/Misc-parameters in which AFILOUT is referenced, so I wasn't sure. I also realized that the ED2IN files in src/testcases have a bunch of deprecated namelist inputs. I am updating everything now using the ED2IN file in /run
If it is okay, I am going to continue on this thread with my test run issues. I get these errors:
>>>> opspec_grid error! in your namelist!
---> Reason: Too few soil layers. Set it to at least 2. Your nzg is currently set to -999...
>>>> opspec_grid error! in your namelist!
---> Reason: Too few maximum # of snow layers. Set it to at least 1. Your nzs is currently set to -999.
However, in ED2IN, these are definitely specified:
NL%NZG = 9
NL%NZS = 1
Are there any other settings that would affect the way these are read?
SLZ, SLMSTR, and STGOFF must also be of length NZG
I think you can also get this error if you're initializing the model from the wrong state (i.e. trying to restart as a HISTORY run and the histo file doesn't have all the proper fields).
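The length requirement can be checked mechanically. The sketch below is hypothetical helper code (the names `read_values` and `check_soil_lengths` are not from ED2), and it assumes each list is written on a single line, which a real namelist does not require.

```python
import re

def read_values(ed2in_text, name):
    """Return the comma-separated values assigned to NL%<name>, or None if absent."""
    # Stop at "!" so trailing Fortran comments are not counted as values.
    match = re.search(
        rf"^\s*NL%{name}\s*=\s*([^!\n]*)",
        ed2in_text,
        re.MULTILINE | re.IGNORECASE,
    )
    if match is None:
        return None
    return [v.strip() for v in match.group(1).split(",") if v.strip()]

def check_soil_lengths(ed2in_text):
    """Compare the number of SLZ, SLMSTR and STGOFF values against NZG."""
    nzg_values = read_values(ed2in_text, "NZG")
    if not nzg_values:
        return ["NZG not found"]
    nzg = int(nzg_values[0])
    problems = []
    for name in ("SLZ", "SLMSTR", "STGOFF"):
        values = read_values(ed2in_text, name)
        count = 0 if values is None else len(values)
        if count != nzg:
            problems.append(f"{name} has {count} values, expected {nzg}")
    return problems

sample = """
NL%NZG = 3
NL%SLZ = -0.215, -0.089, -0.020
NL%SLMSTR = 1.00, 1.00, 1.00
NL%STGOFF = 0.00, 0.00
"""
print(check_soil_lengths(sample))
```

An empty list means the three arrays match NZG; it says nothing about whether the file is otherwise read correctly.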
@crollinson Those were my thoughts too. But SLZ, SLMSTR and STGOFF are:
NL%SLZ = -2.307, -1.789, -1.340, -0.961, -0.648, -0.400, -0.215, -0.089, -0.020
NL%SLMSTR = 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00
NL%STGOFF = 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00
And since I don't have history files (I just want to do a very basic first run to make sure everything is being read properly), I have set
NL%RUNTYPE = 'INITIAL'
and
NL%IED_INIT_MODE = 0
I haven't commented out NL%SFILIN, NL%ITIMEH, NL%IDATEH, NL%IMONTHH, NL%IYEARH, thinking these won't be used anyway considering NL%IED_INIT_MODE = 0? I have set everything else to the easiest settings, like NL%ISOILFLG = 2. NL%NSLCON has only one value (11), but I didn't think that was associated with the soil layers?
Otherwise, are there any history files available somewhere (for the Amazon region) that I could use to start a simulation?
hmmmmm. I don't work in the Amazon, so I can't help you there.
Something else to check: Are ISOILSTATEINIT and ISOILDEPTHFLG = 0 ? If not, you're trying to read from the soil database which could be buggy. This database should be declared by: SOIL_DATABASE, SOIL_STATE_DB, and SOILDEPTH_DB.
@crollinson Yes, although I specified all those databases, I still set ISOILSTATEINIT and ISOILDEPTHFLG both to 0
do you have your full ED2IN uploaded somewhere? Might be easier if I (or someone else) could glance through the full thing for something we may be overlooking. I could then also try it with my older version of ED to help figure out if this is a bug introduced by recent changes. I've had quite a few issues with bare ground spinup with the mainline version.
Okay, I think the problem may be coming from you trying to define two regions of interest:
NL%ED_REG_LATMIN = -15.0, 10.0 ! list of minimum latitudes of the ED regions
NL%ED_REG_LATMAX = 0.0, 20.0
NL%ED_REG_LONMIN = -85.0, 50.0
NL%ED_REG_LONMAX = -50.0, 60.0
You have N_ED_REGION = 0 and N_POI=1, so I think all of the above should only have a single value. Try removing the second number after each of those and see what happens.
@crollinson Also just tried that... No luck... Could it have something to do with using MPI? I don't have much experience with that... I have compiled ED2 on our HPC (intel/2016.0.109 openmpi/1.10.2 hdf5/1.8.17) and I've noticed that trying to do a serial run with ./ed_2.1-opt gives me an error:
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
PMI2_Job_GetId failed failed
--> Returned value (null) (14) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_ess_init failed
--> Returned value (null) (14) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
ompi_mpi_init: ompi_rte_init failed
--> Returned "(null)" (14) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[i21a-s2.ufhpc:31133] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!
I checked with our HPC support staff but didn't get much of a response. They said "For a parallel run, you do not need to specify the nodes. As long as you request the appropriate amount of resources in your job script everything should run as intended."
So I am running it as mpiexec ed_2.1-opt in a dev session right now, but since edmain.F90 says "Read the namelist and initialize the variables in the nodes if needed." I realized maybe I should specify nodes? Though mpirun -np 1 ./ed_2.1-opt gives me the same error...
It could be a problem with MPI -- if the processes aren't being linked and spawned properly, you can have this sort of initialization error. Unfortunately, how the parallelization is set up and executed can be very system-specific, and you'll probably have to talk with someone with more knowledge of your system.
I'd try re-compiling ED as a serial process and see if that solves your problem. If it does, then you'll have to work with your IT or someone else local that understands ED/the system to figure out the appropriate compiling and running flags.
I've also been using SMP, which I think is different from the original MPI setup in terms of how sites and memory are shared. If you only have 1 point and want to run in parallel, you need to use SMP (shared memory processing?) because it does need to share the settings and latest time step across nodes.
@crollinson Thanks for your help - I've reached out to our HPC support staff to see if they have any ideas. I didn't even realize ED can be compiled differently for serial processes!
Does this help with the soil data check problems? https://github.com/EDmodel/ED2/issues/170
@fabeit Thanks, I gave that a try: I set it up to read the FAO soil database etc., but got the same error (I tried it both with NZG and NZS commented out, and not commented out). I can't work out why it is not reading the actual values; I worry that something may be wrong with the settings of the ED2IN file. It's been a long time since I worked with Fortran...
Have a look at my ed2in, you will have to change the coordinates of the simulation but you can check the other settings. I am doing a bare ground run and read soil info from FAO db. ed2in_ew1_i.txt
@fabeit Thanks, I tried your ED2IN file, and strangely I get errors again, not for NZG and NZS but for virtually everything else, starting from ISOILSTATEINIT. I still think there might be something wrong with reading the data: does anyone know if these are the first error messages that show up if the text file is not read in properly, or should I get an error message about earlier variables (such as RUNTYPE and regional/POI runs, which is what I deduce from going through ed_1st.F90 and ed_opspec.F90)? As an FYI, I have a Mac and edit text files in TextEdit or TextMate (I opened @fabeit's file on my computer too). I upload files to our HPC, which is a Linux system with Intel Fortran. I thought issues mostly arise between Unix and Windows systems, not Unix and Linux systems, but it's the only thing I can think of. (Also, I cleaned and deleted all ED2 files, downloaded them anew from this Github, and recompiled, but the issue remains.)
I apologize for creating such a long thread for what turns out to be a simple solution... I suspected a / character in the file paths in ED2IN was creating the problem, and upon closer inspection, it turns out some of the paths were in 'curly' apostrophes ('smart quotes')... Which I guess made Fortran stop reading the input file (I had copy-pasted my paths into @fabeit's input file, of course). Evidently TextEdit on Mavericks has smart quotes turned on by default!
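For anyone hitting the same thing, curly quotes are easy to detect with a short script. This is a minimal sketch with made-up helper names (`find_smart_quotes`, `straighten_quotes`); it only looks for the four common curly quote characters, not every non-ASCII character an editor might insert.

```python
# Curly quote characters mapped to their plain ASCII equivalents.
SMART_QUOTES = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}

def find_smart_quotes(text):
    """Return (line_number, column, character) for each curly quote, 1-based."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SMART_QUOTES:
                hits.append((line_no, col, ch))
    return hits

def straighten_quotes(text):
    """Return a copy of the text with curly quotes replaced by straight ones."""
    return text.translate(str.maketrans(SMART_QUOTES))

sample = "NL%FFILOUT = \u2018/mypath/generic-prefix\u2019\nNL%NZG = 9\n"
for line_no, col, ch in find_smart_quotes(sample):
    print(f"line {line_no}, column {col}: U+{ord(ch):04X}")
print(straighten_quotes(sample))
```

When reading a real ED2IN file for this check, open it with an explicit `encoding="utf-8"` so the curly quotes are decoded as single characters.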
Let me suggest sublime text ;-)
Hi - I am just getting started with ED2: I'd like to look at the model's sensitivity to disturbance and land use changes eventually, but for now I am just trying to get some test runs done. My study area is the Amazon, so I started off using (adjusting) files in src/test_cases/amazon_soi. I have the feeling I'm running into a very basic issue but I don't know how to solve it:
invalid reference to variable in NAMELIST input, unit 10, file (...my file path here...)ED2IN, line 57, position 13
Am I referencing the output folder wrong (the NL%AFILOUT line is line 57)? I've tried a couple of things (full pathname, start with ./, no prefix, setting IFOUTPUT to 1), but I keep getting the same error. I'm not sure what the output is supposed to be specified as?