metno / emep-ctm

Open Source EMEP/MSC-W model
GNU General Public License v3.0

Minimum domain size? #103

Closed ln-x closed 2 years ago

ln-x commented 2 years ago

Hi, I am trying to run EMEP with WRF meteorological forcing on a 99x99 grid (I tried input files at both 9 km and 3 km horizontal resolution). I get the following error:

buffer too small 132 99 159 99
STOP-ALL ERROR: GetCDF buffer too small
Abort(9) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 9) - process 0

Is there a lower limit on the domain size? I set the "RUNDOMAIN" option, but the error persists. Apart from the meteorological file, I use the input files supplied with the model code.

I am very thankful for any hint on where to look for errors. Kind regards, Heidi

emepctm.out.txt

avaldebe commented 2 years ago

Hi @wasserblum

This error tells you that the model subdomain on each processor is too small. How many processors are you using?

ln-x commented 2 years ago

Hi, I was using:

#SBATCH --nodes=2 --tasks-per-node=32  

"Each node has 2 Intel Skylake Platinum 8174 processors with 24 cores each, that is 48 physical cores per node and a total of 37,920 cores for the whole system."

I have now also tried with only one node and one task per node; this raises the same error.

Heidi

avaldebe commented 2 years ago

Hi Heidi,

#SBATCH --nodes=2 --tasks-per-node=32  

If I remember correctly, the processor subdomain needs to be at least 5x5 grid points. With 64 processors each subdomain should be around 12x12 grid points, so this is not the problem.
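As a rough check of the arithmetic above, the sketch below splits a 99x99 grid over a given number of MPI tasks. The near-square factorisation is an assumption made for illustration (the model's actual decomposition may differ), and the 5x5 minimum is only the value recalled from memory in this reply.

```python
import math

# Assumed minimum points per subdomain side, as recalled in this thread.
MIN_SUBDOMAIN = 5


def near_square_factors(ntasks):
    """Return (px, py) with px * py == ntasks and px the largest divisor <= sqrt(ntasks)."""
    px = math.isqrt(ntasks)
    while ntasks % px:
        px -= 1
    return px, ntasks // px


def subdomain_size(nx, ny, ntasks):
    """Approximate per-task subdomain size for a hypothetical near-square layout."""
    px, py = near_square_factors(ntasks)
    return nx // px, ny // py


if __name__ == "__main__":
    for ntasks in (1, 64):
        sx, sy = subdomain_size(99, 99, ntasks)
        large_enough = min(sx, sy) >= MIN_SUBDOMAIN
        print(f"{ntasks} tasks: about {sx}x{sy} points per task, large enough: {large_enough}")
```

With either 1 or 64 tasks the subdomains stay well above the assumed minimum, which is consistent with the conclusion that the task count is not the cause here.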

buffer too small 132 99 159 99
STOP-ALL ERROR: GetCDF buffer too small

However, the error message you got comes from a different part of the code (my bad, I should have read more carefully).

Can I see the RUNDOMAIN on your config and an ncdump -h of one of your WRF files?

ln-x commented 2 years ago

Hi, this is the beginning of config_emep.nml, but I also received the error before I included RUNDOMAIN. The full file is attached.

&Model_config
  GRID      = 'UOZONE',
  iyr_trend = 2018,
  runlabel1 = 'uozone',
!  runlabel2 = 'uozone',
  startdate = 2018,02,06,17,
  enddate   = 2018,03,20,08,
!-----------------------------
  EXP_NAME    = 'EMEPSTD',
  DataPath(1) = '.',
!-----------------------------sub domain x0,x1,y0,y1
  RUNDOMAIN = 30, 100, 20, 90,
!-----------------------------
  meteo                 = './wrfout_d02_2018-02-06_17:00:00',
  DegreeDayFactorsFile  = './DegreeDayFactors.nc',
  EmisHeightsFile       = './EmisHeights.txt',
  MonthlyFacFile        = './MonthlyFac.POLL',
  DailyFacFile          = './DailyFac.POLL',
  HourlyFacFile         = './HourlyFacs.INERIS',
  EMEP_EuroBVOCFile     = './EMEP_EuroBVOC.nc',
  DustFile              = './Dust.nc',
  TopoFile              = './topoGRID.nc',
  SitesFile             = './sites.dat',
  SondesFile            = './sondes.dat',
!------------------------------
[...]

config_emep.nml.txt

ln-x commented 2 years ago

This is the top of the ncdump -h output for my meteorological forcing file. (The full output is attached as a file.)

netcdf wrfout_d02_2018-02-06_17\:00\:00 {
dimensions:
        Time = UNLIMITED ; // (1000 currently)
        DateStrLen = 19 ;
        west_east = 99 ;
        south_north = 99 ;
        bottom_top = 39 ;
        bio_emissions_dimension_stag = 41 ;
        klevs_for_dvel = 1 ;
        bottom_top_stag = 40 ;
        soil_layers_stag = 4 ;
        west_east_stag = 100 ;
        south_north_stag = 100 ;
variables:
        char Times(Time, DateStrLen) ;
        float CLDFRA2(Time, bottom_top, south_north, west_east) ;
                CLDFRA2:FieldType = 104 ;
                CLDFRA2:MemoryOrder = "XYZ" ;
                CLDFRA2:description = "CLOUD FRACTION" ;
                CLDFRA2:units = "-" ;
                CLDFRA2:stagger = "" ;
                CLDFRA2:coordinates = "XLONG XLAT XTIME" ;
        float RAINPROD(Time, bottom_top, south_north, west_east) ;
                RAINPROD:FieldType = 104 ;
                RAINPROD:MemoryOrder = "XYZ" ;
                RAINPROD:description = "TOTAL RAIN PRODUCTION RATE" ;
                RAINPROD:units = "s-1" ;
                RAINPROD:stagger = "" ;
                RAINPROD:coordinates = "XLONG XLAT XTIME" ;
        float EVAPPROD(Time, bottom_top, south_north, west_east) ;
                EVAPPROD:FieldType = 104 ;
                EVAPPROD:MemoryOrder = "XYZ" ;
                EVAPPROD:description = "RAIN EVAPORATION RATE" ;
                EVAPPROD:units = "s-1" ;
                EVAPPROD:stagger = "" ;
                EVAPPROD:coordinates = "XLONG XLAT XTIME" ;

ncdumpwrf.txt

avaldebe commented 2 years ago

        west_east = 99 ;
        south_north = 99 ;

  RUNDOMAIN = 30, 100, 20, 90,

The domain defined by your meteorology has 99x99 grid points, so your run domain can be at most:

  RUNDOMAIN = 30, 99, 20, 90,

I'm not sure that trimming the RUNDOMAIN will solve the issue, but it is worth a try
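The bounds check described above can be sketched as follows. This is a minimal illustration, assuming RUNDOMAIN holds 1-based, inclusive indices in the order x0, x1, y0, y1; `check_rundomain` is a hypothetical helper, not part of the model.

```python
def check_rundomain(rundomain, nx, ny):
    """Return a list of problems for RUNDOMAIN = x0, x1, y0, y1 on an nx-by-ny grid.

    Assumes 1-based, inclusive indices (an assumption, not taken from the model docs).
    """
    x0, x1, y0, y1 = rundomain
    problems = []
    if not 1 <= x0 <= x1:
        problems.append(f"x range {x0}..{x1} is empty or starts below 1")
    if not 1 <= y0 <= y1:
        problems.append(f"y range {y0}..{y1} is empty or starts below 1")
    if x1 > nx:
        problems.append(f"x1={x1} exceeds west_east={nx}")
    if y1 > ny:
        problems.append(f"y1={y1} exceeds south_north={ny}")
    return problems


if __name__ == "__main__":
    # Grid size taken from the ncdump header quoted in this thread.
    for candidate in ((30, 100, 20, 90), (30, 99, 20, 90)):
        print(candidate, check_rundomain(candidate, nx=99, ny=99) or "fits")
```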

ln-x commented 2 years ago

Hi,

I tried, but this does not solve the error. Any other ideas where I could look for errors? Is it possible that the domain is simply too small for some other reason? The error is raised from within NetCDF_mod.f90 (line 1952):

./NetCDF_mod.f90: write(*,*)'buffer too small',dims(1),varGIMAX,dims(2),varGJMAX
./NetCDF_mod.f90: Call StopAll('GetCDF buffer too small')
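Going by the argument order in the write statement quoted above, the four numbers in the message are dims(1), varGIMAX, dims(2), varGJMAX. The sketch below splits them accordingly. Reading dims as the size of the variable in the input file and varGIMAX/varGJMAX as the size of the receiving buffer is an assumption based on the names only; `parse_buffer_error` is a hypothetical helper.

```python
import re


def parse_buffer_error(line):
    """Split the integers printed after 'buffer too small' into file and buffer sizes.

    Assumed order, from the quoted write statement: dims(1), varGIMAX, dims(2), varGJMAX.
    """
    match = re.search(r"buffer too small\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)", line)
    if match is None:
        return None
    file_nx, buf_nx, file_ny, buf_ny = map(int, match.groups())
    return {"file": (file_nx, file_ny), "buffer": (buf_nx, buf_ny)}


if __name__ == "__main__":
    info = parse_buffer_error("buffer too small 132 99 159 99")
    print("input file grid:", info["file"])
    print("model buffer   :", info["buffer"])
```

Under that reading, the message points to an input file on a 132x159 grid being read into a 99x99 buffer, so the file to look for is one that is not on the meteorology grid.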

From emepctm.out I figured out that the file read in after Dust.nc was the one concerned (setting USES%DUST = F did not affect the "buffer too small" error). I also knew from ncdump -h that DustDayFactor.nc and topoEECCA.nc have the dimensions i=132, j=159. So I first tried setting USES%DEGREEDAY_FACTORS = F, which still raised a "buffer too small" error. Then I commented out "topoGRID.nc" in config_emep.nml. Now EMEP is running and the first results look plausible. But where does EMEP take the topographic information from now?

Heidi


PS: If I include DEBUG%DEBUG_NETCDF = T or DEBUG%DEBUG_DUST = T in config_emep.nml, I receive an error (see below). How do I set the debug option?

forrtl: severe (19): invalid reference to variable in NAMELIST input, unit 28, file /binfl/lv71449/htrimmel3/emep-ctm-2018-05-01d1/config_emep.nml, line 13, position 15

gitpeterwind commented 2 years ago

Hi, the domain is too small compared to the number of procs. Try to reduce the number of cpus (tasks)

avaldebe commented 2 years ago

From emepctm.out I figured out that the file read in after Dust.nc was the one concerned (setting USES%DUST = F did not affect the "buffer too small" error). I also knew from ncdump -h that DustDayFactor.nc and topoEECCA.nc have the dimensions i=132, j=159. So I first tried setting USES%DEGREEDAY_FACTORS = F, which still raised a "buffer too small" error. Then I commented out "topoGRID.nc" in config_emep.nml. Now EMEP is running and the first results look plausible. But where does EMEP take the topographic information from now?

The topography file needs to have exactly the same grid as the meteorology. Without topographic info, some pollutant sources will be switched off, such as the passive SO2 degassing from volcanoes and volcanic ash from historic eruptions.
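One way to compare the two grids before a run is to compare the dimension sizes reported by ncdump -h. The sketch below parses the "dimensions:" section of such a header and compares the horizontal sizes. The meteorology header is abbreviated from the output posted in this thread. The topography header is a hypothetical stand-in that uses the i=132, j=159 sizes reported above, and the helper names are illustrative. Matching sizes are a necessary check only; they do not prove that the two grids cover the same area.

```python
import re

# Matches lines such as "west_east = 99 ;" or "Time = UNLIMITED ; // (1000 currently)".
DIM_RE = re.compile(r"^(\w+)\s*=\s*(\d+|UNLIMITED)\s*;(?:\s*//\s*\((\d+) currently\))?")

# Abbreviated from the ncdump -h output posted in this thread.
MET_HEADER = """netcdf wrfout_d02 {
dimensions:
        Time = UNLIMITED ; // (1000 currently)
        west_east = 99 ;
        south_north = 99 ;
variables:
        char Times(Time, DateStrLen) ;
"""

# Hypothetical header for a topography file on a 132x159 grid.
TOPO_HEADER = """netcdf topo_example {
dimensions:
        i = 132 ;
        j = 159 ;
variables:
        float topography(j, i) ;
"""


def parse_dimensions(header_text):
    """Parse the 'dimensions:' section of ncdump -h output into {name: size}."""
    dims = {}
    in_dims = False
    for raw in header_text.splitlines():
        line = raw.strip()
        if line == "dimensions:":
            in_dims = True
        elif line == "variables:":
            break
        elif in_dims:
            match = DIM_RE.match(line)
            if match:
                name, size, current = match.groups()
                if size == "UNLIMITED":
                    dims[name] = int(current) if current else 0
                else:
                    dims[name] = int(size)
    return dims


def horizontal_shape(dims, x_name, y_name):
    """Return the (x, y) sizes for the named horizontal dimensions."""
    return dims[x_name], dims[y_name]


if __name__ == "__main__":
    met = horizontal_shape(parse_dimensions(MET_HEADER), "west_east", "south_north")
    topo = horizontal_shape(parse_dimensions(TOPO_HEADER), "i", "j")
    print("meteorology:", met, "topography:", topo, "same size:", met == topo)
```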

ln-x commented 2 years ago

Hi, the domain is too small compared to the number of procs. Try to reduce the number of cpus (tasks)

I am using: #SBATCH --nodes=1 --ntasks-per-node=1

ln-x commented 2 years ago

Thank you for your replies! Could you answer my questions regarding the debug options? Then I can close this issue.

"If I include DEBUG%DEBUG_NETCDF = T or DEBUG%DEBUG_DUST = T in config_emep.nml, I receive an error (see below). How do I set the debug option?"

forrtl: severe (19): invalid reference to variable in NAMELIST input, unit 28, file /binfl/lv71449/htrimmel3/emep-ctm-2018-05-01d1/config_emep.nml, line 13, position 15

avaldebe commented 2 years ago

"If I include DEBUG%DEBUG_NETCDF = T or DEBUG%DEBUG_DUST = T in config_emep.nml, I receive an error (see below). How do I set the debug option?"

The DEBUG configuration variable does not expose DEBUG_NETCDF and DEBUG_DUST, i.e. there are no DEBUG%DEBUG_NETCDF/DEBUG%DEBUG_DUST configuration options.

DEBUG_NETCDF and DEBUG_DUST are constants (parameters in Fortran) defined in Debug_module:

https://github.com/metno/emep-ctm/blob/23ae40f91edbaa4e8b1613ed8c44d6ab0562b3b5/Debug_module.f90#L95-L118

To produce the debug output you want, you need to edit their definitions and recompile the program.

ln-x commented 2 years ago

Ah! Thank you. Now it is clear.