flang-compiler / flang

Flang is a Fortran language front-end designed for integration with LLVM.

521.wrf_r fails with Runtime Error with Flang #216

Open sekharbvrs opened 7 years ago

sekharbvrs commented 7 years ago

521.wrf_r in SPEC CPU2017 FP fails with a runtime error when built with flang. It fails after handling this file read call in module_ra_rrtm:

IF ( wrf_dm_on_monitor() ) READ (rrtm_unit,ERR=9010) abscoefL1, abscoefH1, SELFREF1

pawosm-arm commented 7 years ago

Can you share which compilation flags you used?

sekharbvrs commented 7 years ago

It fails with -m64 -O1 on a Haswell machine.

sscalpone commented 7 years ago

It fails with -O2 on Haswell with the 'test' data set, and passes with -O2 with the train and ref data sets. With the test data set it runs to completion but shows these diffs in the final results:

0105: RAINC PASSED
      RAINC 118 2 0.3299484612E-01 0.3333083425E-01 1 0.1892E-02 0.9289E-01
      ^
0214: CANWAT PASSED
      CANWAT 694 2 0.8641670423E-06 0.8538953710E-06 1 0.1206E-06 0.1620E+00
      ^
0353: CANWAT PASSED
      CANWAT 691 2 0.2099058840E-05 0.2011426787E-05 1 0.3017E-06 0.1653E+00

AndreiLux commented 5 years ago

I'm hijacking this for a related issue.

First of all, if you're getting I/O errors, do not forget to compile with -Mbyteswapio, as the input data set is in big-endian form and I assume most people here are compiling on little-endian systems.
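To illustrate why byte order matters for an unformatted READ, here is a minimal Python sketch. It assumes the common sequential unformatted layout in which each record is framed by a 4-byte length marker before and after the payload; that layout, the helper name `read_record`, and the sample values are assumptions for illustration and are not taken from the WRF input files.

```python
import struct

def read_record(buf, offset, byteorder):
    """Read one length-framed record; return (payload, next_offset).

    byteorder is ">" for big-endian or "<" for little-endian.
    Assumes 4-byte leading and trailing length markers (an assumption).
    """
    fmt = byteorder + "i"
    (head,) = struct.unpack_from(fmt, buf, offset)
    start = offset + 4
    end = start + head
    if head < 0 or end + 4 > len(buf):
        raise ValueError(f"implausible record length {head}")
    (tail,) = struct.unpack_from(fmt, buf, end)
    if tail != head:
        raise ValueError("leading and trailing record markers differ")
    return buf[start:end], end + 4

# Build a big-endian record holding three 4-byte reals (made-up values).
payload = struct.pack(">3f", 1.5, -2.25, 3.0)
marker = struct.pack(">i", len(payload))
record = marker + payload + marker

# Reading with the matching byte order recovers the values.
data, next_offset = read_record(record, 0, ">")
decoded = struct.unpack(">3f", data)

# Reading with the wrong byte order misinterprets the length marker.
try:
    read_record(record, 0, "<")
    little_endian_ok = True
except ValueError:
    little_endian_ok = False

print(decoded, little_endian_ok)
```

Read with the wrong byte order, the 12-byte length marker becomes a huge number, which is the kind of mismatch a byte-swapping I/O option is meant to avoid.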


Having fixed that, the benchmark continues on to a segmentation fault:


Starting program: /mnt/d/SPEC2006/benchdir/521.wrf_r
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
--- WARNING: traj_opt is zero, but num_traj is not zero; setting num_traj to zero.
--- NOTE: sst_update is 0, setting io_form_auxinput4 = 0 and auxinput4_interval = 0 for all domains
--- NOTE: grid_fdda is 0 for domain      1, setting gfdda interval and ending time to 0 for that domain.
--- NOTE: both grid_sfdda and pxlsm_soil_nudge are 0 for domain      1, setting sgfdda interval and ending time to 0 for that domain.
--- NOTE: obs_nudge_opt is 0 for domain      1, setting obs nudging interval and ending time to 0 for that domain.
bl_pbl_physics /= 4, implies mfshconv must be 0, resetting
--- NOTE: num_soil_layers has been set to      4
WRF V3.6.1 MODEL
 *************************************
 Parent domain
 ids,ide,jds,jde             1           74            1           61
 ims,ime,jms,jme            -4           79           -4           66
 ips,ipe,jps,jpe             1           74            1           61
 *************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
   alloc_space_field: domain             1 ,                 128355772  bytes allocated
  med_initialdata_input: calling input_input
   Resetting the hypsometric_opt from default value of 2 to 1
Timing for processing wrfinput file (stream 0) for domain        1:    0.02166 elapsed seconds
INPUT LandUse = "USGS"
 LANDUSE TYPE = "USGS" FOUND           33  CATEGORIES            2  SEASONS WATER CATEGORY =            16  SNOW CATEGORY =            24
INITIALIZE THREE Noah LSM RELATED TABLES
 LANDUSE TYPE = USGS FOUND           27  CATEGORIES
 INPUT SOIL TEXTURE CLASSIFICATION = STAS
 SOIL TEXTURE CLASSIFICATION = STAS FOUND           19  CATEGORIES
Timing for processing lateral boundary for domain        1:    0.00537 elapsed seconds
 Tile Strategy is not specified. Assuming 1D-Y
WRF TILE   1 IS      1 IE     74 JS      1 JE     61
WRF NUMBER OF TILES =   1
Timing for main: time 2000-01-24_12:01:00 on domain   1:    0.41608 elapsed seconds
Timing for main: time 2000-01-24_12:02:00 on domain   1:    0.17695 elapsed seconds
Timing for main: time 2000-01-24_12:03:00 on domain   1:    0.19176 elapsed seconds
Timing for main: time 2000-01-24_12:04:00 on domain   1:    0.19540 elapsed seconds

Program received signal SIGSEGV, Segmentation fault.
0x0000000000ddd06b in module_cu_kfeta::kf_eta_para () at 521.wrf_r/CMakeFiles/521.wrf_r.dir/module_cu_kfeta.F90-pp.f90:865
865          !     NK = LC-1
(gdb) backtrace
#0  0x0000000000ddd06b in module_cu_kfeta::kf_eta_para () at 521.wrf_r/CMakeFiles/521.wrf_r.dir/module_cu_kfeta.F90-pp.f90:865
#1  0x0000000000ddaf00 in module_cu_kfeta::kf_eta_cps () at 521.wrf_r/CMakeFiles/521.wrf_r.dir/module_cu_kfeta.F90-pp.f90:428
#2  0x0000000000e8b813 in module_cumulus_driver::cumulus_driver () at 521.wrf_r/CMakeFiles/521.wrf_r.dir/module_cumulus_driver.F90-pp.f90:711
#3  0x00000000010e697c in module_first_rk_step_part1::first_rk_step_part1 () at 521.wrf_r/CMakeFiles/521.wrf_r.dir/module_first_rk_step_part1.F90-pp.f90:996
#4  0x0000000001706aa2 in solve_em () at 521.wrf_r/CMakeFiles/521.wrf_r.dir/solve_em.F90-pp.f90:699
#5  0x0000000001743aba in solve_interface () at 521.wrf_r/CMakeFiles/521.wrf_r.dir/solve_interface.F90-pp.f90:38
#6  0x00000000011330e3 in module_integrate::integrate () at 521.wrf_r/CMakeFiles/521.wrf_r.dir/module_integrate.F90-pp.f90:313
#7  0x00000000016aded6 in module_wrf_top::wrf_run () at 521.wrf_r/CMakeFiles/521.wrf_r.dir/module_wrf_top.F90-pp.f90:422
#8  0x000000000177524e in wrf () at 521.wrf_r/CMakeFiles/521.wrf_r.dir/wrf.F90-pp.f90:31
#9  0x00000000004ec156 in main ()
#10 0x00000000019d12d9 in __libc_start_main ()
#11 0x00000000004ec02a in _start ()
(gdb)

Related code snippet:

! ...calculate dewpoint using lookup table...
!
          astrt=1.e-3
          ainc=0.075
          a1=emix/aliq
          tp=(a1-astrt)/ainc
          indlu=int(tp)+1
          value=(indlu-1)*ainc+astrt
          aintrp=(a1-value)/ainc
          tlog=aintrp*alu(indlu+1)+(1-aintrp)*alu(indlu)
          TDPT=(CLIQ-DLIQ*TLOG)/(BLIQ-TLOG)
          TLCL=TDPT-(.212+1.571E-3*(TDPT-T00)-4.36E-4*(TMIX-T00))*(TMIX-TDPT)
          TLCL=AMIN1(TLCL,TMIX)
          TVLCL=TLCL*(1.+0.608*QMIX)
          ZLCL = ZMIX+(TLCL-TMIX)/GDRY
     !     NK = LC-1
     !     DO 
     !       NK = NK+1
     !       KLCL=NK
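Since gdb points at a commented-out line, the faulting statement is probably one of the executable lines just above it. One thing that could be checked there is whether `indlu` stays inside the `alu` table. The Python sketch below mirrors the index arithmetic from the snippet; the table size of 200 and the sample inputs are hypothetical, and nothing in this thread confirms that an out-of-range index is the actual cause.

```python
ASTRT = 1.0e-3
AINC = 0.075
TABLE_SIZE = 200  # hypothetical size of the alu lookup table

def lookup_index(a1):
    """Mirror the Fortran arithmetic: return (indlu, aintrp) for input a1."""
    tp = (a1 - ASTRT) / AINC
    indlu = int(tp) + 1  # int() truncates toward zero, like Fortran INT
    value = (indlu - 1) * AINC + ASTRT
    aintrp = (a1 - value) / AINC
    return indlu, aintrp

def in_bounds(indlu, size=TABLE_SIZE):
    """The snippet reads alu(indlu) and alu(indlu+1) with 1-based indexing."""
    return indlu >= 1 and indlu + 1 <= size

for a1 in (1.0, 20.0, -1.0):  # made-up inputs
    indlu, aintrp = lookup_index(a1)
    print(a1, indlu, round(aintrp, 3), in_bounds(indlu))
```

Printing `emix`, `aliq`, and `indlu` just before the `tlog` assignment in the real code would show whether the index leaves the table in the failing time step.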

I'm building this outside of the SPEC toolset, but with the official flags.

Am I missing part of the literal.pm preprocessor requirements of the toolset?

d-parks commented 5 years ago

Are you running with an unlimited stack size? I can't tell from the log snippet you've included.

AndreiLux commented 5 years ago

I had it raised to 16MB. I just tried again with 256MB and it segfaults at the same location. I'm using the pre-built binaries; I'll try to build it from the latest master when I get to it.

d-parks commented 5 years ago

Those limits might be small. Also, have you enabled OpenMP? If so, you'll also need to set the environment variable OMP_STACKSIZE (can't remember if there is a hyphen between STACK and SIZE or not).

AndreiLux commented 5 years ago

No OpenMP at the moment, currently I'm just trying to get it to run at all.

I read that someone using the Intel Fortran compiler was able to run it with just 16MB; to make sure, I tried again with 4GB. Unless the issue is that my linker actually isn't correctly setting the stack size.

Edit: Nope, the stack is being set correctly:

andrei@DESKTOP-02D5VQL:/mnt/d/SPEC2006/benchdir$ readelf -We 521.wrf_r | grep "STACK"
readelf: Warning: [ 1]: Link field (0) should index a symtab section.
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x100000000 RW  0
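As a quick check of the figure above, this small Python sketch decodes the GNU_STACK line. It assumes the usual readelf program-header column order (Type, Offset, VirtAddr, PhysAddr, FileSiz, MemSiz, Flg, Align); the helper name `parse_gnu_stack` is made up for illustration. It only decodes what readelf printed and says nothing about which limit is applied at run time.

```python
def parse_gnu_stack(line):
    """Return (memsiz_bytes, flags) from a readelf -W GNU_STACK line."""
    fields = line.split()
    if not fields or fields[0] != "GNU_STACK":
        raise ValueError("not a GNU_STACK program header line")
    memsiz = int(fields[5], 16)  # MemSiz column, hexadecimal
    flags = fields[6]            # Flg column, e.g. "RW"
    return memsiz, flags

line = ("  GNU_STACK      0x000000 0x0000000000000000 "
        "0x0000000000000000 0x000000 0x100000000 RW  0")
memsiz, flags = parse_gnu_stack(line)
print(memsiz // 2**30, "GiB", flags)
```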

d-parks commented 5 years ago

Are you using the bash shell? Can you try ulimit -s unlimited?
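To see which stack limit a newly started process actually gets, a minimal Python sketch like the following can help. It relies on the standard `resource` module (Unix only) and compares the limit of the current process with the one reported by a child process it launches; resource limits are normally inherited across fork/exec. The helper names are made up for illustration.

```python
import resource
import subprocess
import sys

def describe(limit):
    """Format a limit in KiB, or 'unlimited' for RLIM_INFINITY."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit // 1024} KiB"

def child_stack_limit():
    """Start a child interpreter and return its (soft, hard) RLIMIT_STACK."""
    code = "import resource; print(*resource.getrlimit(resource.RLIMIT_STACK))"
    out = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, check=True,
    ).stdout
    soft, hard = (int(x) for x in out.split())
    return soft, hard

parent = resource.getrlimit(resource.RLIMIT_STACK)
child = child_stack_limit()
print("parent soft limit:", describe(parent[0]))
print("child soft limit: ", describe(child[0]))
```

Running it once before and once after `ulimit -s unlimited` in the same shell shows whether the change reaches programs started from that shell.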


AndreiLux commented 5 years ago

The shell's stack size doesn't matter as this is a new process and not a new thread that would inherit its size; setting it to unlimited doesn't change the segfault.

I'll try to build the newest master and work my way from there.

sscalpone commented 5 years ago

This issue came up today in the flang-compiler Slack, in the #general channel.

furuame commented 5 years ago

Hi all! I successfully built and ran 521.wrf_r with the test workload at commit 65cab13188dd14c5ef2d575499097e0da1dd4c0f:

Setting Up Run Directories
  Setting up 521.wrf_r test base mytest-m64 (1 copy): run_base_test_mytest-m64.0001
Running Benchmarks
  Running 521.wrf_r test base mytest-m64 (1 copy) [2019-05-12 01:56:37]
Success: 1x521.wrf_r
Producing Raw Reports
 label: mytest-m64
  workload: test
   metric: SPECrate2017_fp_base
    format: raw -> /home/cwei/spec2017/result/CPU2017.262.fprate.test.rsf
Parsing flags for 521.wrf_r base: done
Doing flag reduction: done
    format: flags -> /home/cwei/spec2017/result/CPU2017.262.fprate.test.flags.html
    format: cfg -> /home/cwei/spec2017/result/CPU2017.262.fprate.test.cfg, /home/cwei/spec2017/result/CPU2017.262.fprate.test.orig.cfg
    format: CSV -> /home/cwei/spec2017/result/CPU2017.262.fprate.test.csv
    format: PDF -> /home/cwei/spec2017/result/CPU2017.262.fprate.test.pdf
    format: HTML -> /home/cwei/spec2017/result/CPU2017.262.fprate.test.html
    format: Text -> /home/cwei/spec2017/result/CPU2017.262.fprate.test.txt
The log for this run is in /home/cwei/spec2017/result/CPU2017.262.log

And thanks @AndreiLux for pointing out the big-endian issue.