s117 / anycore-riscv

The AnyCore toolset targeting the RISC-V ISA

644.nab_s_ref failed at PK assertion #11

Closed: s117 closed this issue 4 years ago

s117 commented 4 years ago

The binary was compiled with the Linux toolchain at https://github.com/s117/riscv-gnu-toolchain/commit/d0bdaa9a282a32cc68e6203098dc1162021ceba7

$ spike -m16384 pk -c nab_s_base.riscv-m64 3j1n 20140317 220
Requesting target memory 0x400000000
******* Resetting core **********
****Initializing the processor system****
******* Resetting core **********
******* Resetting core **********
****Initialization complete****
nabmd 3j1n 20140317

Reading .pdb file (3j1n/3j1n.pdb)
Reading .prm file (3j1n/3j1n.prm)
title:
default_name
assertion failed: __do_mmap(current.brk, newbrk_page - current.brk, -1, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, 0, 0) == current.brk
cycle = 494377042
instret = 494377049
******* Resetting core **********
s117 commented 4 years ago

The trace shows that __do_mmap() returns soon after __vmr_alloc() returns:

[image: trace screenshot]

In the raw trace after cycle 494910398 (just after __vmr_alloc() returns), a branch at PC 0x3c2c is fetched immediately after the return, and it redirects the next PC to 0x3d34.

[image: raw trace screenshot]

This branch comes from pk/vm.c, line 238, and its taken path is on line 239:

$ riscv64-unknown-linux-gnu-addr2line -e pk 0x3c2c 0x3d34            
pk/vm.c:238
pk/vm.c:239

[image: screenshot]

So __do_mmap() failed because there was no free VMR left.

s117 commented 4 years ago

The number of VMRs is statically defined in PK. I tried increasing the total from 32 to 64, but it does not make much difference:

$ spike -m16384 pk -c nab_s_base.riscv-m64 3j1n 20140317 220  # with 64 VMR PK
Requesting target memory 0x400000000
******* Resetting core **********
****Initializing the processor system****
******* Resetting core **********
******* Resetting core **********
****Initialization complete****
nabmd 3j1n 20140317

Reading .pdb file (3j1n/3j1n.pdb)
Reading .prm file (3j1n/3j1n.prm)
title:
default_name
assertion failed: __do_mmap(current.brk, newbrk_page - current.brk, -1, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, 0, 0) == current.brk
cycle = 494819233
instret = 494819240
******* Resetting core **********
s117 commented 4 years ago

If __do_mmap() is called frequently but the mapped vpages stay unpopulated, the VMRs are soon exhausted.
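The exhaustion described above can be illustrated with a toy model. This is only a sketch, not PK's code: it assumes each VMR slot counts the pages of its mapping that are still unpopulated and becomes free again only when that count reaches zero. All `toy_*` names and `TOY_PAGE_SIZE` are hypothetical; the limit of 32 is the original value mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_VMR 32       /* original limit mentioned in this thread */
#define TOY_PAGE_SIZE 4096u  /* assumed page size, for illustration only */

typedef struct {
  uintptr_t addr;
  size_t length;
  unsigned refcnt;  /* pages not yet populated; 0 means the slot is free */
} toy_vmr_t;

/* Fixed-size table: zero-initialized, so every slot starts out free. */
static toy_vmr_t toy_vmrs[TOY_MAX_VMR];

/* Claim a free slot for a mapping of npages pages; NULL if none is left. */
static toy_vmr_t *toy_vmr_alloc(uintptr_t addr, size_t npages)
{
  assert(npages > 0);
  for (toy_vmr_t *v = toy_vmrs; v < toy_vmrs + TOY_MAX_VMR; v++) {
    if (v->refcnt == 0) {
      v->addr = addr;
      v->length = npages * TOY_PAGE_SIZE;
      v->refcnt = (unsigned)npages;
      return v;
    }
  }
  return NULL;
}

/* Lazy mapping: returns addr on success, (uintptr_t)-1 when no slot is free. */
static uintptr_t toy_do_mmap(uintptr_t addr, size_t npages)
{
  return toy_vmr_alloc(addr, npages) ? addr : (uintptr_t)-1;
}

/* Populating one page drops one reference; at zero the slot is reusable. */
static void toy_populate_page(toy_vmr_t *v)
{
  assert(v->refcnt > 0);
  v->refcnt--;
}
```

In this model, 32 one-page mappings that are never touched fill the table, and the 33rd `toy_do_mmap()` call fails until `toy_populate_page()` releases a slot.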

Two solutions should work:

  1. If the VMRs are exhausted and a new VMR is needed, force-populate some vpages to free one up.

  2. Just as pages are reserved for building the page table, also reserve some space in high memory for VMRs, so that PK has more VMRs to use. I cannot simply increase MAX_VMR, because that would produce an oversized kernel that no longer fits into 0x2000~0x10000 (the vaddr range planned for the kernel).
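As a rough sketch of the second option, the VMR table could be carved out of the top of a memory region at initialization instead of being a static array inside the kernel image. Nothing here is taken from PK: `toy_reserve_vmrs`, `toy_vmr_pool_t`, and `TOY_PAGE_SIZE` are hypothetical names, and the sketch assumes `mem_base` and `mem_size` are page-aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u  /* assumed page size, for illustration only */

typedef struct {
  uintptr_t addr;
  size_t length;
  unsigned refcnt;  /* 0 means the slot is free */
} toy_vmr_t;

typedef struct {
  toy_vmr_t *table;      /* NULL if the reservation failed */
  size_t count;          /* number of usable slots in table */
  uintptr_t usable_top;  /* memory at or above this address is reserved */
} toy_vmr_pool_t;

/* Reserve room for `count` VMR slots at the top of [mem_base, mem_base + mem_size). */
static toy_vmr_pool_t toy_reserve_vmrs(uintptr_t mem_base, size_t mem_size, size_t count)
{
  toy_vmr_pool_t pool = { NULL, 0, mem_base + mem_size };

  /* Reject empty requests and requests that cannot fit (also avoids overflow). */
  if (count == 0 || count > mem_size / sizeof(toy_vmr_t))
    return pool;

  /* Round the table size up to whole pages. */
  size_t bytes = count * sizeof(toy_vmr_t);
  bytes = (bytes + TOY_PAGE_SIZE - 1) & ~(size_t)(TOY_PAGE_SIZE - 1);
  if (bytes > mem_size)
    return pool;

  pool.usable_top = mem_base + mem_size - bytes;
  pool.table = (toy_vmr_t *)pool.usable_top;
  pool.count = count;
  memset(pool.table, 0, bytes);  /* all slots start out free */
  return pool;
}
```

With this layout the slot count is bounded by available memory rather than by the kernel's address window, at the cost of lowering the top of usable memory by the size of the table.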

s117 commented 4 years ago

459.GemsFDTD and 481.wrf, compiled with the same toolchain, also failed at this assertion. I am not sure whether they have the same cause.

459.GemsFDTD

assertion failed: __do_mmap(current.brk, newbrk_page - current.brk, -1, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, 0, 0) == current.brk
cycle = 2325940963
instret = 2325940970

481.wrf

assertion failed: __do_mmap(current.brk, newbrk_page - current.brk, -1, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, 0, 0) == current.brk
cycle = 156789950262
instret = 156789950269
s117 commented 4 years ago

Increased the number of VMRs from 32 to 1024: https://github.com/s117/riscv-pk/commit/776b7de8d692beeaf0d518842e2d63e4ba33d6ab

459.GemsFDTD works with this 1024-VMR PK.

Requesting target memory 0x80000000
******* Resetting core ********** 
****Initializing the processor system****
******* Resetting core ********** 
******* Resetting core ********** 
****Initialization complete****
 Welcome to GemsFDTD

 *******PROBLEMSIZE*************************
 problemsize:         210         210         210
 *******CELLSIZE****************************
 dx:   1.0000000000000000E-002
 dy:   1.0000000000000000E-002
 dz:   1.0000000000000000E-002
 *******NSTEP*******************************
 nstep:        1000
 *******PLANEWAVE*************************
 wavedir:   0.0000000000000000        0.0000000000000000        1.0000000000000000     
 Epol:   1.0000000000000000        0.0000000000000000       -0.0000000000000000     
 PulseType:           6
 param:   1.2000000000000000E-009   2.0000000000000001E-010  -1111.0999999999999     
 X0:   0.0000000000000000        0.0000000000000000        0.0000000000000000     
 Disttobound:          10
 ***************PEC*************************
 Number of components:        28686
 *******OUTERBOUNDARY***********************
 #cells:          12
 R0:    1.0000000000000000E-004
 proftype:           4
 *******CFL*********************************
 CFL:  0.94999999999999996     
 *******NFTRANS_TD**************************
 Disttobound:          15
 Theta_angles_no:           4
 Theta_interval:   180.00000000000000        360.00000000000000     
 Phi_angles_no:           1
 Phi_interval:   0.0000000000000000        0.0000000000000000     
 Filenamebase: sphere_td
 *******PROGRESS****************************
 progress,iteration:   100.00000000000000     

 dt*c0 =    5.4848275573014449E-003
 Number of timesteps to propagate one (x)-cell:  1.82
 Ret. time arrays allocated in Huygens_mod, bytes used =      7041024
 Searching for patches in PECinit:
        7172 X-patches found
        7172 Y-patches found
        7172 Z-patches found
 Fields allocated in UPML_mod, bytes used =    181570800

 pml_cells =           12
 Using polynomial profile for sigma, with degree           4
 Arrays allocated in NFT_mod, bytes used =     57126016
 Size of leading dimension for H-fields = 234
 Size of leading dimension for E-fields = 235
 Fields allocated in leapfrog_mod, bytes used =    618978696
 (   171260352  bytes of this is due to UPML)

 Total amount of Mbytes allocated =   824

 ============================================================
 Entering timestepping loop
 Total number of iterations are         1000
 Starting timer in the beginning of the second time step
 Iteration no.          100  has been completed
 Iteration no.          200  has been completed
 Iteration no.          300  has been completed
 Iteration no.          400  has been completed
 Iteration no.          500  has been completed
 Iteration no.          600  has been completed
 Iteration no.          700  has been completed
 Iteration no.          800  has been completed
 Iteration no.          900  has been completed
 Iteration no.         1000  has been completed
 Ending timestepping loop
 No. of Allocations (         150 )/(         150 ) deallocations.  Ok 
cycle = 1632340031995
instret = 1632340032002
******* Resetting core ********** 
s117 commented 4 years ago

The 1024-VMR patch also fixed 481.wrf_ref and 644.nab_s_ref.

However, 481.wrf_ref raised a warning about "IEEE_UNDERFLOW_FLAG" at the end of the program. I don't know yet whether this is a serious problem.

$ spike -m2048 pk -c ./wrf_base.riscv

Requesting target memory 0x80000000
******* Resetting core ********** 
****Initializing the processor system****
******* Resetting core ********** 
******* Resetting core ********** 
****Initialization complete****
 INITIALIZE THREE Noah LSM RELATED TABLES
 INPUT LANDUSE = USGS
 LANDUSE TYPE = USGS FOUND          27  CATEGORIES
 ......
main: time 2001-06-11_12:30:00 em_t_2     11.46939087
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG
cycle = 3067041781830
instret = 3067041781837
******* Resetting core ********** 
$ spike -m16384 pk -c nab_s_base.riscv-m64 3j1n 20140317 220

Requesting target memory 0x400000000
******* Resetting core ********** 
****Initializing the processor system****
******* Resetting core ********** 
******* Resetting core ********** 
****Initialization complete****
nabmd 3j1n 20140317

Reading .pdb file (3j1n/3j1n.pdb)
Reading .prm file (3j1n/3j1n.prm)
title:
default_name                                                                    
      iter    Total
ff:     0 490665981616042475520.00
Initial energy is 490665981616042475520.0000000
Starting molecular dynamics with Born solvation energy...

      iter    Total
ff:     0 490665981616042475520.00
ff:   220 747711.63

...Done, md returns 0
      iter    Total
ff:     0 826436.84
Initial energy is 826436.8377200
Starting molecular dynamics with in vaccuo non-bonded energy...

      iter    Total
ff:     0 826436.84
ff:   220 194297.18

...Done, md returns 0
cycle = 12351191455525
instret = 12351191455532
******* Resetting core ********** 
s117 commented 4 years ago

Commit da71238 incorporated the 1024-VMR PK patch (https://github.com/s117/riscv-pk/commit/776b7de8d692beeaf0d518842e2d63e4ba33d6ab).