s117 closed this issue 4 years ago
The trace shows that __do_mmap() returned soon after __vmr_alloc() returned.
Looking at the raw trace after cycle 494910398 (just after __vmr_alloc() returned), we can see that a branch at PC 0x3c2c is fetched right away, and it redirects the next PC to 0x3d34.
This branch comes from pk/vm.c, line 238, and its taken path is on line 239:
$ riscv64-unknown-linux-gnu-addr2line -e pk 0x3c2c 0x3d34
pk/vm.c:238
pk/vm.c:239
So __do_mmap() failed because there was no free VMR left.
The number of VMRs is statically defined in PK. I tried increasing the total from 32 to 64, but it doesn't make much difference:
$ spike -m16384 pk -c nab_s_base.riscv-m64 3j1n 20140317 220 # with 64 VMR PK
Requesting target memory 0x400000000
******* Resetting core **********
****Initializing the processor system****
******* Resetting core **********
******* Resetting core **********
****Initialization complete****
nabmd 3j1n 20140317
Reading .pdb file (3j1n/3j1n.pdb)
Reading .prm file (3j1n/3j1n.prm)
title:
default_name
assertion failed: __do_mmap(current.brk, newbrk_page - current.brk, -1, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, 0, 0) == current.brk
cycle = 494819233
instret = 494819240
******* Resetting core **********
If __do_mmap() is called frequently but the mapped vpages stay unpopulated, the VMRs are exhausted quickly.
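To make the exhaustion mechanism concrete, here is a minimal stand-alone model, not the actual pk/vm.c code. It assumes that each lazy mapping holds one slot of a fixed pool until all of its vpages are populated; the names `vmr_pool`, `vmr_alloc`, `vmr_populate_page` and `mappings_until_exhaustion` are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_VMR 32  /* pool size; 32 is the original PK value mentioned in this thread */

/* Hypothetical, stripped-down VMR: only what this model needs. */
typedef struct {
    size_t   first_vpage;  /* first virtual page covered by the mapping */
    unsigned refcnt;       /* unpopulated vpages still using this slot; 0 = free */
} vmr_t;

static vmr_t vmr_pool[MAX_VMR];

/* Mark every slot free. */
static void vmr_pool_reset(void)
{
    memset(vmr_pool, 0, sizeof vmr_pool);
}

/* Claim a free slot for a lazy mapping of npages; NULL if the pool is full. */
static vmr_t *vmr_alloc(size_t first_vpage, unsigned npages)
{
    assert(npages > 0);
    for (size_t i = 0; i < MAX_VMR; i++) {
        if (vmr_pool[i].refcnt == 0) {
            vmr_pool[i].first_vpage = first_vpage;
            vmr_pool[i].refcnt = npages;
            return &vmr_pool[i];
        }
    }
    return NULL;
}

/* Populating one vpage drops one reference; at zero the slot is reusable. */
static void vmr_populate_page(vmr_t *v)
{
    assert(v != NULL && v->refcnt > 0);
    v->refcnt--;
}

/* Issue up to 'attempts' single-page lazy mappings and return how many
 * succeeded before vmr_alloc() failed. If touch_pages is nonzero, each
 * page is populated immediately, which releases its slot again. */
static unsigned mappings_until_exhaustion(unsigned attempts, int touch_pages)
{
    unsigned ok = 0;
    vmr_pool_reset();
    for (unsigned i = 0; i < attempts; i++) {
        vmr_t *v = vmr_alloc(i, 1);
        if (v == NULL)
            break;
        if (touch_pages)
            vmr_populate_page(v);
        ok++;
    }
    return ok;
}
```

In this model, `mappings_until_exhaustion(4 * MAX_VMR, 0)` stops after `MAX_VMR` mappings, while the same call with `touch_pages = 1` never runs out of slots. That matches the pattern described here: the pool only drains when mappings pile up without being touched.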
Two solutions should work:
If the VMRs are exhausted and we need one, force-populate some vpages to free up a VMR.
Just as pages are reserved for building the page table, also reserve some space in high memory for VMRs, so PK has more VMRs to use. I cannot simply increase MAX_VMR, because that would result in an oversized kernel that cannot fit into 0x2000~0x10000 (the vaddr range planned for the kernel).
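The first option could look roughly like the sketch below. It is a stand-alone model under assumptions, not a patch against pk/vm.c. `vmr_alloc_or_reclaim`, `pick_victim` and `populate_one_page` are hypothetical names, `populate_one_page` only stands in for whatever actually allocates and maps a physical page, and the "fewest unpopulated pages" victim policy is just one possible choice.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VMR 4  /* deliberately tiny so the fallback path is easy to trigger */

/* Hypothetical, stripped-down VMR: refcnt = unpopulated vpages, 0 = free. */
typedef struct {
    unsigned refcnt;
} vmr_t;

static vmr_t vmr_pool[MAX_VMR];
static unsigned forced_pages;  /* how many vpages were populated by force */

/* Return a free slot, or NULL if every slot is in use. */
static vmr_t *find_free(void)
{
    for (size_t i = 0; i < MAX_VMR; i++)
        if (vmr_pool[i].refcnt == 0)
            return &vmr_pool[i];
    return NULL;
}

/* Stand-in for allocating and mapping one physical page (assumption). */
static void populate_one_page(vmr_t *v)
{
    assert(v->refcnt > 0);
    v->refcnt--;
    forced_pages++;
}

/* Choose the in-use VMR with the fewest unpopulated vpages, so that
 * reclaiming it commits as little physical memory as possible. */
static vmr_t *pick_victim(void)
{
    vmr_t *victim = &vmr_pool[0];
    for (size_t i = 1; i < MAX_VMR; i++)
        if (vmr_pool[i].refcnt < victim->refcnt)
            victim = &vmr_pool[i];
    return victim;
}

/* Allocate a VMR for npages; if the pool is full, force-populate the
 * victim's remaining vpages so that its slot can be reused. */
static vmr_t *vmr_alloc_or_reclaim(unsigned npages)
{
    assert(npages > 0);
    vmr_t *v = find_free();
    if (v == NULL) {
        v = pick_victim();
        while (v->refcnt > 0)
            populate_one_page(v);
    }
    v->refcnt = npages;
    return v;
}
```

With four outstanding mappings of 3, 1, 2 and 5 pages, a fifth request reclaims the one-page mapping and forces a single page in. The trade-off is that forced population spends physical memory early, so a real implementation would also have to handle running out of free pages at that point.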
459.GemsFDTD and 481.wrf, compiled with the same toolchain, also fail at this assertion. I'm not sure whether they have the same cause.
459.GemsFDTD
assertion failed: __do_mmap(current.brk, newbrk_page - current.brk, -1, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, 0, 0) == current.brk
cycle = 2325940963
instret = 2325940970
481.wrf
assertion failed: __do_mmap(current.brk, newbrk_page - current.brk, -1, MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, 0, 0) == current.brk
cycle = 156789950262
instret = 156789950269
Increased the number of VMRs from 32 to 1024: https://github.com/s117/riscv-pk/commit/776b7de8d692beeaf0d518842e2d63e4ba33d6ab
459.GemsFDTD works with this 1024-VMR PK.
Requesting target memory 0x80000000
******* Resetting core **********
****Initializing the processor system****
******* Resetting core **********
******* Resetting core **********
****Initialization complete****
Welcome to GemsFDTD
*******PROBLEMSIZE*************************
problemsize: 210 210 210
*******CELLSIZE****************************
dx: 1.0000000000000000E-002
dy: 1.0000000000000000E-002
dz: 1.0000000000000000E-002
*******NSTEP*******************************
nstep: 1000
*******PLANEWAVE*************************
wavedir: 0.0000000000000000 0.0000000000000000 1.0000000000000000
Epol: 1.0000000000000000 0.0000000000000000 -0.0000000000000000
PulseType: 6
param: 1.2000000000000000E-009 2.0000000000000001E-010 -1111.0999999999999
X0: 0.0000000000000000 0.0000000000000000 0.0000000000000000
Disttobound: 10
***************PEC*************************
Number of components: 28686
*******OUTERBOUNDARY***********************
#cells: 12
R0: 1.0000000000000000E-004
proftype: 4
*******CFL*********************************
CFL: 0.94999999999999996
*******NFTRANS_TD**************************
Disttobound: 15
Theta_angles_no: 4
Theta_interval: 180.00000000000000 360.00000000000000
Phi_angles_no: 1
Phi_interval: 0.0000000000000000 0.0000000000000000
Filenamebase: sphere_td
*******PROGRESS****************************
progress,iteration: 100.00000000000000
dt*c0 = 5.4848275573014449E-003
Number of timesteps to propagate one (x)-cell: 1.82
Ret. time arrays allocated in Huygens_mod, bytes used = 7041024
Searching for patches in PECinit:
7172 X-patches found
7172 Y-patches found
7172 Z-patches found
Fields allocated in UPML_mod, bytes used = 181570800
pml_cells = 12
Using polynomial profile for sigma, with degree 4
Arrays allocated in NFT_mod, bytes used = 57126016
Size of leading dimension for H-fields = 234
Size of leading dimension for E-fields = 235
Fields allocated in leapfrog_mod, bytes used = 618978696
( 171260352 bytes of this is due to UPML)
Total amount of Mbytes allocated = 824
============================================================
Entering timestepping loop
Total number of iterations are 1000
Starting timer in the beginning of the second time step
Iteration no. 100 has been completed
Iteration no. 200 has been completed
Iteration no. 300 has been completed
Iteration no. 400 has been completed
Iteration no. 500 has been completed
Iteration no. 600 has been completed
Iteration no. 700 has been completed
Iteration no. 800 has been completed
Iteration no. 900 has been completed
Iteration no. 1000 has been completed
Ending timestepping loop
No. of Allocations ( 150 )/( 150 ) deallocations. Ok
cycle = 1632340031995
instret = 1632340032002
******* Resetting core **********
The 1024-VMR patch also fixed 481.wrf_ref and 644.nab_s_ref.
However, 481.wrf_ref raised a warning about "IEEE_UNDERFLOW_FLAG" at the end of the program. I don't know yet whether this is a big problem.
$ spike -m2048 pk -c ./wrf_base.riscv
Requesting target memory 0x80000000
******* Resetting core **********
****Initializing the processor system****
******* Resetting core **********
******* Resetting core **********
****Initialization complete****
INITIALIZE THREE Noah LSM RELATED TABLES
INPUT LANDUSE = USGS
LANDUSE TYPE = USGS FOUND 27 CATEGORIES
......
main: time 2001-06-11_12:30:00 em_t_2 11.46939087
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG
cycle = 3067041781830
instret = 3067041781837
******* Resetting core **********
$ spike -m16384 pk -c nab_s_base.riscv-m64 3j1n 20140317 220
Requesting target memory 0x400000000
******* Resetting core **********
****Initializing the processor system****
******* Resetting core **********
******* Resetting core **********
****Initialization complete****
nabmd 3j1n 20140317
Reading .pdb file (3j1n/3j1n.pdb)
Reading .prm file (3j1n/3j1n.prm)
title:
default_name
iter Total
ff: 0 490665981616042475520.00
Initial energy is 490665981616042475520.0000000
Starting molecular dynamics with Born solvation energy...
iter Total
ff: 0 490665981616042475520.00
ff: 220 747711.63
...Done, md returns 0
iter Total
ff: 0 826436.84
Initial energy is 826436.8377200
Starting molecular dynamics with in vaccuo non-bonded energy...
iter Total
ff: 0 826436.84
ff: 220 194297.18
...Done, md returns 0
cycle = 12351191455525
instret = 12351191455532
******* Resetting core **********
Commit da71238 incorporated the 1024-VMR PK patch (https://github.com/s117/riscv-pk/commit/776b7de8d692beeaf0d518842e2d63e4ba33d6ab).
The binary was compiled with the Linux toolchain at https://github.com/s117/riscv-gnu-toolchain/commit/d0bdaa9a282a32cc68e6203098dc1162021ceba7