HewlettPackard / quartz

Quartz: A DRAM-based performance emulator for NVM
https://github.com/HewlettPackard/quartz

NVM delay does not work in the middle of program execution #26

Open Bongss opened 6 years ago

Bongss commented 6 years ago

Hello

I have a problem with the NVM read delay.

In my case, as the data size increases, the NVM read delay seems to stop working in the middle of program execution. If the data size is small, it works well.

I have attached a screenshot, captured in debug mode, of the part where the delay did not work.

[Image: debug-mode capture of the part where the delay did not work]

What should I do?

I look forward to your reply.

My Experiment setup

hadibrais commented 6 years ago

In all of the instances you highlighted, the stall cycle count is zero. The Quartz PM model does not inject any delay when the stall cycle count is zero. This is generally one of the flaws of the model, because a delay might need to be injected even when the stall cycle count is zero. But it is not a bug; it's by design.
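To illustrate why zero stall cycles lead to zero delay, here is a minimal sketch in C. It assumes the delay is obtained by scaling the measured stall cycles by the relative gap between the emulated NVM latency and the DRAM latency. The struct, the function name, and the scaling formula are illustrative assumptions, not code taken from the Quartz source.

```c
#include <stdint.h>

/* Hypothetical latency settings in nanoseconds (not Quartz's config format). */
typedef struct {
    uint64_t dram_latency_ns;
    uint64_t nvm_latency_ns;
} latency_config;

/*
 * Assumed model: the injected delay is proportional to the stall cycles
 * measured in the last interval. With such a model, zero measured stall
 * cycles always yields zero delay, whatever the target NVM latency is.
 */
static uint64_t delay_from_stalls(uint64_t stall_cycles, latency_config cfg)
{
    if (stall_cycles == 0 ||
        cfg.dram_latency_ns == 0 ||
        cfg.nvm_latency_ns <= cfg.dram_latency_ns) {
        return 0; /* nothing to inject */
    }
    return stall_cycles * (cfg.nvm_latency_ns - cfg.dram_latency_ns)
           / cfg.dram_latency_ns;
}
```

With a hypothetical DRAM latency of 100 ns and a target NVM latency of 400 ns, 1000 stall cycles would map to 3000 delay cycles under this assumption, while 0 stall cycles map to 0.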

seojiwon commented 6 years ago

Could a bug be causing the zero stall cycles? In our experiment the stall cycle count should not be zero, because we perform a lot of random memory accesses. When the overall data size in the data structure is not large (700MB), we don't observe this oddity; we only see this behavior (0 stall cycles and 0 delay cycles) when we test with larger data sizes (> 1.5GB), and we observe it in bursts. Is it possible that, as the number of data items in the data structure grows and the overall number of memory accesses increases, something overflows inside Quartz?

guimagalhaes commented 6 years ago

To check whether there were stalls, the emulator needs to read performance counters. The model then calculates the number of cycles to inject: the performance counters are read and the calculations are performed. These steps themselves take several cycles, and the model attempts to account for that overhead by subtracting the overhead cycles from the cycles to be injected. In your scenario, the overhead takes more cycles than need to be injected, so the emulator does not inject anything. Please check whether the architecture/OS is hiding stalls by using large pages, and whether the accesses being performed are really random.
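The overhead compensation described here can be sketched roughly as follows. This is a minimal illustration in C of the subtract-and-clamp idea, not the actual Quartz code. The function name and parameters are hypothetical.

```c
#include <stdint.h>

/*
 * computed_delay:  cycles the model wants to inject for the interval.
 * overhead_cycles: cycles already spent reading counters and computing.
 * Both names are hypothetical and chosen for illustration only.
 */
static uint64_t cycles_to_inject(uint64_t computed_delay,
                                 uint64_t overhead_cycles)
{
    /* Unsigned subtraction would wrap around, so clamp to zero explicitly. */
    if (overhead_cycles >= computed_delay) {
        return 0; /* overhead already covers the delay: inject nothing */
    }
    return computed_delay - overhead_cycles;
}
```

Under this sketch, a computed delay of 500 cycles with an overhead of 2000 cycles results in no injection, which would match intervals that report a zero delay.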