dgist-datalab / trace_generator

Apache License 2.0

.pout particularly smaller than .vout #3

Open Alaric617R opened 1 year ago

Alaric617R commented 1 year ago

Hi, I noticed that when running an extremely large trace, the .pout file can be an order of magnitude smaller than the .vout file. I also noticed that after_run/make_physical_trace_ts.py prints a log line at the end reporting the percentage of "none". What does that represent, and how can the virtual-to-physical mapping be lost?

jwya2149 commented 1 year ago

I'm not sure exactly what caused the problem in your case, but I have updated the repository with fixes for the problems I found.

If .pout has significantly fewer lines than .vout, the kernel did not capture every V2P (virtual-to-physical) mapping change for that process.

I found that for some programs the generated .pout was much smaller than .vout. In the first case, the program was multi-threaded and each thread accessed different memory, but the kernel extracted the V2P map only for the pid of the main thread. I changed run_script.sh to pass a list of pids instead of a single pid, and the kernel now extracts the V2P maps for all of those pids. In the second case, the kernel could not extract the V2P map when pages were remapped through an API such as mremap() rather than through a page fault.

I committed the revised run_script.sh to this repository and the matching change to the modified Linux kernel repository. Please run git pull in this repository and in the kernel repository (latest commit in the modified kernel repository: https://github.com/dgist-datalab/cxl-kernel/tree/2b4e90c5960afc397f055719ea99fe4f13c0e37e).

none_cnt in make_physical_trace_ts.py counts the read/write addresses in .vout whose VPN has no corresponding PFN in .vpmap.
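To make the meaning of that counter concrete, here is a minimal sketch of the idea. It is not the code in make_physical_trace_ts.py: the 4 KiB page size, the function name translate, and the dict standing in for the contents of .vpmap are all assumptions made for illustration. It also assumes that accesses without a known PFN are skipped, which would be consistent with .pout ending up shorter than .vout.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate(virtual_addrs, vpn_to_pfn):
    """Translate virtual addresses and count those with no known PFN.

    virtual_addrs: iterable of ints (hypothetical stand-in for .vout records)
    vpn_to_pfn:    dict VPN -> PFN (hypothetical stand-in for .vpmap)
    """
    physical = []
    none_cnt = 0
    for vaddr in virtual_addrs:
        vpn = vaddr >> PAGE_SHIFT
        pfn = vpn_to_pfn.get(vpn)
        if pfn is None:
            # No mapping was captured for this page: count it and skip it.
            none_cnt += 1
            continue
        physical.append((pfn << PAGE_SHIFT) | (vaddr & PAGE_MASK))
    return physical, none_cnt


# Two accesses fall in a mapped page, one falls in an unmapped page.
vpn_to_pfn = {0x7F000: 0x1A2B3}
accesses = [0x7F000010, 0x7F000FF8, 0x7F001000]
physical, none_cnt = translate(accesses, vpn_to_pfn)
none_pct = 100.0 * none_cnt / len(accesses)
```

In this toy input one of three accesses has no mapping, so a high "none" percentage on a real trace would point to V2P updates that the kernel never recorded.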

If this doesn't solve the problem, please explain the situation in more detail.

Alaric617R commented 1 year ago

Hi, thanks for responding! I have a question about how you use the V2P mapping to translate the virtual trace into a physical trace. If multiple processes coexist, they can use the same virtual address while being mapped to different physical addresses. How do you tell each process's virtual addresses apart?

jwya2149 commented 1 year ago

I'm sorry for the late reply. The script included in this tool (run_script.sh) sends the pids to the kernel as a module parameter immediately after launching the target application in the background. (echo $pidlist > /sys/module/memory/parameters/target_pid)
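For readers who want to see what building such a list could look like, here is a rough sketch in Python. How run_script.sh actually collects the pids is not shown in this thread; enumerating /proc/<pid>/task to pick up every thread ID is an assumption, and the helper names list_task_ids, format_pid_list and write_target_pids are hypothetical. Only the parameter path comes from the command above.

```python
import os


def list_task_ids(pid, proc_root="/proc"):
    """Return the sorted thread IDs listed under <proc_root>/<pid>/task."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sorted(int(name) for name in os.listdir(task_dir) if name.isdigit())


def format_pid_list(pids):
    """Join IDs with commas, the separator used for array module parameters."""
    return ",".join(str(p) for p in pids)


def write_target_pids(pids, param_path="/sys/module/memory/parameters/target_pid"):
    """Write the list to the module parameter file (needs root on a real system)."""
    with open(param_path, "w") as f:
        f.write(format_pid_list(pids) + "\n")


# Hypothetical usage, right after starting the target in the background:
#   write_target_pids(list_task_ids(target_pid))
```

Threads created after the list is written would not be included, so a real script may need to refresh the list or wait until the workers have been spawned.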

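The kernel reads the pids using module_param_array(). When the kernel extracts a virtual-to-physical mapping, it checks whether the pid of the current process matches one of the received pids, and stores the mapping information in a separate space (/proc/vpmap) only if it does. Mappings that belong to any other process are never recorded, so their virtual addresses cannot be confused with those of the target.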
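The filtering step described above can be modelled in a few lines. This is a user-space Python illustration of the logic only, not the kernel code: the function name record_mapping, the event tuples, and the (vpn, pfn) record layout are assumptions, and the real format of /proc/vpmap may differ.

```python
def record_mapping(vpmap, target_pids, current_pid, vpn, pfn):
    """Append (vpn, pfn) to vpmap only if current_pid is one of the targets.

    Returns True when the mapping was stored, False when it was ignored.
    """
    if current_pid not in target_pids:
        return False
    vpmap.append((vpn, pfn))
    return True


# pids received through the module parameter (hypothetical values)
target_pids = {1234, 1235}

# (pid, vpn, pfn) mapping events as the kernel might observe them
events = [
    (1234, 0x7F000, 0x100),  # tracked thread
    (999, 0x7F000, 0x200),   # unrelated process using the same VPN
    (1235, 0x7F001, 0x101),  # second tracked thread
]

vpmap = []
stored = [record_mapping(vpmap, target_pids, pid, vpn, pfn)
          for pid, vpn, pfn in events]
```

The unrelated process maps the same VPN to a different PFN, but because its pid is not in the target list that update never reaches the stored map.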