OpenMPDK / SMDK

SMDK (Scalable Memory Development Kit) is developed for the Samsung CXL (Compute Express Link) Memory Expander to enable a full-stack Software-Defined Memory system.

cxlswap interface cannot be used properly #28

Closed. luohuang017 closed this issue 6 months ago

luohuang017 commented 6 months ago

When I ran the test in test_cxlswap.c, the process was OOM-killed. On inspection, there seems to be an issue with the avoid_oom function in util.h: it writes -17 to oom_score_adj instead of oom_adj. But even after I made that change, the program still cannot use the cxlswap interface properly; it simply stops when it reaches the memory limit.

SeungjunHa commented 6 months ago

Did you run test_cxlswap directly, or through run_cxlswap_storeload_test.sh or a similar script?

If you ran it directly, the CXL Swap interface may not be turned on. In that case, the system just tries to swap out to regular swap space (SSD, HDD, etc.). ("$ cat /sys/module/cxlswap/parameters/enabled" should print true or y.)

If the CXL Swap interface is already turned on and you are running the shared memory test, can you try again after decreasing the test size in run_cxlswap_sharedmemory_test.sh? It takes so long to finish that it can appear to have stopped.

I also wonder whether your system has swap space. (You can check with "$ swapon -s".)

As far as I know, setting oom_adj to -17 only requests that the process be given the lowest priority as an oom_kill target. But when the system has trouble running PFRA (the page frame reclaiming algorithm) because it has no swap space, your application can still be killed frequently even if the OOM adjustment is -17 (below -16). Also, as far as I know, oom_adj and oom_score_adj have the same meaning, and oom_adj exists for backward compatibility. (https://github.com/torvalds/linux/blob/master/Documentation/filesystems/proc.rst)
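One detail worth keeping in mind when comparing the two files: per the proc.rst document linked above, oom_score_adj ranges from -1000 to 1000, while the legacy oom_adj ranges from -17 to 15 and is scaled linearly onto it. So -17 written to oom_adj disables OOM killing for the task, whereas -17 written to oom_score_adj is only a small nudge. Below is a minimal sketch of that scaling, assuming a linear mapping with 15 pinned to +1000; the function name oom_adj_to_score_adj is hypothetical and not part of SMDK.

```shell
#!/bin/sh
# Sketch: convert a legacy oom_adj value (-17..15) to the oom_score_adj
# scale (-1000..1000).
# Assumption: linear scaling by 1000/17, with the maximum (15) pinned to +1000.
oom_adj_to_score_adj() {
    if [ "$1" -eq 15 ]; then
        echo 1000
    else
        # Integer division truncates toward zero, as in C.
        echo $(( $1 * 1000 / 17 ))
    fi
}
```

For example, `oom_adj_to_score_adj -17` prints -1000, which is the oom_score_adj value that exempts a task from the OOM killer.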

luohuang017 commented 6 months ago

Thanks for your answer! I found that I had not created a swap partition, so I created a swap file. But the program still doesn't work, and I still can't figure out what is causing the problem. [Screenshot: QQ截图20240314104122]

SeungjunHa commented 6 months ago

Are you running in QEMU? In my experience, emulating and turning on CXL in QEMU can be very slow and can look as if it has stopped. Also, does your system expose CXL memory as a memory-only NUMA node?

luohuang017 commented 6 months ago

I ran the program on my host. [Screenshot: QQ截图20240314132502]

SeungjunHa commented 6 months ago

Umm.. On my system, the test works well... and the storeload test is a very simple test...

  1. Did you run it as root? The test needs the cgroup interface (if the system is not already using cgroups, the test tries to mount the cgroup interface, and that operation requires root privileges; the cgroup version doesn't matter).
  2. Can you check whether the value of "/sys/kernel/debug/cxlswap/stored_pages" changes while the test is running? (In my case, it changes very frequently.)
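The second check can be sketched as a small helper that reads a counter file twice and reports whether the value moved. This is only an illustration: the function name counter_changed and its arguments are hypothetical, and reading files under /sys/kernel/debug normally requires root.

```shell
#!/bin/sh
# Report whether a counter file changes between two reads.
# Usage: counter_changed FILE [INTERVAL_SECONDS]
counter_changed() {
    file=$1
    interval=${2:-1}
    # Bail out with a non-zero status if the file cannot be read.
    before=$(cat "$file") || return 1
    sleep "$interval"
    after=$(cat "$file") || return 1
    if [ "$before" != "$after" ]; then
        echo "changed ($before -> $after)"
    else
        echo "unchanged ($before)"
    fi
}
```

With the test running in another terminal, something like `counter_changed /sys/kernel/debug/cxlswap/stored_pages 5` would show whether pages are being stored during those five seconds.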

luohuang017 commented 6 months ago

I tried, but it did not change :( It is always zero.

SeungjunHa commented 6 months ago

Oh, then can you check the value of reject_alloc_fail? Can you also check the THP option? CXLSwap does not support swapping out THP pages. I think that in your case PFRA runs and tries to swap out through cxlswap, but the allocation from CXL memory fails and the process is finally killed by the OOM killer, maybe...
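For the THP check, the active mode is the bracketed word in /sys/kernel/mm/transparent_hugepage/enabled (for example "always [madvise] never"). A minimal sketch for extracting it follows; the function name thp_mode and its optional file argument are hypothetical and exist only so the parsing can be tried on a copy of the file.

```shell
#!/bin/sh
# Print the active THP mode, i.e. the bracketed word in a line such as
# "always [madvise] never".
# Usage: thp_mode [FILE]
thp_mode() {
    file=${1:-/sys/kernel/mm/transparent_hugepage/enabled}
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p' "$file"
}
```

If THP swap-out is indeed unsupported by CXLSwap, one quick experiment is to switch the mode to never (as root: `echo never > /sys/kernel/mm/transparent_hugepage/enabled`) and rerun the test.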

luohuang017 commented 6 months ago

The file "/sys/kernel/mm/transparent_hugepage/enabled" was originally set to [madvise]. After I changed it to [always] and ran the program again, it still failed. Both "reject_alloc_fail" and "stored_pages" remain at 0 before and after the run. The screenshot shows the output of "dmesg | grep cxlswap". [Screenshot: QQ截图20240314184420]

SeungjunHa commented 6 months ago

Sorry for the late response. Umm.. Maybe the PFRA operations don't work at all... I think there may be some errors in the cgroup settings. I will do some more tests. Can you check the status of zswap on your system ("$ cat /sys/module/zswap/parameters/enabled")? And can you show me the contents of /sys/module/cxlswap/parameters/*?

luohuang017 commented 6 months ago

SeungjunHa commented Mar 15, 2024

[Screenshot: QQ截图20240314104122]

SeungjunHa commented 6 months ago

Thank you for the response. Can you turn off zswap ($ echo N > /sys/module/zswap/parameters/enabled) and try cxlswap_storeload_test again? Because zswap and cxlswap both use the frontswap API, if both are turned on simultaneously, zswap is used first in the SMDK kernel. I think that if you turn off zswap and turn on cxlswap, the cxlswap counters in debugfs should change during the test.
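The conflict described here can be checked before running the test. The sketch below reads the two enabled parameters mentioned in this thread and reports whether both are on. The function names and the optional root prefix argument are hypothetical (the prefix only allows trying the logic against a copy of the tree), and the set of values treated as "on" is an assumption.

```shell
#!/bin/sh
# Return success if a module parameter value looks like "enabled".
# Assumption: Y, y, 1 and true all mean "on".
is_on() {
    case "$1" in
        Y|y|1|true) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether zswap would shadow cxlswap.
# Usage: check_frontswap [ROOT_PREFIX]
check_frontswap() {
    root=${1:-}
    zswap=$(cat "$root/sys/module/zswap/parameters/enabled" 2>/dev/null) || zswap=
    cxlswap=$(cat "$root/sys/module/cxlswap/parameters/enabled" 2>/dev/null) || cxlswap=
    if is_on "$zswap" && is_on "$cxlswap"; then
        echo "conflict: zswap and cxlswap are both enabled"
    elif is_on "$cxlswap"; then
        echo "ok: only cxlswap is enabled"
    else
        echo "cxlswap is not enabled"
    fi
}
```

Calling `check_frontswap` with no argument reads the live /sys tree; a "conflict" result corresponds to the case where zswap should be turned off first.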

luohuang017 commented 6 months ago

It is still killed. [Screenshot: QQ截图20240314132502]

SeungjunHa commented 6 months ago

Umm.. can you check the /proc/sys/vm/ variables such as swappiness, overcommit_memory, and panic_on_oom? In my case:

$ cat /proc/sys/vm/swappiness
60
(If I set this value to zero, I also get OOM-killed. I think the value of this variable is different on your system.)
$ cat /proc/sys/vm/panic_on_oom
0
$ cat /proc/sys/vm/overcommit_memory
0
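The three values can be collected in one go. The following is a small sketch, not part of SMDK: the function name check_vm_settings and its optional directory argument are hypothetical, and the only rule it encodes is the one reported above, that a swappiness of 0 also led to an OOM kill.

```shell
#!/bin/sh
# Print the vm settings discussed above and flag swappiness=0.
# Usage: check_vm_settings [DIR]   (DIR defaults to /proc/sys/vm)
check_vm_settings() {
    dir=${1:-/proc/sys/vm}
    for name in swappiness panic_on_oom overcommit_memory; do
        value=$(cat "$dir/$name" 2>/dev/null) || value="unreadable"
        echo "$name=$value"
    done
    swappiness=$(cat "$dir/swappiness" 2>/dev/null) || swappiness=
    if [ "$swappiness" = "0" ]; then
        echo "warning: swappiness is 0"
    fi
}
```

Running `check_vm_settings` on both machines gives two short listings that can be compared line by line.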

luohuang017 commented 6 months ago

These are my values; they are indeed different from yours.

[Screenshot: QQ截图20240314132502]

SeungjunHa commented 6 months ago

Good. It may have been solved. If you don't mind, I will close the thread.

luohuang017 commented 6 months ago

Yes, thanks for your time.