Open tajila opened 2 years ago
@ashu-mehra Can you please describe the sample application you ran to demonstrate this issue?
This issue happens when the application uses library functions that may have more than one implementation in order to take advantage of hardware features. Examples are memcpy, memset, etc.
The decision about which implementation to use is taken by ld at runtime, based on the hardware feature flags available, and it patches the GOT entry with the address of the chosen implementation. If this resolution and patching happens before the checkpoint is taken, it can cause a problem if the restore is done on a machine that does not support the specific hardware features.
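As a rough illustration of the binding problem described above, here is a minimal sketch in plain C. It does not touch the real dynamic loader: got_slot, resolve_memset, and both implementations are hypothetical stand-ins, and the "AVX2" variant only delegates to memset. The point is only that a slot resolved once on one host keeps pointing at the same implementation on a host with different features.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for two implementations of the same routine. */
typedef void *(*memset_fn)(void *, int, size_t);

void *memset_generic(void *dst, int c, size_t n) {
    unsigned char *p = dst;
    for (size_t i = 0; i < n; i++) {
        p[i] = (unsigned char)c;
    }
    return dst;
}

/* Pretend this variant needs AVX2; in this sketch it just calls memset. */
void *memset_avx2_standin(void *dst, int c, size_t n) {
    return memset(dst, c, n);
}

/* Simulated GOT slot: NULL means "not resolved yet". */
static memset_fn got_slot = NULL;

/* Simulated resolver: picks an implementation from a host feature flag. */
memset_fn resolve_memset(int host_has_avx2) {
    return host_has_avx2 ? memset_avx2_standin : memset_generic;
}

/* Calls through the slot and resolves only on first use. */
void *call_memset(int host_has_avx2, void *dst, int c, size_t n) {
    if (got_slot == NULL) {
        got_slot = resolve_memset(host_has_avx2);
    }
    return got_slot(dst, c, n);
}

/* Returns 1 if the cached binding needs a feature this host lacks. */
int binding_is_stale(int host_has_avx2) {
    return got_slot == memset_avx2_standin && !host_has_avx2;
}
```

Calling call_memset(1, ...) first and call_memset(0, ...) afterwards models a checkpoint on an AVX2 host followed by a restore on a host without it: the second call still goes through the AVX2 stand-in, and binding_is_stale(0) reports the mismatch.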
I am currently experimenting with a C program that uses memset:
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

char *ptr = NULL;

char *get_user_ptr() {
    return ptr;
}

int get_user_size() {
    return 512;
}

/* Fill the user buffer; this is the call that ends up in memset. */
void set_user_bytes(char ch) {
    memset(get_user_ptr(), ch, get_user_size());
}

/* Deliberately slow primality test, used only to burn time. */
int is_prime(int num) {
    int j;
    int divisors = 0;
    for (j = 2; j < num-1; j++) {
        if (num % j == 0) {
            divisors += 1;
        }
    }
    if (divisors == 0) {
        return 1;
    } else {
        return 0;
    }
}

/* Keeps the process busy long enough to take a checkpoint. */
void busyloop() {
    int i = 0;
    int count = 0;
    for (i = 0; i < 99999*2; i++) {
        int rc = is_prime(i);
        if (rc == 1) {
            count += 1;
        }
    }
}

int main(void) {
    ptr = malloc(512);
    set_user_bytes('a');   /* first memset call, before the checkpoint */
    busyloop();
    set_user_bytes('b');   /* second memset call, after the restore */
    printf("Finished successfully\n");
    return 0;
}
In this program set_user_bytes uses the memset function.
The purpose of busyloop() is just to give me enough time to take a checkpoint, so the first call to set_user_bytes happens before the checkpoint and the second call happens after restore.
If we take a checkpoint on a system with the AVX2 feature and restore on a system without it, we would hit SIGILL.
An update on this issue:
I have been able to use a glibc tunable, glibc.cpu.hwcaps [0], to disable certain CPU features when starting the application for checkpointing. This has worked well to overcome the issue with glibc/ld mentioned above.
My tests were done on a Fedora 30 system, which has glibc 2.29. GLIBC_TUNABLES was set as:
export GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable,-XSAVE_Usable,-AVX2_Usable,-ERMS,-AVX_Usable,-AVX_Fast_Unaligned_Load
The set of flags passed to the glibc.cpu.hwcaps tunable would obviously depend on the CPU features available on the systems involved in the experiment.
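Since the flag list differs per pair of machines, it may help to build the tunable value from a list instead of typing it by hand. The following is a minimal sketch; build_hwcaps_tunable is a hypothetical helper, and the feature names passed to it are assumed to be ones that the installed glibc actually recognizes (the helper does not validate them).

```c
#include <stdio.h>
#include <string.h>

/*
 * Builds "glibc.cpu.hwcaps=-A,-B,..." into out.
 * Returns 0 on success, -1 if count is 0 or out is too small.
 * Hypothetical helper: feature names are not checked against glibc.
 */
int build_hwcaps_tunable(char *out, size_t out_size,
                         const char *const *features, size_t count) {
    static const char prefix[] = "glibc.cpu.hwcaps=";
    size_t used = strlen(prefix);

    if (count == 0 || out_size <= used) {
        return -1;
    }
    memcpy(out, prefix, used + 1);

    for (size_t i = 0; i < count; i++) {
        /* Each feature is prefixed with '-' to request that it be disabled. */
        int n = snprintf(out + used, out_size - used, "%s-%s",
                         i == 0 ? "" : ",", features[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return 0;
}
```

With the features {"AVX2_Usable", "ERMS"} this produces the string glibc.cpu.hwcaps=-AVX2_Usable,-ERMS, which can then be exported as GLIBC_TUNABLES.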
Note that glibc 2.29 has a bug which prevented disabling the XSAVE feature using GLIBC_TUNABLES and also prevented correct selection of the dl_runtime_resolve_* function. This issue [1] has already been fixed in the latest release, 2.34.
For 2.29 I had to make a couple of changes to address these issues and recompile glibc.
The flags passed to glibc.cpu.hwcaps have also changed in newer glibc releases. The flags recognized by the tunable can be seen in cpu-tunables.c [2] in the glibc sources.
[0] https://www.gnu.org/software/libc/manual/html_node/Hardware-Capability-Tunables.html
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=27605
[2] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/x86/cpu-tunables.c;h=58f7a7f2509bedb967a13e0af2cd434c33079f18;hb=HEAD
@ashu-mehra thanks for the update. Just so I'm clear, export GLIBC_TUNABLES=... needs to be set before the application starts? And there is no way to do this after the application has started (but before checkpoint)?
Right, it needs to be set before launching the application.
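Because the variable has to be in the environment before the target process starts, one option is a small launcher that sets it and then execs the application. A minimal sketch, assuming a POSIX system; exec_with_tunables is a hypothetical helper and not part of glibc or CRIU:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* for setenv() under strict -std modes */
#endif
#include <stdlib.h>
#include <unistd.h>

/*
 * Sets GLIBC_TUNABLES and replaces the current process with argv[0].
 * On success this never returns; it returns -1 on any failure.
 * Hypothetical helper for illustration only.
 */
int exec_with_tunables(const char *tunables, char *const argv[]) {
    if (tunables == NULL || argv == NULL || argv[0] == NULL) {
        return -1;
    }
    if (setenv("GLIBC_TUNABLES", tunables, 1) != 0) {
        return -1;
    }
    execvp(argv[0], argv);
    return -1;  /* execvp returns only on error */
}
```

The new process image is loaded by ld after the environment has been updated, so the tunable is visible when implementations are selected. Setting the variable from inside an already running process would not change bindings that were made at startup.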
https://www.gnu.org/software/libc/manual/html_node/Tunables.html suggests that there may be other ways to enable this:
It is possible to implement multiple ‘frontends’ for the tunables allowing distributions to choose their preferred method at build time
I asked the question above because the JVM already has a notion of Portable/non-Portable restore mode, but the JVM needs to be launched before it knows which mode it is in. Without this info we need to be pessimistic.
Just for completeness, I'll post the solution I was thinking of exploring:
Reload glibc on restore:
I never got around to trying this.
@tajila I am not sure I understand how reloading glibc would solve this problem. Once the GOT entry for a library function is patched with an implementation, ld would not attempt to resolve it again. So loading glibc again may not help here. It may require editing the checkpoint image to "unresolve" the GOT entries.
This can actually be achieved using the env variable LD_BIND_NOT [1], which would keep GOT entries unresolved, but I am not sure if this env variable has any role in selecting the dl_runtime_resolve_* function.
A CRIU image may not be portable because glibc may use newer features even if the JVM restricts itself to an older set of features.
Creating an image with an older glibc might help if it doesn't have updated versions for those features (e.g. AVX), but there are techniques that allow intercepting the cpuid instruction to control the selected feature set, which should cover x86 software more broadly; see https://github.com/ddcc/libcpuidoverride.
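For context on what such an interception would change, here is a minimal sketch of the kind of CPUID query that feature selection relies on, using the cpuid.h header shipped with GCC and Clang. cpuid_reports_avx2 is a hypothetical helper. It only reads the raw CPUID bit; whether AVX2 is actually usable also depends on OS support for saving the wider register state, which this sketch does not check.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/*
 * Returns 1 if CPUID reports AVX2, 0 if it does not,
 * and -1 if the query is not possible on this machine.
 * Hypothetical helper for illustration only.
 */
int cpuid_reports_avx2(void) {
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 7, subleaf 0 holds the structured extended feature flags. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        return -1;
    }
    return (int)((ebx >> 5) & 1u);  /* EBX bit 5 is the AVX2 flag */
#else
    return -1;  /* not an x86 build */
#endif
}
```

If the cpuid instruction is intercepted so that this bit reads as 0 on the checkpoint host, code that selects implementations from CPUID results would presumably fall back to a non-AVX2 variant.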