checkpoint-restore / criu


smaps shows an incorrect result after CRIU restores an application that mmaps MAP_SHARED | MAP_ANONYMOUS memory #2305

Open longweismile opened 11 months ago

longweismile commented 11 months ago

Description

Steps to reproduce the issue:

1. Create a new C file that calls mmap with the MAP_SHARED | MAP_ANONYMOUS flags. My application mmaps 8M of hugepages.
2. Dump and restore the application.
3. Check /proc/myapplication/smaps.

Describe the results you received: The smaps output before dump differs from the output after restore. (Screenshot not preserved.)

Describe the results you expected: Before dump and after restore, the smaps are the same.

Additional information you deem important (e.g. issue happens only occasionally):

CRIU logs and information:

CRIU full dump/restore logs:

``` (paste your output here) ```

Output of `criu --version`:

``` (paste your output here) ```

Output of `criu check --all`:

``` (paste your output here) ```

Additional environment details:

Snorch commented 11 months ago

I see that smaps changed, but I don't see any big problem with it.

For instance, Rss has changed from 8M to 0. Because all of those 8M of memory are just zeroes in your case, there is no need to actually hold them in physical memory; zero Rss just shows that this memory was optimized. If your app writes to this memory after restore, you will see Rss grow.

As you didn't fill in the issue template fully and skipped required information, there is no way to tell whether the THPeligible change from 1 to 0, or the /memfd: prefix appearing in the mapping description, is related to some problem in CRIU's THP handling or just to the fact that your kernel does not support it.
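One piece of environment information that would help here is the kernel's THP policy for shmem, which also covers shared anonymous mappings and memfds. Below is a minimal sketch, assuming the usual sysfs file /sys/kernel/mm/transparent_hugepage/shmem_enabled is present; the function names `active_choice` and `shmem_thp_policy` are hypothetical helpers written for this comment, not CRIU code.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the active choice (the word in square brackets) from a sysfs
 * multi-choice string such as "always within_size advise [never] deny force"
 * into out. Returns 0 on success, -1 if there is no bracketed word or it
 * does not fit into out.
 */
static int active_choice(const char *s, char *out, size_t outlen)
{
    const char *open = strchr(s, '[');
    const char *close = open ? strchr(open, ']') : NULL;
    size_t len;

    if (!close)
        return -1;
    len = (size_t)(close - open - 1);
    if (len + 1 > outlen)
        return -1;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}

/* Read the kernel's THP policy for shmem; returns 0 on success. */
static int shmem_thp_policy(char *out, size_t outlen)
{
    char buf[256];
    FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/shmem_enabled", "r");

    if (!f)
        return -1; /* file absent, e.g. kernel built without THP support */
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return active_choice(buf, out, outlen);
}
```

Comparing the value returned by `shmem_thp_policy()` on the dump host and on the restore host would show whether a THPeligible difference could simply come from different kernel settings.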

@minhbq-99 You might be interested in it.

minhbq-99 commented 11 months ago

> getting /memfd: prefix appear in the mapping description is related to some problem in CRIU THP handling

I think the /memfd: prefix is expected. In CRIU, we restore a shared memory mapping by creating fd = memfd_create(), then mmap(MAP_FILE, fd), and restoring the content. In a child process that shares the same mapping, we just mmap the same fd. If memfd is not available, we fall back to opening /proc/pid/map_files/start-end instead.
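To illustrate why the name changes, here is a simplified standalone sketch of the memfd approach. It is not the actual CRIU restore code: the helper names `map_shared_via_memfd` and `maps_contains` are invented for this example, and it assumes a Linux kernel and libc that provide memfd_create().

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Create a memfd called name, size it to len bytes and map it MAP_SHARED.
 * The fd is returned through fd_out so that another process inheriting it
 * could mmap() the same pages. Returns MAP_FAILED on error.
 */
static void *map_shared_via_memfd(const char *name, size_t len, int *fd_out)
{
    void *m;
    int fd = memfd_create(name, 0);

    if (fd < 0)
        return MAP_FAILED;
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return MAP_FAILED;
    }
    m = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (m == MAP_FAILED) {
        close(fd);
        return MAP_FAILED;
    }
    *fd_out = fd;
    return m;
}

/* Return 1 if a line of /proc/self/maps contains needle, 0 if not, -1 on error. */
static int maps_contains(const char *needle)
{
    char line[1024];
    int found = 0;
    FILE *f = fopen("/proc/self/maps", "r");

    if (!f)
        return -1;
    while (!found && fgets(line, sizeof(line), f))
        found = strstr(line, needle) != NULL;
    fclose(f);
    return found;
}
```

Mapping a memfd created with the name "/dev/zero" this way and then calling `maps_contains("/memfd:/dev/zero")` should find the mapping, because the kernel prefixes memfd-backed mappings with /memfd: in /proc/pid/maps and smaps.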

AFAIK, CRIU does not handle THP-backed memory mappings any differently from normal ones.

I wrote a small zdtm test but cannot reproduce the issue on my machine:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#include "zdtmtst.h"

#define MEM_SIZE (8UL * (1UL << 20)) /* 8 MB */

#define is_hex_digit(c) (((c) >= '0' && (c) <= '9') || ((c) >= 'a' && (c) <= 'f') || ((c) >= 'A' && (c) <= 'F'))

/* Return 1 and fill start/end if line begins with "<hex>-<hex> ", else 0. */
static int is_vma_range_fmt(char *line, unsigned long *start, unsigned long *end)
{
    char *p = line;

    while (*line && is_hex_digit(*line))
        line++;
    if (*line++ != '-')
        return 0;

    while (*line && is_hex_digit(*line))
        line++;
    if (*line++ != ' ')
        return 0;

    sscanf(p, "%lx-%lx", start, end);
    return 1;
}

/*
 * Print the smaps entry of the shared anonymous mapping, from its header
 * line (which names /dev/zero) through its VmFlags line.
 */
static int dump_shared_map_smaps(void)
{
    FILE *smaps;
    char buf[1024];
    int is_shared_map = 0;
    unsigned long start = 0, end = 0;

    smaps = fopen("/proc/self/smaps", "r");
    if (!smaps) {
        pr_perror("Can't open smaps");
        return -1;
    }

    while (fgets(buf, sizeof(buf), smaps)) {
        if (strstr(buf, "/dev/zero"))
            is_shared_map = 1;
        /* The parsed range is not used further; kept from the original test. */
        is_vma_range_fmt(buf, &start, &end);
        if (is_shared_map) {
            test_msg("%s", buf);
            if (!strncmp(buf, "VmFlags: ", 9))
                break;
        }
    }
    fclose(smaps);
    return 0;
}

int main(int argc, char **argv)
{
    void *m1;
    uint32_t crc;

    test_init(argc, argv);

    /* Anonymous mappings should pass fd -1 for portability. */
    m1 = mmap(NULL, MEM_SIZE, PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m1 == MAP_FAILED) {
        pr_perror("Failed to mmap %lu Mb anonymous shared memory", MEM_SIZE >> 20);
        return 1;
    }

    /* Fill the mapping so that it is fully resident before the dump. */
    crc = ~0;
    datagen(m1, MEM_SIZE, &crc);

    /* Before checkpoint */
    if (dump_shared_map_smaps())
        return -1;

    test_daemon();
    test_waitsig();

    /* Right after restore */
    if (dump_shared_map_smaps())
        return -1;

    /* After restore and a write to the restored mapping */
    *((int *)m1) = 1;
    if (dump_shared_map_smaps())
        return -1;

    return 0;
}
```

The log from the test:

Before checkpoint

16:37:24.394:     5: 7efe7ec00000-7efe7f400000 -w-s 00000000 00:01 11300                      /dev/zero (deleted)
16:37:24.395:     5: Size:               8192 kB
16:37:24.395:     5: KernelPageSize:        4 kB
16:37:24.395:     5: MMUPageSize:           4 kB
16:37:24.395:     5: Rss:                8192 kB
16:37:24.395:     5: Pss:                8192 kB
16:37:24.395:     5: Shared_Clean:          0 kB
16:37:24.395:     5: Shared_Dirty:          0 kB
16:37:24.395:     5: Private_Clean:         0 kB
16:37:24.395:     5: Private_Dirty:      8192 kB
16:37:24.395:     5: Referenced:         8192 kB
16:37:24.395:     5: Anonymous:             0 kB
16:37:24.395:     5: LazyFree:              0 kB
16:37:24.395:     5: AnonHugePages:         0 kB
16:37:24.395:     5: ShmemPmdMapped:     8192 kB
16:37:24.395:     5: FilePmdMapped:         0 kB
16:37:24.395:     5: Shared_Hugetlb:        0 kB
16:37:24.395:     5: Private_Hugetlb:       0 kB
16:37:24.395:     5: Swap:                  0 kB
16:37:24.395:     5: SwapPss:               0 kB
16:37:24.395:     5: Locked:                0 kB
16:37:24.395:     5: THPeligible:    1
16:37:24.395:     5: ProtectionKey:         0
16:37:24.395:     5: VmFlags: wr sh mr mw me ms sd

Right after restore

16:37:24.943:     5: 7efe7ec00000-7efe7f400000 -w-s 00000000 00:01 5152                       /memfd:/dev/zero (deleted)
16:37:24.943:     5: Size:               8192 kB
16:37:24.943:     5: KernelPageSize:        4 kB
16:37:24.943:     5: MMUPageSize:           4 kB
16:37:24.943:     5: Rss:                   0 kB
16:37:24.943:     5: Pss:                   0 kB
16:37:24.944:     5: Shared_Clean:          0 kB
16:37:24.944:     5: Shared_Dirty:          0 kB
16:37:24.944:     5: Private_Clean:         0 kB
16:37:24.944:     5: Private_Dirty:         0 kB
16:37:24.944:     5: Referenced:            0 kB
16:37:24.944:     5: Anonymous:             0 kB
16:37:24.944:     5: LazyFree:              0 kB
16:37:24.944:     5: AnonHugePages:         0 kB
16:37:24.944:     5: ShmemPmdMapped:        0 kB
16:37:24.944:     5: FilePmdMapped:         0 kB
16:37:24.944:     5: Shared_Hugetlb:        0 kB
16:37:24.944:     5: Private_Hugetlb:       0 kB
16:37:24.944:     5: Swap:                  0 kB
16:37:24.944:     5: SwapPss:               0 kB
16:37:24.944:     5: Locked:                0 kB
16:37:24.944:     5: THPeligible:    1
16:37:24.944:     5: ProtectionKey:         0
16:37:24.944:     5: VmFlags: wr sh mr mw me ms sd

After restore and a write to restored mapping

16:37:24.945:     5: 7efe7ec00000-7efe7f400000 -w-s 00000000 00:01 5152                       /memfd:/dev/zero (deleted)
16:37:24.945:     5: Size:               8192 kB
16:37:24.945:     5: KernelPageSize:        4 kB
16:37:24.945:     5: MMUPageSize:           4 kB
16:37:24.945:     5: Rss:                2048 kB
16:37:24.945:     5: Pss:                2048 kB
16:37:24.945:     5: Shared_Clean:          0 kB
16:37:24.945:     5: Shared_Dirty:          0 kB
16:37:24.945:     5: Private_Clean:         0 kB
16:37:24.945:     5: Private_Dirty:      2048 kB
16:37:24.945:     5: Referenced:         2048 kB
16:37:24.945:     5: Anonymous:             0 kB
16:37:24.945:     5: LazyFree:              0 kB
16:37:24.945:     5: AnonHugePages:         0 kB
16:37:24.945:     5: ShmemPmdMapped:     2048 kB
16:37:24.945:     5: FilePmdMapped:         0 kB
16:37:24.945:     5: Shared_Hugetlb:        0 kB
16:37:24.945:     5: Private_Hugetlb:       0 kB
16:37:24.945:     5: Swap:                  0 kB
16:37:24.945:     5: SwapPss:               0 kB
16:37:24.945:     5: Locked:                0 kB
16:37:24.945:     5: THPeligible:    1
16:37:24.945:     5: ProtectionKey:         0
16:37:24.945:     5: VmFlags: wr sh mr mw me ms sd 

@longweismile Do you use lazy migration?

github-actions[bot] commented 10 months ago

A friendly reminder that this issue had no activity for 30 days.