checkpoint-restore / criu

Checkpoint/Restore tool
criu.org

Unable to load config file #2291

Closed Tobeabellwether closed 8 months ago

Tobeabellwether commented 8 months ago

Description

Steps to reproduce the issue:

  1. I want to enable the --tcp-established option for checkpointing.
  2. I created /etc/criu/default.conf and added "tcp-established" to it.
  3. Checkpointing a container with an established TCP connection still fails. It seems that, because $HOME cannot be found, the configuration read from /etc/criu/default.conf is also ignored.

CRIU logs and information:

CRIU full dump/restore logs:

```
(00.000000) Parsing config file /etc/criu/default.conf
(00.000000) Unable to get $HOME directory, local configuration file will not be used.
(00.000082) Version: 3.18 (gitid 5d2ceb3)
(00.000093) Running on node3 Linux 5.15.0-1044-gcp #52~20.04.1-Ubuntu SMP Wed Sep 20 16:25:19 UTC 2023 x86_64
(00.000096) Would overwrite RPC settings with values from /etc/criu/runc.conf
(00.000121) Loaded kdat cache from /run/criu.kdat
(00.000143) Hugetlb size 2 Mb is supported but cannot get dev's number
(00.000153) Hugetlb size 1024 Mb is supported but cannot get dev's number
(00.000890) ========================================
(00.000917) Dumping processes (pid: 36756 comm: java)
(00.000921) ========================================
(00.000927) rlimit: RLIMIT_NOFILE unlimited for self
(00.000941) Running pre-dump scripts
(00.000944) RPC
(00.001175) irmap: Searching irmap cache in work dir
(00.001189) No irmap-cache image
(00.001193) irmap: Searching irmap cache in parent
(00.001199) No parent images directory provided
(00.001202) irmap: No irmap cache
(00.001223) cpu: x86_family 6 x86_vendor_id GenuineIntel x86_model_id Intel(R) Xeon(R) CPU @ 2.20GHz
...
(00.126752) sockets: Searching for socket 0x36c70 family 10
(00.126761) Error (criu/sk-inet.c:191): inet: Connected TCP socket, consider using --tcp-established option.
(00.126831) ----------------------------------------
(00.126898) Error (criu/cr-dump.c:1669): Dump files (pid: 36756) failed with -1
(00.126915) Waiting for 36756 to trap
(00.126989) Daemon 36756 exited trapping
(00.127009) Sent msg to daemon 3 0 0
pie: 1: __fetched msg: 3 0 0
pie: 1: 1: new_sp=0x7faf9c275788 ip 0x7fafbe531d2b
(00.127146) 36756 was trapped
(00.127199) 36756 was trapped
(00.127207) 36756 (native) is going to execute the syscall 15, required is 15
(00.127236) 36756 was stopped
(00.127570) net: Unlock network
(00.127576) Running network-unlock scripts
(00.127580) RPC
(00.132128) Unfreezing tasks into 1
(00.132153) Unseizing 36756 into 1
(00.132597) Error (criu/cr-dump.c:2093): Dumping FAILED.
```

Output of `criu --version`:

```
Version: 3.18
GitID: 5d2ceb3
```

Output of `criu check --all`:

```
Looks good.
```

Additional environment details:

I'm using forensic container checkpointing in Kubernetes, which relies on CRI-O and CRIU to checkpoint containers with TCP connections. Since I'm not invoking CRIU directly, the only thing I can change is CRIU's configuration file.

adrianreber commented 8 months ago

The way runc works, you have to use a runc-specific configuration file. Just call the file runc.conf.
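
As a minimal sketch of that suggestion, assuming the CRI-O/runc setup from this issue (where the log reports `Would overwrite RPC settings with values from /etc/criu/runc.conf`), the option would move out of /etc/criu/default.conf and into that file. CRIU configuration files take one option per line, written without the leading dashes:

```
# /etc/criu/runc.conf (path taken from the log line above)
tcp-established
```

The `#` line is a comment. Other long options would presumably follow the same pattern, but only `tcp-established` is confirmed by this thread.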

Tobeabellwether commented 8 months ago

> The way runc works, you have to use a runc-specific configuration file. Just call the file runc.conf.

@adrianreber It finally works, thanks a lot.

adrianreber commented 8 months ago

Please close the ticket if solved.