containerd / accelerated-container-image

A production-ready remote container image format (overlaybd) and snapshotter based on block-device.
Apache License 2.0

Random application failures potentially caused by rootfs data corruption #213

Closed shuaichang closed 7 months ago

shuaichang commented 1 year ago

What happened in your environment?

Hi, we are seeing Java applications fail with class-loading errors, including failures to unzip JARs. It is not 100% reproducible, and we seem to see it only when the backend object storage returns errors more often than normal.

Our question is: how does overlaybd handle GET blob errors, timeouts, or reads of incomplete blob blocks? For example, if a block fails to be read or is corrupted, is there any chance that the corrupted data is returned to the application and causes application errors?

What did you expect to happen?

The application should start with no errors related to data corruption.

How can we reproduce it?

It is not reliably reproducible; it seems to occur more frequently when the backend storage returns errors.

What is the version of your Accelerated Container Image?

What is your OS environment?

Ubuntu

Are you willing to submit PRs to fix it?

shuaichang commented 1 year ago

We now confirmed the following points:

  1. There is indeed corruption in the overlaybd rootfs: we checked 2 different containers using the same overlaybd image, and md5sum returned different checksums for the same file.
  2. The timing of the application failures aligns with overlaybd read failures caused by connection timeouts.

So I suspect that somewhere in the overlaybd code path the connection error is not handled well, which causes the rootfs corruption.

|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:435|pread:checksum verification and reload failed
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/lsmt/file.cpp:572|operator():failed to read from 1-th file ( 00005639DBA38A30 pread return: -1 < size: 4096) errno=117(Structure needs cleaning)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/cache/frontend/cached_file.cpp:181|preadvInternal:src file read failed, read : -1, expectRead : 1048576, size_ : 1730350592, offset : 703594496, sum : 1048576 errno=110(Connection timed out)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:271|read_blocks:read compressed blocks failed. (offset: 704424416, len: 4041) errno=110(Connection timed out)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/cache/frontend/cached_file.cpp:181|preadvInternal:src file read failed, read : -1, expectRead : 1048576, size_ : 1730350592, offset : 703594496, sum : 1048576 errno=110(Connection timed out)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:260|reload:read compressed blocks failed. (offset: 704424416, len: 4041) errno=110(Connection timed out)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:432|pread:checksum failed {offset: 0, length: 4037} (expected 100465888 but got 2665479548), reload result: -1
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:435|pread:checksum verification and reload failed
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/lsmt/file.cpp:572|operator():failed to read from 1-th file ( 00005639DBA38A30 pread return: -1 < size: 4096) errno=117(Structure needs cleaning)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/cache/frontend/cached_file.cpp:181|preadvInternal:src file read failed, read : -1, expectRead : 1048576, size_ : 1730350592, offset : 703594496, sum : 1048576 errno=110(Connection timed out)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:271|read_blocks:read compressed blocks failed. (offset: 704367450, len: 4049) errno=110(Connection timed out)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/cache/frontend/cached_file.cpp:181|preadvInternal:src file read failed, read : -1, expectRead : 1048576, size_ : 1730350592, offset : 703594496, sum : 1048576 errno=110(Connection timed out)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:260|reload:read compressed blocks failed. (offset: 704367450, len: 4049) errno=110(Connection timed out)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:432|pread:checksum failed {offset: 0, length: 4045} (expected 1908934547 but got 170782710), reload result: -1
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:435|pread:checksum verification and reload failed
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/lsmt/file.cpp:572|operator():failed to read from 1-th file ( 00005639DBA38A30 pread return: -1 < size: 4096) errno=117(Structure needs cleaning)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/cache/frontend/cached_file.cpp:181|preadvInternal:src file read failed, read : -1, expectRead : 1048576, size_ : 1730350592, offset : 703594496, sum : 1048576 errno=110(Connection timed out)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:271|read_blocks:read compressed blocks failed. (offset: 704367450, len: 4049) errno=110(Connection timed out)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/cache/frontend/cached_file.cpp:181|preadvInternal:src file read failed, read : -1, expectRead : 1048576, size_ : 1730350592, offset : 703594496, sum : 1048576 errno=110(Connection timed out)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:260|reload:read compressed blocks failed. (offset: 704367450, len: 4049) errno=110(Connection timed out)
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:432|pread:checksum failed {offset: 0, length: 4045} (expected 1908934547 but got 170782710), reload result: -1
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/zfile/zfile.cpp:435|pread:checksum verification and reload failed
|ERROR|th=00007FC7DEDE9B00|/src/src/overlaybd/lsmt/file.cpp:572|operator():failed to read from 1-th file ( 00005639DBA38A30 pread return: -1 < size: 4096) errno=117(Structure needs cleaning)
2023/07/12 23:51:46|ERROR|th=00007FC7D3D5EF40|/src/src/overlaybd/cache/frontend/cached_file.cpp:181|preadvInternal:src file read failed, read : -1, expectRead : 1048576, size_ : 1730350592, offset : 1321205760, sum : 1048576 errno=110(Connection timed out)

liulanzheng commented 1 year ago

While fixing #205 we found some risky code in zfile and hardened/fixed it in https://github.com/containerd/overlaybd/commit/bf6c41dbeea40aaf0dfac30fb9967127d1ed01c7. In some cases, it can indeed return corrupted data. In the pre-fix version, data correctness depended entirely on CRC verification and zfile decompression, without checking whether the data had actually been fetched correctly from the remote.

Some more background: the CRC checksum of a data block is stored after the data and is fetched from the remote in the same request. In the common case, even if the GET request fails, some error message is returned in the HTTP body, and on an HTTP timeout, partial data is returned. In those cases CRC verification cannot pass. I think there are two situations where an accident may occur if the registry returns no body data for the HTTP request:

  1. The buffer is empty (all zero), and the crc32 check will pass. I'm not sure whether zfile decompression will succeed, but if it does, empty data is returned instead of the correct data.
  2. In some extreme situations, the uninitialized buffer happens to still hold the previous data, and the current block has the same compressed size as the previous one. If the HTTP request fails with no HTTP body returned, CRC verification and zfile decompression will pass and the previous data will be returned.
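
To make the second situation concrete, here is a minimal Python sketch of the failure mode. It is not overlaybd code: `pack`, `failing_read`, `read_unchecked`, `read_checked` and the 4096-byte block size are hypothetical names and values chosen for illustration, and `zlib.crc32` merely stands in for whatever CRC zfile actually uses. It only shows that a checksum stored after the data cannot detect a read that never overwrote a reused buffer, whereas checking the read's return value can.

```python
import zlib

BLOCK = 4096  # hypothetical payload size; a 4-byte CRC trailer follows it


def pack(payload: bytes) -> bytes:
    """Return the payload followed by its CRC32 (checksum stored after the data)."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")


def failing_read(buf: bytearray) -> int:
    """Simulate a GET that times out with no body: buf is left untouched."""
    return -1


def crc_ok(buf: bytes) -> bool:
    """Verify the trailing CRC32 against the payload in front of it."""
    payload, stored = buf[:-4], int.from_bytes(buf[-4:], "little")
    return zlib.crc32(payload) == stored


def read_unchecked(buf: bytearray) -> bytes:
    """Trust the CRC alone and ignore the read's return value."""
    failing_read(buf)
    if not crc_ok(buf):
        raise IOError("checksum failed")
    return bytes(buf[:-4])


def read_checked(buf: bytearray) -> bytes:
    """Reject the buffer unless the read reported the full length."""
    n = failing_read(buf)
    if n != len(buf):
        raise IOError(f"short or failed read: {n}")
    if not crc_ok(buf):
        raise IOError("checksum failed")
    return bytes(buf[:-4])


# The reused buffer still holds the previous block and its valid CRC.
buf = bytearray(pack(b"A" * BLOCK))

# The read failed, yet the stale block passes verification and is returned.
stale = read_unchecked(buf)

try:
    read_checked(buf)
    checked_rejected = False
except IOError:
    checked_rejected = True
```

In this sketch `stale` is the previous block's payload, silently returned as if it were the requested one, while `read_checked` raises before the checksum is even consulted.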

shuaichang commented 1 year ago

Thanks @liulanzheng for the quick response.

A follow-up question: for the error errno=110(Connection timed out), we can see it timed out at 30s. Does this error mean overlaybd could not establish a connection with the blob server, or that the connection was established but the download took longer than 30s?

shuaichang commented 1 year ago

I think I found a stable repro now

head -c 500MB </dev/urandom >file

md5sum ./file
19271400b4f36a3398aa20d9e825dd7b  ./file
# Dockerfile 
FROM ubuntu
ADD ./file /file

Result

We ran the check 3 times each on 0.6.8 (currently in use) and 0.6.10 (contains the possible fix) and compared the md5sum values.

overlaybd-tcmu 0.6.8 (what we are using)

Across 3 runs, it showed 3 different md5sums for the same file, which means the file is indeed corrupted.

root@ip-10-0-0-134:/# md5sum /file
7aaec0678cb3522219aa7d20c5c63e64  /file

root@ip-10-0-0-134:/# md5sum ./file 
1ea93d9d8f0165aa3b9dc9e7b30382c4  ./file

root@ip-10-0-0-134:/# md5sum /file 
3c5210ed4c68dbf010c5c7bb614bdd59  /file

overlaybd-tcmu 0.6.10 (contains the possible fix)

Across 3 runs, it showed the same md5sum for the same file, which means the file is intact.

root@ip-10-0-0-134:/# md5sum ./file 
19271400b4f36a3398aa20d9e825dd7b  ./file

root@ip-10-0-0-134:/# md5sum /file 
19271400b4f36a3398aa20d9e825dd7b  /file

root@ip-10-0-0-134:/# md5sum /file 
19271400b4f36a3398aa20d9e825dd7b  /file
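
For anyone who wants to automate the comparison above, here is a small Python sketch that hashes a file several times and reports whether the digests agree. The helper names `md5_of` and `digests_agree` are made up for this example. One assumption to keep in mind: repeated reads inside the same container may be served from the page cache, so it is probably more meaningful to run it once per fresh container and compare against the known-good digest from the build host.

```python
import hashlib


def md5_of(path: str, chunk: int = 1 << 20) -> str:
    """Return the hex MD5 of a file, reading it in 1 MiB chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def digests_agree(path: str, runs: int = 3, expected: str = "") -> bool:
    """Hash the file `runs` times; True only if every digest is identical
    and, when `expected` is given, equal to it."""
    seen = {md5_of(path) for _ in range(runs)}
    if len(seen) != 1:
        return False
    return not expected or seen == {expected}
```

For the repro above this could be called as `digests_agree("/file", expected="19271400b4f36a3398aa20d9e825dd7b")`.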

liulanzheng commented 1 year ago

@shuaichang I could not reproduce this issue following your method. Could you provide the whole /var/log/overlaybd.log and tell us whether there are any errors in dmesg?

shuaichang commented 1 year ago

Verified that 0.6.12 fixes this corruption. Again, thanks a lot @liulanzheng for your quick support! Please feel free to close this issue.

shuaichang commented 7 months ago

I can confirm the fix works and am closing this issue.