containerd / overlaybd

Overlaybd: a block based remote image format. The storage backend of containerd/accelerated-container-image.
Apache License 2.0
259 stars 58 forks source link

Overlaybd stuck at errno=34(Numerical result out of range) #261

Closed shuaichang closed 11 months ago

shuaichang commented 1 year ago

What happened in your environment?

Containerd failed to create container due to level=error msg="CreateContainer within sandbox \"bcd2f25fed1429f65e063dfd689801744849a214d2219dcadd3d73c6f2745e95\" for &ContainerMetadata{Name:dbr-utils,Attempt:0,} failed" error="failed to create containerd container: failed to attach and mount for snapshot 103: failed to enable target for /sys/kernel/config/target/core/user_999999999/dev_103, failed:failed to create overlaybd device: unknown"

Overlaybd logs around the same

2023/09/08 21:09:14|ERROR|th=00007F636C77D280|/src/src/overlaybd/zfile/zfile.cpp:355|get_current_block:inner buffer offset (0) + compressed size (18446744073709551612) overflow. errno=38(Function not implemented)
2023/09/08 21:09:14|ERROR|th=00007F636C77D280|/src/src/overlaybd/zfile/zfile.cpp:499|pread:read compressed data failed. errno=38(Function not implemented)
2023/09/08 21:09:14|ERROR|th=00007F636C77D280|/src/src/overlaybd/lsmt/file.cpp:1169|verify_ht:failed to read file trailer. errno=34(Numerical result out of range)
2023/09/08 21:09:14|INFO |th=00007F636C77D280|/src/src/overlaybd/lsmt/file.cpp:1531|do_parallel_load_index:check 5-th file is normal file or LSMT file
2023/09/08 21:09:14|ERROR|th=00007F636C77D280|/src/src/overlaybd/zfile/zfile.cpp:355|get_current_block:inner buffer offset (0) + compressed size (18446744073709551612) overflow. errno=38(Function not implemented)
2023/09/08 21:09:14|ERROR|th=00007F636C77D280|/src/src/overlaybd/zfile/zfile.cpp:499|pread:read compressed data failed. errno=38(Function not implemented)
2023/09/08 21:09:14|ERROR|th=00007F636C77D280|/src/src/overlaybd/lsmt/file.cpp:1169|verify_ht:failed to read file trailer. errno=34(Numerical result out of range)
2023/09/08 21:09:14|ERROR|th=00007F636C77D280|/src/src/overlaybd/lsmt/file.cpp:1552|do_parallel_load_index:failed to load index from 5-th file
2023/09/08 21:09:14|INFO |th=00007F636FFB5300|/src/src/overlaybd/lsmt/file.cpp:1563|do_parallel_load_index:load index from 0-th file done
2023/09/08 21:09:14|INFO |th=00007F636F7AC680|/src/src/overlaybd/lsmt/file.cpp:1563|do_parallel_load_index:load index from 1-th file done
2023/09/08 21:09:14|INFO |th=00007F636EFA3280|/src/src/overlaybd/lsmt/file.cpp:1563|do_parallel_load_index:load index from 2-th file done
2023/09/08 21:09:14|INFO |th=00007F636E79A680|/src/src/overlaybd/lsmt/file.cpp:1563|do_parallel_load_index:load index from 3-th file done
2023/09/08 21:09:14|INFO |th=00007F636DF91AC0|/src/src/overlaybd/lsmt/file.cpp:1563|do_parallel_load_index:load index from 4-th file done
2023/09/08 21:09:14|ERROR|th=00007F63A52B0B00|/src/src/overlaybd/lsmt/file.cpp:1581|load_merge_index:load index failed.
2023/09/08 21:09:14|ERROR|th=00007F63A52B0B00|/src/src/image_file.cpp:351|open_lowers:LSMT::open_files_ro(files, 6, 1) return NULL
2023/09/08 21:09:14|ERROR|th=00007F63A52B0B00|/src/src/image_file.cpp:472|init_image_file:open lower layer failed.
2023/09/08 21:09:14|ERROR|th=00007F63A52B0B00|/src/src/main.cpp:326|dev_open:create image file failed
2023/09/08 21:09:14|ERROR|th=00007F63A52B0B00|/src/build/_deps/tcmu-src/libtcmu.cpp:605|device_add:handler open failed for uio4

More logs:

2023/09/08 21:09:13|INFO |th=00007F636EFA4EC0|/src/src/overlaybd/tar/tar_file.cpp:346|open_tar_file:open file as tar file
2023/09/08 21:09:13|INFO |th=00007F636EFA4EC0|/src/src/overlaybd/tar/tar_file.cpp:97|read_header:[m_size=5540][base_offset=1536]
2023/09/08 21:09:13|INFO |th=00007F636EFA4EC0|/src/src/overlaybd/zfile/zfile.cpp:715|load_jump_table:read overwrite header. idx_offset: 4980, idx_bytes: 48, dict_size: 0, use_dict: 0
2023/09/08 21:09:13|INFO |th=00007F636EFA4EC0|/src/src/overlaybd/zfile/zfile.cpp:209|build:create jump table done. {part_count: 1, deltas_count: 13, size: 34}
2023/09/08 21:09:13|INFO |th=00007F636EFA4EC0|/src/src/overlaybd/zfile/compressor.cpp:352|create_compressor:ZFileObject using LZ4 algorithm
2023/09/08 21:09:13|INFO |th=00007F636EFA4EC0|/src/src/switch_file.cpp:50|try_open_zfile:open file as zfile format, path: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:3ed48643f0124b05c77afb4dbe5b19b0c32576358cf8604e62dd9c65cace9894
2023/09/08 21:09:13|INFO |th=00007F636D790700|/src/src/overlaybd/registryfs/registryfs.cpp:198|operator():getActualUrl for: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:a288b8d581fa0ac634e94afcfffe6fac2abbfac3162efd7cfcefa78ca8e71fd6, time used: 1634 ms
2023/09/08 21:09:13|INFO |th=00007F636DF956C0|/src/src/overlaybd/registryfs/registryfs.cpp:198|operator():getActualUrl for: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:341450c643af7a84e04a59eb566e005f2751d225bdf76f10ccaa19f461bd5164, time used: 1640 ms
2023/09/08 21:09:13|INFO |th=00007F636F7ACB00|/src/src/overlaybd/registryfs/registryfs.cpp:198|operator():getActualUrl for: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:1b904899f31625f43c91f57668f2fbda7857c19870c13aeb8a1a7f29bc289724, time used: 1649 ms
2023/09/08 21:09:13|INFO |th=00007F636FFB3F00|/src/src/overlaybd/registryfs/registryfs.cpp:198|operator():getActualUrl for: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:215176c2bffa3bc0ee86165299d7b79da4a5d3cdd02096de4076890cb386ee90, time used: 1649 ms
2023/09/08 21:09:13|INFO |th=00007F636D790700|/src/src/overlaybd/tar/tar_file.cpp:346|open_tar_file:open file as tar file
2023/09/08 21:09:13|INFO |th=00007F636D790700|/src/src/overlaybd/tar/tar_file.cpp:97|read_header:[m_size=919305045][base_offset=1536]
2023/09/08 21:09:13|INFO |th=00007F636D790700|/src/src/overlaybd/zfile/zfile.cpp:715|load_jump_table:read overwrite header. idx_offset: 918348329, idx_bytes: 956204, dict_size: 0, use_dict: 0
2023/09/08 21:09:13|ERROR|th=00007F636DF956C0|/src/src/overlaybd/registryfs/registryfs.cpp:166|GET:Failed to fetch data [ret=503][url=https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:341450c643af7a84e04a59eb566e005f2751d225bdf76f10ccaa19f461bd5164]
2023/09/08 21:09:13|INFO |th=00007F636F7ACB00|/src/src/overlaybd/tar/tar_file.cpp:346|open_tar_file:open file as tar file
2023/09/08 21:09:13|INFO |th=00007F636F7ACB00|/src/src/overlaybd/tar/tar_file.cpp:97|read_header:[m_size=6065][base_offset=1536]
2023/09/08 21:09:13|INFO |th=00007F636F7ACB00|/src/src/overlaybd/zfile/zfile.cpp:715|load_jump_table:read overwrite header. idx_offset: 5505, idx_bytes: 48, dict_size: 0, use_dict: 0
2023/09/08 21:09:13|INFO |th=00007F636F7ACB00|/src/src/overlaybd/zfile/zfile.cpp:209|build:create jump table done. {part_count: 1, deltas_count: 13, size: 34}
2023/09/08 21:09:13|INFO |th=00007F636F7ACB00|/src/src/overlaybd/zfile/compressor.cpp:352|create_compressor:ZFileObject using LZ4 algorithm
2023/09/08 21:09:13|INFO |th=00007F636F7ACB00|/src/src/switch_file.cpp:50|try_open_zfile:open file as zfile format, path: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:1b904899f31625f43c91f57668f2fbda7857c19870c13aeb8a1a7f29bc289724
2023/09/08 21:09:13|INFO |th=00007F636D790700|/src/src/overlaybd/zfile/zfile.cpp:209|build:create jump table done. {part_count: 14941, deltas_count: 239052, size: 597632}
2023/09/08 21:09:13|INFO |th=00007F636D790700|/src/src/overlaybd/zfile/compressor.cpp:352|create_compressor:ZFileObject using LZ4 algorithm
2023/09/08 21:09:13|INFO |th=00007F636D790700|/src/src/switch_file.cpp:50|try_open_zfile:open file as zfile format, path: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:a288b8d581fa0ac634e94afcfffe6fac2abbfac3162efd7cfcefa78ca8e71fd6
2023/09/08 21:09:13|INFO |th=00007F636FFB3F00|/src/src/overlaybd/tar/tar_file.cpp:346|open_tar_file:open file as tar file
2023/09/08 21:09:13|INFO |th=00007F636FFB3F00|/src/src/overlaybd/tar/tar_file.cpp:97|read_header:[m_size=45051914][base_offset=1536]
2023/09/08 21:09:13|INFO |th=00007F636FFB3F00|/src/src/overlaybd/zfile/zfile.cpp:715|load_jump_table:read overwrite header. idx_offset: 44965878, idx_bytes: 85524, dict_size: 0, use_dict: 0
2023/09/08 21:09:13|INFO |th=00007F636FFB3F00|/src/src/overlaybd/zfile/zfile.cpp:209|build:create jump table done. {part_count: 1337, deltas_count: 21382, size: 53460}
2023/09/08 21:09:13|INFO |th=00007F636FFB3F00|/src/src/overlaybd/zfile/compressor.cpp:352|create_compressor:ZFileObject using LZ4 algorithm
2023/09/08 21:09:13|INFO |th=00007F636FFB3F00|/src/src/switch_file.cpp:50|try_open_zfile:open file as zfile format, path: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:215176c2bffa3bc0ee86165299d7b79da4a5d3cdd02096de4076890cb386ee90
2023/09/08 21:09:13|INFO |th=00007F6371BBFF80|/src/src/overlaybd/registryfs/registryfs.cpp:198|operator():getActualUrl for: https://{registry-dns}/v2/{image-repo-1}/blobs/sha256:74f42b7e6898715ced940d8631bbd692f91b878c1c7fbe9dda32d20388988fcd, time used: 1463 ms
2023/09/08 21:09:13|ERROR|th=00007F6371BBFF80|/src/src/overlaybd/registryfs/registryfs.cpp:166|GET:Failed to fetch data [ret=503][url=https://{registry-dns}/v2/{image-repo-1}/blobs/sha256:74f42b7e6898715ced940d8631bbd692f91b878c1c7fbe9dda32d20388988fcd]
2023/09/08 21:09:13|WARN |th=00007F6371BBFF80|/src/src/overlaybd/registryfs/registryfs.cpp:426|preadv:failed to perform HTTP GET, going to retry [code=503][offset=1012924416][count=1048576][ret_len=0]errno=125(Operation canceled)
2023/09/08 21:09:13|INFO |th=00007F636E79DF00|/src/src/overlaybd/registryfs/registryfs.cpp:198|operator():getActualUrl for: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:9947142313d0b916832fa60f98909a87350a14d544a4d10c42edf42b35a35dd0, time used: 1880 ms
2023/09/08 21:09:13|INFO |th=00007F6372BC7B40|/src/src/overlaybd/registryfs/registryfs.cpp:198|operator():getActualUrl for: https://{registry-dns}/v2/{image-repo-1}/blobs/sha256:88e371907d4195e5d00831652f1f1dee867dca893315221a743fa630ede32ab3, time used: 1545 ms
2023/09/08 21:09:13|ERROR|th=00007F636E79DF00|/src/src/overlaybd/registryfs/registryfs.cpp:166|GET:Failed to fetch data [ret=503][url=https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:9947142313d0b916832fa60f98909a87350a14d544a4d10c42edf42b35a35dd0]
2023/09/08 21:09:14|INFO |th=00007F636DF956C0|/src/src/overlaybd/registryfs/registryfs.cpp:198|operator():getActualUrl for: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:341450c643af7a84e04a59eb566e005f2751d225bdf76f10ccaa19f461bd5164, time used: 1196 ms
2023/09/08 21:09:14|INFO |th=00007F636DF956C0|/src/src/overlaybd/tar/tar_file.cpp:346|open_tar_file:open file as tar file
2023/09/08 21:09:14|INFO |th=00007F636DF956C0|/src/src/overlaybd/tar/tar_file.cpp:97|read_header:[m_size=29963471][base_offset=1536]
2023/09/08 21:09:14|INFO |th=00007F636DF956C0|/src/src/overlaybd/zfile/zfile.cpp:715|load_jump_table:read overwrite header. idx_offset: 29901363, idx_bytes: 61596, dict_size: 0, use_dict: 0
2023/09/08 21:09:14|INFO |th=00007F636DF956C0|/src/src/overlaybd/zfile/zfile.cpp:209|build:create jump table done. {part_count: 963, deltas_count: 15400, size: 38504}
2023/09/08 21:09:14|INFO |th=00007F636DF956C0|/src/src/overlaybd/zfile/compressor.cpp:352|create_compressor:ZFileObject using LZ4 algorithm
2023/09/08 21:09:14|INFO |th=00007F636DF956C0|/src/src/switch_file.cpp:50|try_open_zfile:open file as zfile format, path: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:341450c643af7a84e04a59eb566e005f2751d225bdf76f10ccaa19f461bd5164
2023/09/08 21:09:14|INFO |th=00007F6371BBFF80|/src/src/overlaybd/registryfs/registryfs.cpp:198|operator():getActualUrl for: https://{registry-dns}/v2/{image-repo-1}/blobs/sha256:74f42b7e6898715ced940d8631bbd692f91b878c1c7fbe9dda32d20388988fcd, time used: 1165 ms
2023/09/08 21:09:14|INFO |th=00007F636E79DF00|/src/src/overlaybd/registryfs/registryfs.cpp:198|operator():getActualUrl for: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:9947142313d0b916832fa60f98909a87350a14d544a4d10c42edf42b35a35dd0, time used: 1151 ms
2023/09/08 21:09:14|INFO |th=00007F636E79DF00|/src/src/overlaybd/tar/tar_file.cpp:346|open_tar_file:open file as tar file
2023/09/08 21:09:14|INFO |th=00007F636E79DF00|/src/src/overlaybd/tar/tar_file.cpp:97|read_header:[m_size=340763184][base_offset=1536]
2023/09/08 21:09:14|INFO |th=00007F636E79DF00|/src/src/overlaybd/zfile/zfile.cpp:715|load_jump_table:read overwrite header. idx_offset: 340295624, idx_bytes: 467048, dict_size: 0, use_dict: 0
2023/09/08 21:09:14|INFO |th=00007F636E79DF00|/src/src/overlaybd/zfile/zfile.cpp:209|build:create jump table done. {part_count: 7298, deltas_count: 116763, size: 291910}
2023/09/08 21:09:14|INFO |th=00007F636E79DF00|/src/src/overlaybd/zfile/compressor.cpp:352|create_compressor:ZFileObject using LZ4 algorithm
2023/09/08 21:09:14|INFO |th=00007F636E79DF00|/src/src/switch_file.cpp:50|try_open_zfile:open file as zfile format, path: https://{registry-dns}/v2/{image-repo-2}/blobs/sha256:9947142313d0b916832fa60f98909a87350a14d544a4d10c42edf42b35a35dd0
2023/09/08 21:09:14|INFO |th=00007F636FFB5300|/src/src/overlaybd/lsmt/file.cpp:1531|do_parallel_load_index:check 0-th file is normal file or LSMT file
2023/09/08 21:09:14|INFO |th=00007F636F7AC680|/src/src/overlaybd/lsmt/file.cpp:1531|do_parallel_load_index:check 1-th file is normal file or LSMT file
2023/09/08 21:09:14|INFO |th=00007F636EFA3280|/src/src/overlaybd/lsmt/file.cpp:1531|do_parallel_load_index:check 2-th file is normal file or LSMT file
2023/09/08 21:09:14|INFO |th=00007F636E79A680|/src/src/overlaybd/lsmt/file.cpp:1531|do_parallel_load_index:check 3-th file is normal file or LSMT file
2023/09/08 21:09:14|INFO |th=00007F636DF91AC0|/src/src/overlaybd/lsmt/file.cpp:1531|do_parallel_load_index:check 4-th file is normal file or LSMT file
2023/09/08 21:09:14|INFO |th=00007F636C77D280|/src/src/overlaybd/lsmt/file.cpp:1531|do_parallel_load_index:check 5-th file is normal file or LSMT file
2023/09/08 21:09:14|ERROR|th=00007F636C77D280|/src/src/overlaybd/zfile/zfile.cpp:355|get_current_block:inner buffer offset (0) + compressed size (18446744073709551612) overflow. errno=38(Function not implemented)
2023/09/08 21:09:14|ERROR|th=00007F636C77D280|/src/src/overlaybd/zfile/zfile.cpp:499|pread:read compressed data failed. errno=38(Function not implemented)
2023/09/08 21:09:14|ERROR|th=00007F636C77D280|/src/src/overlaybd/lsmt/file.cpp:1169|verify_ht:failed to read file trailer. errno=34(Numerical result out of range)
2023/09/08 21:09:14|ERROR|th=00007F636C77D280|/src/src/overlaybd/lsmt/file.cpp:1552|do_parallel_load_index:failed to load index from 5-th file
2023/09/08 21:09:14|INFO |th=00007F636FFB5300|/src/src/overlaybd/lsmt/file.cpp:1563|do_parallel_load_index:load index from 0-th file done
2023/09/08 21:09:14|INFO |th=00007F636F7AC680|/src/src/overlaybd/lsmt/file.cpp:1563|do_parallel_load_index:load index from 1-th file done
2023/09/08 21:09:14|INFO |th=00007F636EFA3280|/src/src/overlaybd/lsmt/file.cpp:1563|do_parallel_load_index:load index from 2-th file done
2023/09/08 21:09:14|INFO |th=00007F636E79A680|/src/src/overlaybd/lsmt/file.cpp:1563|do_parallel_load_index:load index from 3-th file done
2023/09/08 21:09:14|INFO |th=00007F636DF91AC0|/src/src/overlaybd/lsmt/file.cpp:1563|do_parallel_load_index:load index from 4-th file done
2023/09/08 21:09:14|ERROR|th=00007F63A52B0B00|/src/src/overlaybd/lsmt/file.cpp:1581|load_merge_index:load index failed.
2023/09/08 21:09:14|ERROR|th=00007F63A52B0B00|/src/src/image_file.cpp:351|open_lowers:LSMT::open_files_ro(files, 6, 1) return NULL
2023/09/08 21:09:14|ERROR|th=00007F63A52B0B00|/src/src/image_file.cpp:472|init_image_file:open lower layer failed.
2023/09/08 21:09:14|ERROR|th=00007F63A52B0B00|/src/src/main.cpp:326|dev_open:create image file failed
2023/09/08 21:09:14|ERROR|th=00007F63A52B0B00|/src/build/_deps/tcmu-src/libtcmu.cpp:605|device_add:handler open failed for uio4
2023/09/08 21:09:14|ERROR|th=00007F6372BC7B40|/src/src/overlaybd/registryfs/registryfs.cpp:166|GET:Failed to fetch data [ret=503][url=https://{registry-dns}/v2/{image-repo-1}/blobs/sha256:7eb2359ae40cc7032f05d785df6f8569b5c8f7949bcc768fb17429549bf633f3]
2023/09/08 21:09:14|WARN |th=00007F6372BC7B40|/src/src/overlaybd/registryfs/registryfs.cpp:426|preadv:failed to perform HTTP GET, going to retry [code=503][offset=268435456][count=1048576][ret_len=0]errno=125(Operation canceled)
2023/09/08 21:09:14|ERROR|th=00007F636CF87AC0|/src/src/overlaybd/registryfs/registryfs.cpp:166|GET:Failed to fetch data [ret=503][url=https://{registry-dns}/v2/{image-repo-1}/blobs/sha256:17f5c98c26efcf1a9b4643593e83f2e692adf8be8e6d99b70b13dbae9d67021e]
2023/09/08 21:09:14|WARN |th=00007F636CF87AC0|/src/src/overlaybd/registryfs/registryfs.cpp:426|preadv:failed to perform HTTP GET, going to retry [code=503][offset=55574528][count=1048576][ret_len=0]errno=125(Operation canceled)
2023/09/08 21:09:14|INFO |th=00007F63A52B0B00|/src/src/main.cpp:316|dev_open:dev open /var/lib/overlaybd/snapshots/111/block/config.v1.json

What did you expect to happen?

No response

How can we reproduce it?

This issue seems to be reproduced when Blob storage returns 503s

What is the version of your Overlaybd?

0.6.15

What is your OS environment?

ubuntu

Are you willing to submit PRs to fix it?

yuchen0cc commented 1 year ago

So far, zfile spec lack a crc to verify its meta, i.e. header/trailer and index. The above issue happened when reading index through internet, and got correct length but incorrect data. And the incorrect data is cached by file cache, that will not be evict. Fix #263 enhance bound checking when building index, which will avoid 0-data error and most rand-data error. (For all overlaybd images) Adding crc for zfile is in schedule #262, which will totally avoid such issues. (For new version overlaybd images)