juicedata / juicefs

JuiceFS is a distributed POSIX file system built on top of Redis and S3.
https://juicefs.com
Apache License 2.0
10.66k stars, 930 forks

mysql:corrupt slices: len=xx[slice.go:120] #3959

Open Gengzp opened 1 year ago

Gengzp commented 1 year ago

What happened:

After mounting with MySQL + Huawei OBS, the log shows errors such as corrupt slices, and no operations can be performed on the files.

What you expected to happen:

The file system should run normally.

How to reproduce it (as minimally and precisely as possible):

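1. Provision Huawei OBS object storage and obtain the access_key, secret_key, and bucket name.
2. Run the juicefs format command, passing the metadata engine parameters and the OBS parameters.
3. Run the juicefs mount command in the foreground. Optional parameters were also tried but did not resolve the errors, for example: --max-uploads=1 --buffer-size=64 --no-bgjob --get-timeout=180 --put-timeout=180 --io-retries=10
4. Start the file-writing service, which frequently writes small files (jpg, mp4, bin) into the mount directory.
5. Watch the foreground log; errors appear within 1-3 minutes.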
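
The file-writing service in step 4 is not described in detail. As a rough stand-in for reproduction attempts, the following minimal sketch writes many small files with the three extensions mentioned and reads them back. The function names, file names, sizes, and counts are all hypothetical; the actual service's write pattern (concurrency, overwrites, appends) is unknown and may matter.

```python
import os
import sys
import tempfile


def write_small_files(target_dir, count=100, min_size=1024, max_size=64 * 1024,
                      extensions=(".jpg", ".mp4", ".bin")):
    """Write `count` small files of random bytes into target_dir.

    Hypothetical workload: the sizes and naming scheme are assumptions,
    not taken from the reporter's service.
    """
    os.makedirs(target_dir, exist_ok=True)
    paths = []
    for i in range(count):
        ext = extensions[i % len(extensions)]
        # Deterministic but varied size within [min_size, max_size].
        size = min_size + (i * 7919) % (max_size - min_size + 1)
        path = os.path.join(target_dir, f"sample_{i:06d}{ext}")
        with open(path, "wb") as f:
            f.write(os.urandom(size))
            f.flush()
            os.fsync(f.fileno())  # push the data through to the file system
        paths.append(path)
    return paths


def read_back(paths):
    """Read every file back and return the total number of bytes read."""
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            total += len(f.read())
    return total


if __name__ == "__main__":
    # Pass the mount point as the first argument; a temp dir is used otherwise.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    written = write_small_files(target)
    print(f"wrote {len(written)} files, read back {read_back(written)} bytes")
```

Running it against the mount directory while watching the foreground log would show whether a plain sequential small-file workload is enough to trigger the errors.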

Anything else we need to know?

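  1. Some of the errors below are covered in the documentation, but they persist even after adding the corresponding mount options. We could not find a fix for corrupt slices in the documentation.
  2. After switching the metadata engine to Redis, the errors no longer occur. However, the data volume is large and memory usage is high, so we would still prefer to use MySQL as the metadata engine. These are the errors we captured: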
    2023/08/02 14:16:45.628635 juicefs[26602] <ERROR>: corrupt slices: len=26 [slice.go:120]
    2023/08/02 14:16:45.628698 juicefs[26602] <ERROR>: read file 172: input/output error [reader.go:122]
    2023/08/02 14:16:45.628734 juicefs[26602] <INFO>: slow operation: read (172,4096,0): input/output error (0) <13.589752> [accesslog.go:65]
    2023/08/02 14:16:45.675626 juicefs[26602] <ERROR>: corrupt slices: len=26 [slice.go:120]
    2023/08/02 14:16:45.683281 juicefs[26602] <ERROR>: corrupt slices: len=26 [slice.go:120]
    2023/08/02 14:16:45.991366 juicefs[26602] <ERROR>: corrupt slices: len=26 [slice.go:120]
    2023/08/02 14:16:46.599765 juicefs[26602] <ERROR>: corrupt slices: len=26 [slice.go:120]
    2023/08/02 14:16:47.033815 juicefs[26602] <ERROR>: corrupt slices: len=26 [slice.go:120]
    2023/08/02 14:16:47.088113 juicefs[26602] <ERROR>: corrupt slices: len=306 [slice.go:120]
    2023/08/02 14:16:47.088170 juicefs[26602] <ERROR>: read file 170: input/output error [reader.go:122]
    2023/08/02 14:16:47.088215 juicefs[26602] <INFO>: slow operation: read (170,4096,0): input/output error (0) <13.591799> [accesslog.go:65]
    2023/08/02 14:16:47.134834 juicefs[26602] <ERROR>: corrupt slices: len=26 [slice.go:120]
    2023/08/02 14:16:47.142413 juicefs[26602] <ERROR>: corrupt slices: len=26 [slice.go:120]
    2023/08/02 14:16:47.367520 juicefs[26602] <WARNING>: GET chunks/1029714/1029714018/1029714018304_0_524288: obs: service returned error: Status=404 Not Found, Code=NoSuchKey, Message=The specified key does not exist., RequestId=00000189B4E44602C0881C4BB99B85C8; retrying [cached_store.go:591]
    2023/08/02 14:16:47.387229 juicefs[26602] <WARNING>: fail to read sliceId 1029714018304 (off:0, size:96, clen: 524288): get chunks/1029714/1029714018/1029714018304_0_524288: obs: service returned error: Status=404 Not Found, Code=NoSuchKey, Message=The specified key does not exist., RequestId=00000189B4E44A10C08932DFB953AF13 [reader.go:799]
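
To make log excerpts like the one above easier to compare between runs, here is a minimal sketch that tallies the corrupt slices lengths, the file IDs that failed to read, and the object keys reported as NoSuchKey. The regular expressions are written against the lines shown in this issue only. The `record_size=24` default is an assumption based on my reading of the JuiceFS source (a fixed-size slice record); it is not stated anywhere in this thread, so treat the remainder column as a hint rather than a diagnosis.

```python
import re
from collections import Counter

# Patterns derived from the log lines quoted in this issue.
CORRUPT_RE = re.compile(r"<ERROR>: corrupt slices: len=(\d+)")
READ_ERR_RE = re.compile(r"<ERROR>: read file (\d+): input/output error")
MISSING_RE = re.compile(r"<WARNING>: GET (chunks/[\w/]+): .*Code=NoSuchKey")


def summarize(log_text, record_size=24):
    """Summarize a JuiceFS client log excerpt.

    record_size is an ASSUMED slice record size in bytes; a non-zero
    remainder would mean the stored buffer is not a whole number of records.
    """
    lengths = Counter()
    failed_files = set()
    missing_keys = []
    for line in log_text.splitlines():
        m = CORRUPT_RE.search(line)
        if m:
            lengths[int(m.group(1))] += 1
            continue
        m = READ_ERR_RE.search(line)
        if m:
            failed_files.add(int(m.group(1)))
            continue
        m = MISSING_RE.search(line)
        if m and m.group(1) not in missing_keys:
            missing_keys.append(m.group(1))
    return {
        "corrupt_lengths": dict(lengths),
        "remainders": {n: n % record_size for n in lengths},
        "failed_files": sorted(failed_files),
        "missing_keys": missing_keys,
    }


if __name__ == "__main__":
    import sys
    # Usage: python summarize_log.py < juicefs.log
    report = summarize(sys.stdin.read())
    for key, value in report.items():
        print(f"{key}: {value}")
```

Under the 24-byte assumption, the two lengths in the excerpt (26 and 306) leave remainders of 2 and 18, so neither is a whole number of records.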

Environment:

zhijian-pro commented 1 year ago
  1. Start the file-writing service, which frequently writes small files (jpg, mp4, bin) into the mount directory.

Can you provide us with specific details to facilitate the reproduction? I did not reproduce the bug.

zhijian-pro commented 1 year ago

The type of MySQL used for testing, with the queries that were run (the results were posted as screenshots):

select version()
SHOW VARIABLES LIKE '%character_set%';
show variables like 'transaction%';

Test steps

  1. Mount the file system at /jfs
  2. juicefs bench /jfs --big-file-size 0 --small-file-count 10000 -p 4
  3. The corrupt slices error was not found
  4. cd /jfs && git clone https://github.com/juicedata/juicefs.git && cd juicefs
  5. git checkout release-1.0
  6. make
  7. The corrupt slices error was not found
  8. git checkout main
  9. make
  10. The corrupt slices error was not found
Gengzp commented 1 year ago
  1. Start the file-writing service, which frequently writes small files (jpg, mp4, bin) into the mount directory.

Can you provide us with specific details to facilitate the reproduction? I did not reproduce the bug.

This is our MySQL configuration:

show variables like 'transaction%';
+----------------------------------+----------------+
| Variable_name                    | Value          |
+----------------------------------+----------------+
| transaction_alloc_block_size     | 8192           |
| transaction_allow_batching       | OFF            |
| transaction_isolation            | READ-COMMITTED |
| transaction_prealloc_size        | 4096           |
| transaction_read_only            | OFF            |
| transaction_write_set_extraction | XXHASH64       |
+----------------------------------+----------------+

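If you still cannot reproduce it, the problem may lie in how we are using it. PostgreSQL + OBS currently works without problems, and we have already switched the metadata engine to PG.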