Open houxiyao opened 2 years ago
Uploading a large file from Pandas through fsspec triggers PositionNotEqualToLength
Excuse me, could you please provide some more details? How can this error be reproduced? You mention uploading large files, but what does Pandas do here?
I use fsspec with the zstd compressor and run into the same issue once the object size reaches 5M.
Environment: python=3.10 and ossfs=2023.6.0
import json
import fsspec
import ossfs
import sys
#import ossfs.base as ob
#ob.DEFAULT_BLOCK_SIZE = 5 * 2 ** 30
import random
import string

N = 2**20
res = ''.join(random.choices(string.ascii_uppercase + string.digits, k=N))
print(sys.getsizeof(res) / 2**20)

with fsspec.open("oss://personal-cn/junfeng/t1.jsonl.zst", mode="wb", compression="zstd", encoding="utf-8") as f:
    lines = ''
    for i in range(100):
        line = ''.join(random.choices(string.ascii_uppercase + string.digits, k=N))
        lines += line
        f.write(line.encode("utf-8"))
        print(i, sys.getsizeof(lines) / 2**20)
OSError: [Errno 5] {'status': 409, 'x-oss-request-id': '6541DDD3D832763531ABC4C5', 'details': {'Code': 'PositionNotEqualToLength', 'Message': 'Position is not equal to file length', 'RequestId': '6541DDD3D832763531ABC4C5', 'HostId': 'personal-cn.oss-cn-wulanchabu-internal.aliyuncs.com', 'EC': '0026-00000016', 'RecommendDoc': 'https://api.aliyun.com/troubleshoot?q=0026-00000016'}}
In the end, the stored file is about 5.1M in size.
When I change self.loc to self.offset, it works.
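To illustrate why swapping self.loc for self.offset could matter, here is a minimal, self-contained sketch. It is a toy model, not the actual ossfs or fsspec code: FakeAppendStore, ToyBufferedWriter, and run are hypothetical names, and the sketch simply assumes that loc counts every byte accepted by write() (including bytes still buffered), that offset counts only bytes already uploaded, and that the store rejects any append whose position differs from the current object length.

```python
import io


class PositionNotEqualToLength(Exception):
    """Stand-in for the 409 error returned by the object store."""


class FakeAppendStore:
    """Hypothetical in-memory append-only object, not a real OSS client."""

    def __init__(self):
        self.data = b""

    def append(self, position, chunk):
        # An append is accepted only at the current end of the object.
        if position != len(self.data):
            raise PositionNotEqualToLength(
                f"position={position}, length={len(self.data)}"
            )
        self.data += chunk
        return len(self.data)


class ToyBufferedWriter:
    """Toy buffered writer that tracks loc and offset separately."""

    def __init__(self, store, blocksize, use_offset):
        self.store = store
        self.blocksize = blocksize
        self.use_offset = use_offset
        self.buffer = io.BytesIO()
        self.loc = 0     # bytes accepted by write(), flushed or not
        self.offset = 0  # bytes already sent to the store

    def write(self, data):
        n = self.buffer.write(data)
        self.loc += n
        if self.buffer.tell() >= self.blocksize:
            self.flush()
        return n

    def flush(self):
        chunk = self.buffer.getvalue()
        if not chunk:
            return
        # loc already includes the buffered chunk, so it points past
        # the end of the stored object; offset points exactly at it.
        position = self.offset if self.use_offset else self.loc
        self.store.append(position, chunk)
        self.offset += len(chunk)
        self.buffer = io.BytesIO()


def run(use_offset):
    """Write three 4-byte blocks and report what happened."""
    store = FakeAppendStore()
    writer = ToyBufferedWriter(store, blocksize=4, use_offset=use_offset)
    try:
        for _ in range(3):
            writer.write(b"abcd")
        writer.flush()
    except PositionNotEqualToLength as exc:
        return f"failed: {exc}"
    return f"ok: {len(store.data)} bytes stored"


print(run(use_offset=False))
print(run(use_offset=True))
```

In this toy model the loc variant is rejected because the position it sends is ahead of the stored length, while the offset variant appends every block. Whether the real library behaves the same way depends on how it maintains loc between flushes, which this sketch does not try to reproduce.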
It looks like the same error as in issue 127.
https://github.com/fsspec/filesystem_spec/blob/73de5045f95257af153769984312c1a4875087b4/fsspec/spec.py#L1491-L1495
https://github.com/fsspec/ossfs/blob/d1db90caee05a359fcd2e59eb3a58f5755495dbf/ossfs/core.py#L746-L748