Closed. wjx-git closed this issue 5 years ago.
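The cause was indeed the Corrupted image messages. When creating the LMDB dataset, str values have to be converted to bytes before they are written, so I changed the code from the original article to the following:

    def writeCache(env, cache):
        with env.begin(write=True) as txn:
            for k, v in cache.items():
                if isinstance(v, bytes):
                    txn.put(k.encode(), v)  # store the key with the raw bytes value
                elif isinstance(v, str):
                    txn.put(k.encode(), v.encode())  # encode str values first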
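My original code was:

    def writeCache(env, cache):
        with env.begin(write=True) as txn:
            for k, v in cache.items():
                txn.put(k.encode(), str(v).encode())  # store the key and value

However, v is sometimes bytes and sometimes str. Calling str(v) on a bytes value returns its printable representation (text starting with b'), so encoding that result stores the representation instead of the original image data. The images could then no longer be read during training, which is what produced the Corrupted image messages.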
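To make the difference concrete, here is a minimal sketch using only the standard library. The name `image_bin` and the helper `to_bytes` are hypothetical and not part of the original script, and raising on other value types is my own choice (the fixed writeCache above silently skips them).

```python
# Hypothetical stand-in for the raw bytes of an image file read from disk.
image_bin = b"\x89PNG\r\n\x1a\n"

# What the original writeCache stored: str() of a bytes object is its repr,
# so the encoded result is the text "b'...'" rather than the image data.
wrong = str(image_bin).encode()


def to_bytes(v):
    """Return v unchanged if it is bytes, UTF-8 encode it if it is str."""
    if isinstance(v, bytes):
        return v
    if isinstance(v, str):
        return v.encode()
    # Assumption: failing loudly is preferable to silently skipping the entry.
    raise TypeError("unsupported value type: %s" % type(v).__name__)


print(wrong == image_bin)                # False: the stored value is not the image
print(wrong.startswith(b"b'"))           # True: it is the repr text of the bytes
print(to_bytes(image_bin) == image_bin)  # True: bytes pass through untouched
print(to_bytes("label"))                 # b'label'
```

With a helper like this, the loop body could reduce to a single `txn.put(k.encode(), to_bytes(v))` call, though that is only one possible way to write it.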
    Traceback (most recent call last):
      File "/home/ayg/anaconda3/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
        samples = collate_fn([dataset[i] for i in batch_indices])
      File "/home/ayg/anaconda3/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in <listcomp>
        samples = collate_fn([dataset[i] for i in batch_indices])
      File "/home/ayg/workspace/crnn/lib/dataset.py", line 59, in __getitem__
        return self[index + 1]
      File "/home/ayg/workspace/crnn/lib/dataset.py", line 59, in __getitem__
        return self[index + 1]
      File "/home/ayg/workspace/crnn/lib/dataset.py", line 59, in __getitem__
        return self[index + 1]
      [Previous line repeated 184 more times]
      File "/home/ayg/workspace/crnn/lib/dataset.py", line 44, in __getitem__
        assert index <= len(self), 'index range error'
    AssertionError: index range error
Before the error, many corrupted-image messages were printed:

    Corrupted image for 1628
    Corrupted image for 1630
    Corrupted image for 1632
    Corrupted image for 1634
    ...
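The traceback and these messages are consistent with a dataset whose `__getitem__` skips an unreadable sample by returning `self[index + 1]`: if every remaining sample is unreadable, the recursion runs past the end and trips the assert. The sketch below is a toy model of that behaviour, not the actual lib/dataset.py. The class `SkippingDataset`, the dict standing in for LMDB, the key format, the 1-based keys and the corruption check are all assumptions. In this model the index is incremented once to form the key and once more on the skip, which happens to reproduce the step of 2 seen in the messages.

```python
class SkippingDataset:
    """Toy model (hypothetical) of a dataset that skips unreadable samples."""

    def __init__(self, store, n_samples):
        self.store = store          # plain dict standing in for the LMDB environment
        self.n_samples = n_samples

    def __len__(self):
        return self.n_samples

    def __getitem__(self, index):
        # The same check that fails at the bottom of the traceback.
        assert index <= len(self), 'index range error'
        index += 1  # assumption: keys are 1-based
        value = self.store.get('image-%09d' % index)
        # Assumption: a missing value, or one that holds the repr text of
        # bytes, counts as an image that cannot be decoded.
        if value is None or value.startswith(b"b'"):
            print('Corrupted image for %d' % index)
            return self[index + 1]  # skip ahead, as in the traceback
        return value


# Every value was written as str(v).encode(), so every sample is unreadable.
bad_store = {'image-%09d' % i: str(b'\xff\xd8').encode() for i in range(1, 7)}
try:
    SkippingDataset(bad_store, 6)[0]
except AssertionError as exc:
    print('AssertionError:', exc)

# With raw bytes stored, the first sample is returned without any skipping.
good_store = {'image-%09d' % i: b'\xff\xd8' for i in range(1, 7)}
print(SkippingDataset(good_store, 6)[0] == b'\xff\xd8')
```

Under these assumptions the index range error is a symptom rather than the root cause: the skip logic only fails once no readable sample is left.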
Source of the error message:

    def _process_next_batch(self, batch):
        self.rcvd_idx += 1
        self._put_indices()
        if isinstance(batch, _utils.ExceptionWrapper):
            # make multiline KeyError msg readable by working around
Has anyone else run into this problem? Is it related to the Corrupted image messages?