Error running crnn_main.py in a Python 3 environment

beimingmaster commented 6 years ago

While building the LMDB dataset, tolmdb.py fails at line 28: TypeError: Won't implicitly convert Unicode to bytes; use .encode()

 25 def writeCache(env, cache):
 26     with env.begin(write=True) as txn:
 27         for k, v in cache.items():
 28             txn.put(k, v)

I changed line 28 to txn.put(str(k).encode('utf-8'), str(v).encode('utf-8')), which lets the script run to completion, but training with crnn_main.py then reports:

Corrupted image for 2488925
Corrupted image for 3213999
Corrupted image for 3214001
Corrupted image for 2488927
Corrupted image for 2488929
Corrupted image for 3214003
Corrupted image for 2488931
Corrupted image for 3214005
Corrupted image for 2488933
Corrupted image for 3214007
Corrupted image for 2488935
Corrupted image for 3214009
Corrupted image for 2488937
Corrupted image for 3214011
Corrupted image for 2488939
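A possible explanation for the "Corrupted image" messages, offered as an assumption rather than a confirmed diagnosis: in Python 3, calling str() on a bytes value returns its printable representation (the text b'...'), so str(v).encode('utf-8') would store that text instead of the raw image data. The short sketch below uses a made-up four-byte value as a stand-in for real image content.

```python
# Hypothetical stand-in for the first bytes of an image file.
image_bytes = b"\xff\xd8\xff\xe0"

# In Python 3, str() on a bytes object yields its repr text, not the data.
mangled = str(image_bytes).encode("utf-8")

print(mangled)                 # the literal text b'\xff...' as bytes
print(mangled == image_bytes)  # the stored value no longer matches the image
```

If this is what happened, any image decoder reading the stored value back would see text rather than image bytes, which would match the errors reported above.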

haneSier commented 6 years ago

Build the LMDB with Python 2 and the resulting dataset trains normally; an LMDB built with Python 3 causes problems.

beimingmaster commented 6 years ago

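"Build the LMDB with Python 2 and the resulting dataset trains normally; an LMDB built with Python 3 causes problems."

But my environment is already Python 3, and I can't maintain two environments.

I changed the code that raised the earlier error, and it no longer fails: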

if isinstance(k, str):
   k = k.encode('utf-8')
if isinstance(v, str):
   v = v.encode('utf-8')
txn.put(k, v)
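The fix above can be sketched as a small reusable helper for Python 3. This is only an illustration: to_bytes, write_cache, FakeTxn, and the sample keys are hypothetical names, and FakeTxn is an in-memory stand-in so the snippet runs without lmdb installed. With a real LMDB write transaction, the same put(key, value) call would apply.

```python
def to_bytes(value):
    """Return value as bytes: str is UTF-8 encoded, bytes pass through unchanged."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode("utf-8")
    # Anything else (e.g. an integer sample count) is stored as its text form.
    return str(value).encode("utf-8")


def write_cache(txn, cache):
    """Write every key/value pair to an object exposing put(key, value)."""
    for k, v in cache.items():
        txn.put(to_bytes(k), to_bytes(v))


class FakeTxn:
    """In-memory stand-in for a write transaction (illustration only)."""

    def __init__(self):
        self.store = {}

    def put(self, key, value):
        self.store[key] = value


txn = FakeTxn()
write_cache(txn, {
    "image-000000001": b"\xff\xd8\xff",  # raw bytes stay untouched
    "label-000000001": "你好",            # text labels are UTF-8 encoded
    "num-samples": 1,                     # non-string values go through str()
})
```

The key point is that values that are already bytes are never passed through str(), so image data is written exactly as it was read.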

However, training on the GPU now raises a new error:

python crnn_main.py --trainroot ./to_lmdb/train_lmdb --valroot ./to_lmdb/test_lmdb --cuda

File "/home/master/.virtualenvs/dl4cv/lib/python3.5/site-packages/torch/autograd/__init__.py", line 98, in backward
    variables, grad_variables, retain_graph)
RuntimeError: expected Variable or None (got torch.cuda.FloatTensor)

Sierkinhane commented 6 years ago

I haven't run into this problem myself. You can use Anaconda to manage separate Python 2 and Python 3 environments.

Sierkinhane commented 6 years ago

I'll test it later.

haneSier commented 6 years ago

@beimingmaster This problem always appears when the dataset is built with Python 3; with Python 2 it works fine.

beimingmaster commented 6 years ago

Problem solved. It was caused by the warpctc 0.1 code being incompatible with pytorch2.0.

pip show warpctc_pytorch
Version: 0.1
Summary: PyTorch wrapper for warp-ctc
Home-page: https://github.com/baidu-research/warp-ctc
Author: Jared Casper, Sean Naren
Author-email: jared.casper@baidu.com, sean.narenthiran@digitalreasoning.com
License: Apache
Location: /home/master/.virtualenvs/dl4cv/lib/python3.5/site-packages/warpctc_pytorch-0.1-py3.5-linux-x86_64.egg
Requires:
Required-by:

Edit the return value of the backward() function (line 51) in /home/master/.virtualenvs/dl4cv/lib/python3.5/site-packages/warpctc_pytorch-0.1-py3.5-linux-x86_64.egg/pytorch_binding/warpctc_pytorch/__init__.py. After the change it looks like this:

 49     @staticmethod
 50     def backward(ctx, grad_output):
 51         #return ctx.grads, None, None, None, None, None, None
 52         return torch.autograd.Variable(ctx.grads), None, None, None, None, None, None

haneSier commented 6 years ago

Good job

beimingmaster commented 6 years ago

Good job

The fix came from other people online: https://blog.csdn.net/hzhj2007/article/details/81078248 https://discuss.pytorch.org/t/ctcloss-backward-path/13576

OKhyc commented 5 years ago

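When I run tolmdb.py with Python 2, I always get "no module named lmdb". I installed Python 3.6 on Ubuntu, so all my modules were presumably installed into Python 3.6. How can I get this code to run under Python 2?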

Sierkinhane commented 5 years ago

Install lmdb in your Python 2 environment.
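To see which interpreter a script is actually using and whether that interpreter can import lmdb, a small check like the following may help. It is a minimal sketch: module_available is a hypothetical helper name, and the snippet is written to run under both Python 2 and Python 3.

```python
import sys


def module_available(name):
    """Return True if `name` can be imported by the interpreter running this script."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False


# Path of the interpreter executing this script.
print(sys.executable)
print("lmdb importable: %s" % module_available("lmdb"))
```

If the check reports that lmdb is not importable, installing it with that same interpreter's pip, for example python2 -m pip install lmdb (assuming the python2 command exists on your system), puts the package where that interpreter will find it.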

OKhyc commented 5 years ago

@Sierkinhane Thank you very much, I got it running. lmdb generates both data.mdb and lock.mdb. What is each one for, and why are two files created?

OKhyc commented 5 years ago

@beimingmaster What does the validation-set path parameter valroot mean, and what should it be set to?

liamrhli commented 5 years ago

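"Install lmdb in your Python 2 environment."

When I build the dataset under Python 2.7, it fails once it reaches 1000 images with UnicodeDecodeError: 'utf8' codec can't decode byte 0xbb in position 0: invalid start byte. Even with fewer than 1000 images, the same error is raised at the end. Do you know what is going on? @Sierkinhane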

Sierkinhane / CRNN_Chinese_Characters_Rec

Error running crnn_main.py in a Python 3 environment #6