cvlab-stonybrook / PaperEdge

The code and the DIW dataset for "Learning From Documents in the Wild to Improve Document Unwarping" (SIGGRAPH 2022)
MIT License

Question about training Tnet model only using real data #3

Open Sanster opened 2 years ago

Sanster commented 2 years ago

Have you ever tried to train Tnet using only real data (i.e., unsupervised training)? I am curious whether it can converge. Thanks

wkema commented 2 years ago

This is a good idea. Tnet could converge with some regularizers. Unfortunately, Tnet has no idea what a flat document looks like if you only use real data for unsupervised training.

In my experiment, all the input images were "unwarped" to a barrel-like distortion. I wouldn't be surprised if it converged to other distortions lol.

We eventually added a lot of tricks to make it work, but the quantitative results still could not match a model trained on synthetic data alone.

I would be interested/excited to see if any unsupervised methods could achieve better results :)
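
For readers curious what such a regularizer might look like: a minimal sketch, not from the paper, assuming Tnet predicts a PyTorch backward map `bm` of shape `(B, 2, H, W)` in the `[-1, 1]` convention of `F.grid_sample`. It penalizes drift from the identity warp and encourages smoothness; both terms and their weights are illustrative assumptions only.

```python
import torch
import torch.nn.functional as F

def warp_regularizer(bm: torch.Tensor, w_id: float = 1.0, w_tv: float = 0.1) -> torch.Tensor:
    """Toy regularizer for an unsupervised warp prediction.

    bm is a predicted backward map of shape (B, 2, H, W) with values
    in [-1, 1], the convention F.grid_sample expects. The identity
    term discourages degenerate solutions such as the barrel-like
    collapse described above; the total-variation term encourages a
    spatially smooth warp. The weights are arbitrary placeholders.
    """
    B, _, H, W = bm.shape
    # Identity sampling grid in the same [-1, 1] convention.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H, device=bm.device),
        torch.linspace(-1, 1, W, device=bm.device),
        indexing="ij",
    )
    identity = torch.stack([xs, ys]).unsqueeze(0).expand(B, -1, -1, -1)

    loss_id = F.mse_loss(bm, identity)  # stay close to "no warp"
    loss_tv = (bm[:, :, :, 1:] - bm[:, :, :, :-1]).abs().mean() \
            + (bm[:, :, 1:, :] - bm[:, :, :-1, :]).abs().mean()
    return w_id * loss_id + w_tv * loss_tv
```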

Sanster commented 2 years ago

Thank you for sharing the experiment details! I am currently trying to implement a document-dewarping method to achieve the following effect (recorded from https://www.textin.com/experience/text_auto_removal):

https://user-images.githubusercontent.com/3998421/185350667-931bbd43-57b4-456f-9361-d2b627a051e3.mov

Among the several methods I tried, PaperEdge gets very good results:

[comparison screenshots: PaperEdge, DDCP, docTr]

Sanster commented 2 years ago

As far as I know, the closest approach to self-supervision may be Fourier Document Restoration for Robust Document Dewarping and Recognition. Although the dataset is open-sourced, the authors unfortunately have not released the code.

wkema commented 2 years ago

> As far as I know, the closest approach to self-supervision may be Fourier Document Restoration for Robust Document Dewarping and Recognition. Although the dataset is open-sourced, the authors unfortunately have not released the code.

lol yeah I read that paper. The ideas are very similar. Might be a concurrent work lol.

hanquansanren commented 2 years ago

I have recently tried to reproduce FDRNet, but it seems hard to get it to converge.

> As far as I know, the closest approach to self-supervision may be Fourier Document Restoration for Robust Document Dewarping and Recognition. Although the dataset is open-sourced, the authors unfortunately have not released the code.

zbzzz commented 2 years ago

Hello, do you have the DocUNet dataset?

wkema commented 2 years ago

> Hello, do you have the DocUNet dataset?

Sorry, I just found that the data server in my previous lab has been down... so neither the DocUNet benchmark nor the Doc3D dataset is accessible.

If you just need the benchmark dataset, I have a backup copy on Google Drive:

- scan.zip: https://drive.google.com/file/d/1IxeS8wwwXQUBt6grcUcNoszL2UyHCSBb/view?usp=sharing
- crop.zip: https://drive.google.com/file/d/1w5_eimkpS2lpB9w-XKc8uKby5GDN8NIf/view?usp=share_link
- eval.zip: https://drive.google.com/file/d/1RpjNxTF6hg2lv65qiYRWfsy9UGNFgal0/view?usp=share_link

As for the Doc3D dataset, it is too large to put on Google Drive... I am not sure when the data server will be back online. Sorry for the inconvenience.

ZhangXueBang commented 2 years ago

> Thank you for sharing the experiment details! I am currently trying to implement a document-dewarping method to achieve the following effect (recorded from https://www.textin.com/experience/text_auto_removal). Among the several methods I tried, PaperEdge gets very good results. [comparison screenshots: PaperEdge, DDCP, docTr]

Hello, sorry to bother you. I have just started working on this project, but the authors' lab server is down, so the Doc3D dataset is inaccessible. I saw in this thread that you have worked on the project, so I would like to ask whether you still have a copy of the dataset. If so, I would be very grateful if you could share it. Thank you very much.

hanquansanren commented 2 years ago

The dataset is very large, about 1 TB, so transferring it over the network is very difficult. If you are in mainland China, perhaps I can share it with you offline.

Sanster commented 2 years ago

The Doc3D data is too large... better to wait for the authors' server to come back online.

ZhangXueBang commented 2 years ago

> The dataset is very large, about 1 TB, so transferring it over the network is very difficult. If you are in mainland China, perhaps I can share it with you offline.

Yes, it really is too large; thank you for your reply. I am in Dalian, China, and offline sharing would be too much trouble for you, so never mind. I saw the author's reply above about the benchmark dataset; I wonder whether this project can be run with that dataset plus the DIW dataset.

ZhangXueBang commented 2 years ago

> The Doc3D data is too large... better to wait for the authors' server to come back online.

OK, thank you for your reply.

ZhangXueBang commented 2 years ago

> The Doc3D data is too large... better to wait for the authors' server to come back online.

Hello, may I ask which part of the dataset the images listed in bgtex.txt correspond to?

```
/nfs/bigretina/kema/data/dtd/images/perforated/perforated_0103.jpg
/nfs/bigretina/kema/data/dtd/images/perforated/perforated_0089.jpg
/nfs/bigretina/kema/data/dtd/images/perforated/perforated_0015.jpg
/nfs/bigretina/kema/data/dtd/images/perforated/perforated_0069.jpg
/nfs/bigretina/kema/data/dtd/images/perforated/perforated_0144.jpg
```

Training keeps failing with path errors.
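
These paths point at the authors' NFS copy of the Describable Textures Dataset (DTD), which supplies the background textures. A minimal sketch of rewriting the prefix to a local DTD download (`LOCAL_DTD` is an assumed location, not part of the repo, and `str.removeprefix` requires Python 3.9+):

```python
from pathlib import Path

# bgtex.txt lists DTD background textures under the authors' NFS
# prefix; rewrite that prefix to a local DTD download.
LOCAL_DTD = Path("data/dtd/images")  # placeholder: your DTD location
OLD_PREFIX = "/nfs/bigretina/kema/data/dtd/images/"

lines = Path("bgtex.txt").read_text().splitlines()
fixed = [
    str(LOCAL_DTD / line.removeprefix(OLD_PREFIX))
    for line in lines
    if line.strip()
]
Path("bgtex.txt").write_text("\n".join(fixed) + "\n")
```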

zbzzz commented 1 year ago

> The Doc3D data is too large... better to wait for the authors' server to come back online.

Hello, may I ask how you managed to download it, given how large the data is? My computer does not have enough storage. Also, could another dataset be used instead?

Sanster commented 1 year ago

> > The Doc3D data is too large... better to wait for the authors' server to come back online.
>
> Hello, may I ask how you managed to download it, given how large the data is? My computer does not have enough storage. Also, could another dataset be used instead?

It was originally downloaded from the authors' server. As far as I know, there is no other dataset as complete; the alternative is to generate the data yourself with the authors' code: https://github.com/sagniklp/doc3D-renderer

zbzzz commented 1 year ago

> > > The Doc3D data is too large... better to wait for the authors' server to come back online.
> >
> > Hello, may I ask how you managed to download it, given how large the data is? My computer does not have enough storage. Also, could another dataset be used instead?
>
> It was originally downloaded from the authors' server. As far as I know, there is no other dataset as complete; the alternative is to generate the data yourself with the authors' code: https://github.com/sagniklp/doc3D-renderer

Thank you very much for your reply. This script uses the bpy package; after downloading Blender, it still fails to run in Python. I would like to know how to solve this.
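
For what it's worth, `bpy` is not a regular pip package: it only exists inside Blender's bundled Python interpreter, so a script that imports it must be launched through Blender itself rather than the system Python. A minimal sketch, where `render_doc3d.py` is a placeholder for whichever doc3D-renderer entry script you are running:

```python
import subprocess

# "import bpy" only works inside Blender's bundled interpreter, so
# run the render script through Blender in headless (background) mode.
subprocess.run(
    ["blender", "--background", "--python", "render_doc3d.py"],
    check=True,
)
```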

erenxjw commented 1 year ago

I am a student; please help me. I would like to know how everyone trained, by themselves, the two pretrained models that the author has already released. I would be very grateful if anyone could share their training code with me. My email is erenxjw@163.com.

leonodelee commented 1 year ago

Have you solved this? It is also mentioned in https://github.com/cvlab-stonybrook/PaperEdge/issues/18#issue-1670243932.

yy769405513 commented 1 year ago

> The dataset is very large, about 1 TB, so transferring it over the network is very difficult. If you are in mainland China, perhaps I can share it with you offline.

Hello, if you are willing to share this dataset, I would be very grateful. I am located in Hangzhou.