LBH1024 / CAN

When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition (ECCV’2022 Poster).
MIT License

ExpRate on the HME100K dataset differs from the paper by 5% #24

Closed Howrunz closed 1 year ago

Howrunz commented 1 year ago





  1. I noticed that you provided the main parameters for training on the HME100K dataset in #19. The main parameters I used are listed below; all other parameters are the same as in config.yaml:

      seed: 20211024

      epochs: 90
      batch_size: 8
      workers: 8
      train_parts: 3
      valid_parts: 1
      valid_start: 0
      save_start: 0

      optimizer: Adadelta
      lr: 1
      lr_decay: cosine
      step_ratio: 10
      step_decay: 5
      eps: 1e-6
      weight_decay: 2e-5
      beta: 0.9

      counting_decoder:
        in_channel: 684
        out_channel: 247

  2. No Resize operation was applied to the images during training.
  3. Training was run on a single Nvidia Tesla V100 (32GB) GPU.
  4. Train loss:
  5. Train ExpRate:
  6. Eval loss:
  7. Eval ExpRate:
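For readers comparing numbers, ExpRate is usually understood as the fraction of expressions whose predicted token sequence matches the ground truth exactly. The sketch below illustrates that definition only; the function name `exp_rate` and the list-of-tokens input format are assumptions, not the repository's evaluation code.

```python
def exp_rate(predictions, targets):
    """Fraction of expressions predicted exactly right.

    Both arguments are sequences of token sequences (assumed format).
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    # An expression counts only if every token matches, in order.
    correct = sum(list(p) == list(t) for p, t in zip(predictions, targets))
    return correct / len(targets)


# Hypothetical example: one of two expressions is fully correct.
preds = [["x", "+", "1"], ["y"]]
truth = [["x", "+", "1"], ["z"]]
print(exp_rate(preds, truth))
```

Because a single wrong token makes the whole expression count as wrong, the metric is sensitive to preprocessing differences such as image scale.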


LBH1024 commented 1 year ago


Howrunz commented 1 year ago


pinkal21300 commented 1 year ago

@LBH1024 Could you please share the HME100K dataset with me? I am unable to download it. My email: [omitted]. Thank you very much.

pinkal21300 commented 1 year ago

@LBH1024 Thank you very much.

SuperHHzy commented 1 year ago

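Hello, I have a question. Your dictionary size is 247, so the number of symbol classes has to be set to 247. The dictionary for my dataset has 558 entries, and after I changed the number of symbol classes to 558 I got the error KeyError: '('. Do you know what causes this?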

Howrunz commented 1 year ago



SuperHHzy commented 1 year ago

I checked my word_dict.txt file and it does contain the ( symbol. Is your word_dict.txt formatted like this? token 1 token 2 token 3 ......

Howrunz commented 1 year ago





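class Words:
    def __init__(self, words_path):
        # Read one dictionary entry per line.
        with open(words_path) as f:
            words = f.readlines()
            print(f'{len(words)} symbol classes in total.')
        # Each stripped line becomes a key, mapped to its line index.
        self.words_dict = {words[i].strip(): i for i in range(len(words))}  # set a breakpoint here
        self.words_index_dict = {i: words[i].strip() for i in range(len(words))}

    def __len__(self):
        return len(self.words_dict)

    def encode(self, labels):
        # Raises KeyError if a label token is not a key of words_dict.
        label_index = [self.words_dict[item] for item in labels]  # set a breakpoint here
        return label_index

You can debug at the lines marked "# set a breakpoint here".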
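As a complement to stepping through with a debugger, the check below lists every label token that is missing from the dictionary. It is a minimal sketch: `load_vocab` mirrors the key construction in `Words.__init__` above, while `missing_tokens` and the sample data are hypothetical. One possible cause it can expose, assuming the dictionary file stores a token and an index on each line, is that the whole stripped line becomes the key, so a lookup of the bare token fails.

```python
from collections import Counter


def load_vocab(lines):
    """Build the token -> index map the same way Words.__init__ does."""
    return {line.strip(): i for i, line in enumerate(lines)}


def missing_tokens(vocab, labels):
    """Count label tokens that are not keys of the vocabulary."""
    return Counter(tok for label in labels for tok in label if tok not in vocab)


# Hypothetical dictionary file with "token index" on each line.
with_index = load_vocab(["( 0\n", ") 1\n", "x 2\n"])
# Hypothetical dictionary file with one bare token per line.
bare = load_vocab(["(\n", ")\n", "x\n"])

labels = [["(", "x", ")"]]
print(missing_tokens(with_index, labels))  # all three tokens are reported
print(missing_tokens(bare, labels))        # nothing is reported
```

If the report is non-empty, comparing `repr()` of a missing token with `repr()` of the nearest dictionary key usually shows whether the difference is extra text on the line, whitespace, or a look-alike character.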

mrkruk5 commented 1 year ago

Hi @Howrunz, were you able to reproduce the results in the paper on the HME100K dataset after resizing the images to a height of 120? Did you perform any other pre-processing steps like the ones I've mentioned here? Thanks.

SuperHHzy commented 1 year ago



Howrunz commented 1 year ago


Hi @mrkruk5. After resizing the images to a height of 120 while keeping the aspect ratio, the reproduced ExpRate reached 67.66%. The input images are converted to grayscale; everything else is consistent with the author's code.

Howrunz commented 1 year ago


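  1. The .pkl file is actually a dictionary: the keys are image names and the values are image pixel matrices.
  2. Preprocessing consists of converting each image to grayscale and rescaling its height to 120 while keeping the aspect ratio.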
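The two points above can be sketched as follows. This is an illustration under stated assumptions, not the script used in this thread: the function names are hypothetical, Pillow and NumPy are assumed to be installed, the resampling filter is Pillow's default, and using the file name without its extension as the key is a guess that has to match whatever names the label file uses.

```python
import os
import pickle


def target_size(width, height, target_height=120):
    """Return (new_width, new_height) with the aspect ratio preserved."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    new_width = max(1, round(width * target_height / height))
    return new_width, target_height


def preprocess(path, target_height=120):
    """Load an image, convert it to grayscale, and rescale its height."""
    # Assumption: Pillow and NumPy are available; imported lazily so that
    # target_size can be used without them.
    from PIL import Image
    import numpy as np

    with Image.open(path) as img:
        gray = img.convert("L")
        size = target_size(gray.width, gray.height, target_height)
        resized = gray.resize(size)
    return np.asarray(resized)  # shape: (height, width)


def build_pkl(image_paths, out_path):
    """Write {image name: pixel matrix} to a pickle file."""
    images = {}
    for path in image_paths:
        # Assumption: the key is the file name without its extension.
        name = os.path.splitext(os.path.basename(path))[0]
        images[name] = preprocess(path)
    with open(out_path, "wb") as f:
        pickle.dump(images, f)
    return images


print(target_size(300, 60))
```

Keeping the size computation in its own function makes it easy to check the rounding separately from the image I/O.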

SuperHHzy commented 1 year ago






mrkruk5 commented 1 year ago


Hi @Howrunz, thank you for confirming the results and preprocessing steps. Did you pad the lower right corner of the images to the max batch width for batch processing during training on the HME100K dataset? Do you also have results of the model trained on HME100K tested on the CROHME datasets?

Howrunz commented 1 year ago


  1. That's right. The maximum width is not necessarily the same from batch to batch; see the author's code here for details.
  2. Sorry, I haven't had time to do that yet.
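To make the per-batch padding concrete, here is a small sketch that places each image in the top-left corner and pads to the right and bottom up to the largest height and width in the batch. Plain nested lists stand in for tensors, and the zero padding value, the binary mask, and the name `pad_batch` are assumptions for illustration; the author's collate code is the reference.

```python
def pad_batch(images, pad_value=0):
    """Pad 2-D images (lists of rows) to the batch's max height and width.

    Returns (padded_images, masks); a mask holds 1 for real pixels and 0
    for padding.
    """
    max_h = max(len(img) for img in images)
    max_w = max(len(img[0]) for img in images)
    padded, masks = [], []
    for img in images:
        h, w = len(img), len(img[0])
        # Extend each row on the right, then append full padding rows below.
        rows = [list(row) + [pad_value] * (max_w - w) for row in img]
        rows += [[pad_value] * max_w for _ in range(max_h - h)]
        mask = [[1] * w + [0] * (max_w - w) for _ in range(h)]
        mask += [[0] * max_w for _ in range(max_h - h)]
        padded.append(rows)
        masks.append(mask)
    return padded, masks


# A 1x3 image and a 2x1 image are both padded to 2x3.
padded, masks = pad_batch([[[1, 2, 3]], [[4], [5]]])
print(padded)
print(masks)
```

Because the target size is recomputed for every batch, batches of narrow images stay small instead of being padded to the widest image in the dataset.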

Howrunz commented 1 year ago





SuperHHzy commented 1 year ago

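Sorry to bother you again. I preprocessed the images by converting them to grayscale and rescaling the height to 120 while keeping the aspect ratio, and then generated the pkl file using the method from WAP here. After I updated train_image_path and train_label_path, the code raised the error below. [image] Would you have time to take a look? If so, could you leave your email address so that we can communicate more easily?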

