qingzhenduyu / ICAL

Official implementation for ICDAR 2024 Oral paper "ICAL: Implicit Character-Aided Learning for Enhanced Handwritten Mathematical Expression Recognition"

About generating the training set #4

Closed Razliublt closed 1 month ago

Razliublt commented 3 months ago

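I would like to create my own training set. Could you explain how the data shown in the image below was generated? If I have some JPG files of my own, how do I turn them into a training set like the one shown? [image]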

qingzhenduyu commented 1 month ago
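import os
import numpy as np
import cv2
from tqdm import tqdm
import pickle

h = 120  # target image height in pixels
folder = "your image path here"

def build_img_arr(fpath):
    # Load as grayscale, then rescale to height h while keeping the aspect ratio.
    img = cv2.imread(fpath, cv2.IMREAD_GRAYSCALE)
    if img is None:  # cv2.imread returns None for unreadable or non-image files
        return None
    scale_r = h / img.shape[0]
    img = cv2.resize(img, None, fx=scale_r, fy=scale_r, interpolation=cv2.INTER_AREA)
    return img

# Map each file name to its resized grayscale array.
d = dict()
for fname in tqdm(os.listdir(folder)):
    fpath = os.path.join(folder, fname)
    img = build_img_arr(fpath)
    if img is None:
        continue  # skip files OpenCV could not decode
    d[fname] = img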

You can use the code sample above as a reference for generating it.
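The snippet above imports `pickle` but stops once the dict is built, so writing it to disk is left to the reader. Below is a minimal sketch of one way to do that, assuming the goal is a single pickle file that maps file names to grayscale arrays. The helper names `save_image_dict` and `load_image_dict` and the file name `images.pkl` are illustrative assumptions, not taken from the reply or confirmed against the repository's data loader.

```python
import os
import pickle
import tempfile

import numpy as np


def save_image_dict(images, out_path):
    """Serialize a {file name: grayscale array} dict to a pickle file."""
    with open(out_path, "wb") as f:
        pickle.dump(images, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_image_dict(in_path):
    """Load the dict back. Only unpickle files you created yourself."""
    with open(in_path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    # Stand-in for the dict `d` built by the loop above: one blank 120-pixel-high image.
    d = {"example.jpg": np.zeros((120, 300), dtype=np.uint8)}
    # "images.pkl" is a hypothetical output name chosen for this sketch.
    out_path = os.path.join(tempfile.mkdtemp(), "images.pkl")
    save_image_dict(d, out_path)
    restored = load_image_dict(out_path)
    print(restored["example.jpg"].shape)
```

Loading the file back and checking a few shapes is a quick sanity test before training. Whatever file name and layout the training code actually expects should be checked against the repository itself.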