Hello, everyone:
I can run this script fine with the author's dataset, but I run into the problem described in the title when I train the model on my own dataset.
Some pics from my dataset:
These pics are 30x500 pixels, with 25 chars in each pic. I used about 260k of them for training and 65k for validation.
The words in the pics are randomly selected from some drug-info text, like this:
import random

word_length = 25  # each pic holds 25 chars

with open('thistxt', 'r', encoding='utf-8') as f:
    # read each line into a list, stripping whitespace per line
    all_lines = [line.strip() for line in f.read().split('\n')]
# link the lines into one string
data_str = ''.join(all_lines)
# generate a word by slicing at a random start index
a_rand_num = random.randint(0, len(data_str) - word_length)
rand_word = data_str[a_rand_num:a_rand_num + word_length]
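For context, each pic is then rendered from rand_word roughly like this (a minimal sketch only; the font path and text position are placeholders, not my exact generation code):

from PIL import Image, ImageDraw, ImageFont

# white 500x30 grayscale canvas, matching the 30x500 pics above
img = Image.new('L', (500, 30), color=255)
draw = ImageDraw.Draw(img)
# placeholder font path; the real font must cover all 196 chars
font = ImageFont.truetype('a_cjk_font.ttf', 20)
draw.text((2, 2), rand_word, font=font, fill=0)  # draw the word in black
img.save('sample_pic.jpg')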
There are 196 unique chars in this txt, so my num_classes in the model is 196. Is my dataset not large enough, or is something else wrong? I'd appreciate it if anyone can help. (Chinese replies are also fine.)
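For reference, this is how I get the 196 (a small standalone check, not part of the training script):

# count the unique chars that can appear in a generated word
with open('thistxt', 'r', encoding='utf-8') as f:
    charset = set(''.join(line.strip() for line in f))
print(len(charset))  # 196 for my txt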