faustomorales / keras-ocr

A packaged and flexible version of the CRAFT text detector and Keras CRNN recognition model.
https://keras-ocr.readthedocs.io/
MIT License

Custom setting: Increase the amount of training data #109

Closed NeighborhoodCoding closed 4 years ago

NeighborhoodCoding commented 4 years ago

First, thanks for your help. I'm now able to train keras_ocr on my custom Korean characters, and it works!

Now I want to increase the amount of training data. The data is generated automatically by the image generators. Can I increase the number of training samples?

The function below prints 3/827 [..............................] - ETA: 15:44 - loss: 0.0099. Does that mean the training data has 827 items? How can I increase it from 827 to 10,000, or 100,000?


detector.model.fit_generator(
    generator=detection_train_generator,
    steps_per_epoch=math.ceil(len(background_splits[0]) / detector_batch_size),
    epochs=1,
    workers=0,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(restore_best_weights=True, patience=5),
        tf.keras.callbacks.CSVLogger(f'{detector_basepath}.csv'),
        tf.keras.callbacks.ModelCheckpoint(filepath=f'{detector_basepath}.h5')
    ],
    validation_data=detection_val_generator,
    validation_steps=math.ceil(len(background_splits[1]) / detector_batch_size)
)

faustomorales commented 4 years ago

Data is generated at random, so you can make an epoch arbitrarily long by increasing steps_per_epoch and validation_steps, if you're not worried about overfitting.
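
To make the arithmetic concrete, here is a minimal sketch, not code from keras-ocr. The 827 in the progress bar counts steps (batches) per epoch, which the snippet above derives from the number of background images, so it can be replaced by any step count. The helper steps_for_samples, the stand-in generator endless_batches, and the batch size of 1 are assumptions made for illustration; islice only mimics how Keras draws steps_per_epoch batches from a generator.

```python
import itertools
import math


def steps_for_samples(target_samples: int, batch_size: int) -> int:
    """Return how many batches are needed to draw at least target_samples images."""
    if target_samples <= 0 or batch_size <= 0:
        raise ValueError("target_samples and batch_size must be positive")
    return math.ceil(target_samples / batch_size)


def endless_batches(batch_size: int):
    """Hypothetical stand-in for an image generator that never runs out of batches."""
    batch_index = 0
    while True:
        yield [f"synthetic-{batch_index}-{i}" for i in range(batch_size)]
        batch_index += 1


detector_batch_size = 1  # assumption: the real value is not shown in this thread
steps_per_epoch = steps_for_samples(10_000, detector_batch_size)

# Keras pulls steps_per_epoch batches per epoch; islice mimics that behaviour here.
epoch = itertools.islice(endless_batches(detector_batch_size), steps_per_epoch)
images_seen = sum(len(batch) for batch in epoch)
print(steps_per_epoch, images_seen)
```

Passing the computed value as steps_per_epoch (and a similar value as validation_steps) lengthens the epoch without touching the generator. The samples are still composed from the same background images, which is where the overfitting caveat comes from. In TensorFlow 2.1 and later, Model.fit accepts generators directly and fit_generator is deprecated.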

Closing for the moment since this is not a bug.

NeighborhoodCoding commented 4 years ago

Thanks for your help. Maybe I should look into a Wikipedia crawler for random background images...