Calamari-OCR / calamari

Line based ATR Engine based on OCRopy
GNU General Public License v3.0
1.05k stars, 209 forks

Can calamari predict using the RawDataSet class? #17

Closed (ghost closed 6 years ago)

ghost commented 6 years ago

I want to do real-time OCR using calamari.

So I tried to run prediction on images held in memory (not saved as image files).

I found that calamari has two types of datasets: FileDataSet and RawDataSet.

Currently, calamari uses FileDataSet during prediction.

I guess that if I use the RawDataSet class for prediction, it will do what I intend.

So, does calamari prediction work well with the RawDataSet class?

Can you give me any advice on this?

ChWick commented 6 years ago

Yes, RawDataSet works as input data. Follow the steps in scripts/predict.py to create a Predictor (or a MultiPredictor for voting) and feed it a RawDataSet. The best approach is:

  1. Create a network: backend = create_backend_from_proto; network = backend.create_net (compare predictor.py, line 86 ff.)
  2. For each chunk of data, create a RawDataSet(rawData) and a Predictor. Use the previously created network to instantiate it, Predictor(network=network), and omit the checkpoint (the network gets created once and is reused).
  3. Call predictor.predict_dataset(your_raw_dataset).
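
The pattern in the steps above (build the network once, then create a lightweight predictor per chunk of in-memory data) can be sketched roughly as follows. This is only an illustration of the control flow: StubNetwork, StubPredictor and predict_chunks are hypothetical stand-ins and are not calamari's API. In real code they would presumably be replaced by the backend/network, Predictor and RawDataSet objects named above, whose exact signatures should be checked in scripts/predict.py and predictor.py.

```python
from typing import Iterable, Iterator, List, Sequence

Image = Sequence[Sequence[int]]  # one text-line image as rows of pixel values


class StubNetwork:
    """Hypothetical stand-in for the network that is built once (step 1)."""

    instances = 0  # counts how many networks were constructed

    def __init__(self) -> None:
        StubNetwork.instances += 1

    def recognize(self, image: Image) -> str:
        # Placeholder for real recognition: just report the image size.
        return f"{len(image)}x{len(image[0])}"


class StubPredictor:
    """Hypothetical stand-in for a predictor that wraps an existing network."""

    def __init__(self, network: StubNetwork) -> None:
        self.network = network

    def predict_dataset(self, images: Iterable[Image]) -> Iterator[str]:
        # Yield one result per in-memory image.
        for image in images:
            yield self.network.recognize(image)


def predict_chunks(
    network: StubNetwork, chunks: Iterable[List[Image]]
) -> Iterator[List[str]]:
    """Steps 2 and 3: one cheap predictor per chunk, all sharing one network."""
    for chunk in chunks:
        predictor = StubPredictor(network=network)
        yield list(predictor.predict_dataset(chunk))
```

The point of the design is that the expensive object (the network) lives outside the loop, so each incoming chunk only pays for a new predictor wrapper and the prediction itself.
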

ghost commented 6 years ago

@ChWick Thanks! It works.