Open kkailaasa opened 2 months ago
@VikParuchuri Hi could you please share some insights on this.
@kkailaasa I'm running compiled surya-ocr on RTX 3050 with VRAM 8GB and have a decent speed
@snowfluke Can you tell us how you did it? Thanks!
Hello, thanks for the good software. Before putting it into production use I ran a small test (below). I'm on Linux with an Nvidia 4090 (24 GB). Surya uses only about 6.2 GB of VRAM, and while processing it saturates one CPU thread at 100% while the GPU shows between 0% and 1% load. Recognizing one page (19 KB of recognized text) takes 70 seconds. Detection is fast, but recognition is quite slow.
```
Loaded detection model vikp/surya_det3 on device cuda with dtype torch.float16
Loaded recognition model vikp/surya_rec2 on device cuda with dtype torch.float16
Using device: cuda
Detecting bboxes: 100%|██████████████████████████████████████████| 1/1 [00:00<00:00, 2.79it/s]
Recognizing Text: 100%|██████████████████████████████████████████| 1/1 [01:08<00:00, 68.17s/it]
```
Is it because the recognition step requires more VRAM than I have? If so, can it be configured to use more CPU threads? I also have a second (slower) GPU, a P40. Is it possible to split the work, for example running detection on the P40 and recognition on the 4090?
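For the VRAM question: batch sizes and the device can be tuned through environment variables. The variable names below are taken from the surya README (`DETECTOR_BATCH_SIZE`, `RECOGNITION_BATCH_SIZE`, `TORCH_DEVICE`), and the values are only illustrative, so verify both against your installed version:

```shell
# Shrink batches to trade speed for VRAM; values here are illustrative, not tuned.
export DETECTOR_BATCH_SIZE=18
export RECOGNITION_BATCH_SIZE=128
# Pin all models to one device; this does not route models to different GPUs.
export TORCH_DEVICE=cuda
```

As far as I know there is no documented switch for putting detection and recognition on different GPUs, but since the Python API hands you the loaded models, calling `.to("cuda:0")` and `.to("cuda:1")` on each model separately may work.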
```python
import json
import time

import torch
from PIL import Image, ImageDraw, ImageFont

from surya.ocr import run_ocr
from surya.model.detection.model import load_model as load_det_model, load_processor as load_det_processor
from surya.model.recognition.model import load_model as load_rec_model
from surya.model.recognition.processor import load_processor as load_rec_processor

IMAGE_PATH = "scan.JPEG"

image = Image.open(IMAGE_PATH)
langs = ["pl"]  # Replace with your languages - optional but recommended

det_processor, det_model = load_det_processor(), load_det_model()
rec_model, rec_processor = load_rec_model(), load_rec_processor()

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

det_model = det_model.to(device)
rec_model = rec_model.to(device)

start_time = time.time()
predictions = run_ocr([image], [langs], det_model, det_processor, rec_model, rec_processor)
end_time = time.time()
print(f"OCR took {end_time - start_time:.1f} s")
```
Resolved: the git clone was extremely slow for some reason; installing via pip fixed the issue.
Hello, I'm interested in using Surya OCR, but I have two systems with less VRAM than the default requirements (> 24 GB VRAM):
From my reading of the project description, I understand that Surya can potentially run with lower VRAM by adjusting batch sizes.
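That is my understanding as well: VRAM usage should scale roughly linearly with batch size on top of a fixed cost for the model weights. A toy estimate of that scaling (both constants below are placeholder assumptions for illustration, not measured surya figures):

```python
def est_vram_mb(batch_size: int, mb_per_item: float = 40.0, base_mb: float = 1500.0) -> float:
    """Rough VRAM estimate: fixed model-weight cost plus per-batch-item activations.

    mb_per_item and base_mb are illustrative placeholders; measure on your own
    hardware (e.g. with nvidia-smi) to calibrate them.
    """
    return base_mb + batch_size * mb_per_item

print(est_vram_mb(256))  # a large recognition batch -> 11740.0 MB under these assumptions
print(est_vram_mb(32))   # a small batch for low-VRAM GPUs -> 2780.0 MB
```

The point is just that halving the batch size roughly halves the activation memory, so a GPU well under 24 GB should still work at the cost of throughput.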
Thank you for your help and for creating Surya OCR.