zhiqwang / sightseq

Computer vision tools for fairseq, containing PyTorch implementation of text recognition and object detection
MIT License
125 stars, 34 forks

How to use it just for testing one normal image? #1

Closed oneTaken closed 5 years ago

oneTaken commented 5 years ago

I don't know how to use it.

zhiqwang commented 5 years ago

Hi oneTaken, you can refer to the test script below, which runs the model on a single image.

import torch
import torchvision.transforms as transforms
from models.crnn import init_network
from datasets.datahelpers import default_loader
from utils.converter import LabelConverter

if __name__ == '__main__':

    img_name = './data/dev/00129_19515496090345.jpg'
    device = torch.device("cpu")
    model_path = './data/model/densenet_cifar.pth'

    alphabet = "0123456789"
    model_params = {}
    model_params['architecture'] = "densenet_cifar"
    model_params['num_classes'] = len(alphabet) + 1
    model_params['mean'] = [0.396, 0.576, 0.562]
    model_params['std'] = [0.154, 0.128, 0.130]
    model = init_network(model_params)
    model = model.to(device)

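    # load checkpoint; map_location avoids errors when the checkpoint
    # was saved on a GPU machine but is loaded on the CPU
    checkpoint = torch.load(model_path, map_location=device)
    model.load_state_dict(checkpoint['state_dict'])
    model.eval()  # inference mode for batch norm and dropout layers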

    converter = LabelConverter(alphabet)

    transform = transforms.Compose([
        transforms.Resize((32, 200)),
        transforms.ToTensor(),
        transforms.Normalize(mean=model.meta['mean'], std=model.meta['std']),
    ])
    img = default_loader(img_name)
    img = transform(img)
    img = img.unsqueeze(0).to(device)  # add a batch dimension

    with torch.no_grad():  # gradients are not needed for inference
        log_probs = model(img)
    preds = converter.best_path_decode(log_probs)

    print(preds)
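
For anyone wondering what the best-path decoding step at the end of the script does, here is a minimal sketch of greedy CTC decoding in plain Python. It is not the repository's `LabelConverter` code. The function name `greedy_ctc_decode`, the helper `one_hot_scores`, and the class layout (index 0 is the CTC blank, index k maps to `alphabet[k - 1]`) are assumptions made for illustration, and the real converter may order the classes differently.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class 0 is the CTC blank symbol


def greedy_ctc_decode(log_probs: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy (best-path) CTC decoding for one sequence.

    log_probs holds one row per time step and one score per class.
    Assumed layout: class 0 is blank, class k maps to alphabet[k - 1].
    """
    # 1. Pick the highest-scoring class at every time step.
    path = [max(range(len(step)), key=step.__getitem__) for step in log_probs]

    # 2. Collapse consecutive repeats and drop blanks.
    chars: List[str] = []
    previous = None
    for index in path:
        if index != previous and index != BLANK:
            chars.append(alphabet[index - 1])
        previous = index
    return "".join(chars)


def one_hot_scores(index: int, num_classes: int) -> List[float]:
    """Hypothetical helper: fake scores where one class clearly wins."""
    return [0.0 if k == index else -10.0 for k in range(num_classes)]


if __name__ == "__main__":
    alphabet = "0123456789"
    # Winning classes per time step: 2, 2, blank, 2, 6, blank
    steps = [one_hot_scores(i, len(alphabet) + 1) for i in [2, 2, 0, 2, 6, 0]]
    print(greedy_ctc_decode(steps, alphabet))  # prints 115
```

The blank between the two runs of class 2 is what lets the decoder output the digit "1" twice. Without it, the repeated predictions would collapse into a single character.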

oneTaken commented 5 years ago

Thanks a lot.

However, I can't find the model file in the repo. Could you provide your pre-trained model so that I can dig into your code?

Thanks very much.

zhiqwang commented 5 years ago

Hi oneTaken, thanks for your interest. My previous model was trained on a private dataset. I am now training on a public dataset of Chinese characters, and I will push the model to GitHub when it is done.

oneTaken commented 5 years ago

So sad to hear this.

zhiqwang commented 5 years ago

This work should be completed in about a day.

zhiqwang commented 5 years ago

Hi @oneTaken, I have updated the README, added the pre-trained model, and made some changes to the training strategy. You can refer to the latest commit.