Pay20Y / SEED


If I want to recognize a picture, how can I do it? Do I still want to generate LMDB format? Can you provide a predict interface? Thank you #15

Open DYF-AI opened 4 years ago

DYF-AI commented 4 years ago

If I want to recognize a single picture, how can I do it? Do I still need to generate the LMDB format? Can you provide a predict interface? Thank you.

DYF-AI commented 4 years ago

Can you provide a simple demo? Thank you.

kadirbeytorun commented 3 years ago

+1

Key-Lab-of-Intelligent-Robot-WIT commented 3 years ago

+1

lc1314555 commented 3 years ago

Thanks for your work! Can you provide a demo for recognizing a single image?

Agiroy4712 commented 2 years ago

@DYF-AI @kadirbeytorun @lc1314555

1. The demo.py can look like this:

```python
from __future__ import absolute_import
import sys
sys.path.append('./')

import argparse
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

import os.path as osp
import numpy as np
import math
import time
# new
from PIL import Image, ImageFile

import torch
from torch import nn, optim
from torch.backends import cudnn
from torch.utils.data import DataLoader, SubsetRandomSampler
# new
from torchvision import transforms

from config import get_args
from lib import datasets, evaluation_metrics, models
from lib.models.model_builder import ModelBuilder
from lib.datasets.dataset import LmdbDataset, AlignCollate, CustomDataset
from lib.datasets.concatdataset import ConcatDataset
from lib.loss import SequenceCrossEntropyLoss
from lib.trainers import Trainer
from lib.evaluators import Evaluator
from lib.utils.logging import Logger, TFLogger
from lib.utils.serialization import load_checkpoint, save_checkpoint
from lib.utils.osutils import make_symlink_if_not_exists
# new
from lib.evaluation_metrics.metrics import get_str_list
from lib.utils.labelmaps import get_vocabulary, labels2strs

global_args = get_args(sys.argv[1:])


def image_process(image_path, imgH=32, imgW=100, keep_ratio=False, min_ratio=1):
    img = Image.open(image_path).convert('RGB')

    if keep_ratio:
        # Derive the width from the aspect ratio, but never go below imgH * min_ratio.
        w, h = img.size
        ratio = w / float(h)
        imgW = int(np.floor(ratio * imgH))
        imgW = max(imgH * min_ratio, imgW)

    img = img.resize((imgW, imgH), Image.BILINEAR)
    img = transforms.ToTensor()(img)
    # In-place normalization from [0, 1] to [-1, 1].
    img.sub_(0.5).div_(0.5)

    return img


# new
class DataInfo(object):
    """
    Save the info about the dataset.
    This is a code snippet from dataset.py
    """
    def __init__(self, voc_type):
        super(DataInfo, self).__init__()
        self.voc_type = voc_type

        assert voc_type in ['LOWERCASE', 'ALLCASES', 'ALLCASES_SYMBOLS']
        self.EOS = 'EOS'
        self.PADDING = 'PADDING'
        self.UNKNOWN = 'UNKNOWN'
        self.voc = get_vocabulary(voc_type, EOS=self.EOS, PADDING=self.PADDING, UNKNOWN=self.UNKNOWN)
        self.char2id = dict(zip(self.voc, range(len(self.voc))))
        self.id2char = dict(zip(range(len(self.voc)), self.voc))

        self.rec_num_classes = len(self.voc)


def main(args):
    np.random.seed(args.seed)
    torch.manual_seed(args.seed)
    torch.cuda.manual_seed(args.seed)
    torch.cuda.manual_seed_all(args.seed)
    cudnn.benchmark = True
    torch.backends.cudnn.deterministic = True

    args.cuda = args.cuda and torch.cuda.is_available()
    if args.cuda:
        print('using cuda.')
        torch.set_default_tensor_type('torch.cuda.FloatTensor')
    else:
        torch.set_default_tensor_type('torch.FloatTensor')

    # Create data loaders
    if args.height is None or args.width is None:
        args.height, args.width = (32, 100)

    dataset_info = DataInfo(args.voc_type)

    # Create model
    model = ModelBuilder(arch=args.arch, rec_num_classes=dataset_info.rec_num_classes,
                         sDim=args.decoder_sdim, attDim=args.attDim, max_len_labels=args.max_len,
                         eos=dataset_info.char2id[dataset_info.EOS], STN_ON=args.STN_ON)

    # Load from checkpoint
    if args.resume:
        checkpoint = load_checkpoint(args.resume)
        model.load_state_dict(checkpoint['state_dict'])

    # Define the device in both cases so the CPU path does not hit an undefined name.
    device = torch.device("cuda" if args.cuda else "cpu")
    if args.cuda:
        model = model.to(device)
        model = nn.DataParallel(model)

    # Evaluator
    model.eval()
    # args.image_path must be an option exposed by config.get_args.
    img = image_process(args.image_path)
    with torch.no_grad():
        img = img.to(device)
        input_dict = {}
        input_dict['images'] = img.unsqueeze(0)
        # TODO: testing should be more clean.
        # to be compatible with the lmdb-based testing, need to construct some meaningless variables.
        rec_targets = torch.IntTensor(1, args.max_len).fill_(1)
        rec_targets[:, args.max_len - 1] = dataset_info.char2id[dataset_info.EOS]
        input_dict['rec_targets'] = rec_targets
        input_dict['rec_lengths'] = [args.max_len]
        output_dict = model(input_dict)
        pred_rec = output_dict['output']['pred_rec']
        pred_str, _ = get_str_list(pred_rec, input_dict['rec_targets'], dataset=dataset_info)
        print('Recognition result: {0}'.format(pred_str[0]))


if __name__ == '__main__':
    # parse the config
    # os.environ['CUDA_VISIBLE_DEVICES'] = '8'
    # torch.backends.cudnn.enabled = False
    args = get_args(sys.argv[1:])
    main(args)
```

2. You also need to debug the `loss_embed` part of models/model_builder.py.
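To make the preprocessing in `image_process` easier to check without loading a model, here is a minimal pure-Python sketch of its arithmetic. The helper names `target_size` and `normalize` are made up for illustration and are not part of the repo; the sketch only assumes the defaults shown in the demo (height 32, width 100, `min_ratio=1`).

```python
import math


def target_size(width, height, img_h=32, img_w=100, keep_ratio=False, min_ratio=1):
    """Return the (width, height) an image would be resized to.

    Hypothetical helper mirroring the size logic of image_process.
    """
    if keep_ratio:
        ratio = width / float(height)
        img_w = int(math.floor(ratio * img_h))
        # Never shrink the width below img_h * min_ratio.
        img_w = max(img_h * min_ratio, img_w)
    return img_w, img_h


def normalize(value):
    """Map a pixel value in [0, 1] (as produced by ToTensor) to [-1, 1]."""
    return (value - 0.5) / 0.5


print(target_size(640, 64))                   # (100, 32): fixed size
print(target_size(640, 64, keep_ratio=True))  # (320, 32): aspect ratio kept
print(normalize(0.0), normalize(1.0))         # -1.0 1.0
```

With `keep_ratio=False` every image is squeezed to 100x32, which is what the demo does by default.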

lc1314555 commented 2 years ago

Thanks for your reply! I found this demo in ASTER, but it doesn't work. Can you tell me how to debug the `loss_embed` part of models/model_builder.py?

------------------ Original message ------------------ From: "Pay20Y/SEED"; Sent: Wednesday, November 3, 2021, 4:15 PM; Subject: Re: [Pay20Y/SEED] If I want to recognize a picture, how can I do it? Do I still want to generate LMDB format? Can you provide a predict interface? Thank you (#15)


Agiroy4712 commented 2 years ago

The error is:

    /SEED/lib/models/model_builder.py", line 73, in forward
        input_dict['rec_embeds']
    KeyError: 'rec_embeds'

You just need to comment out the code related to `rec_embeds`, such as lines 100, 102, 107, and 109 of model_builder.py. After I commented out these lines, my code ran fine.
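As an alternative to commenting lines out, the lookup could be made tolerant of the missing key. The following is only a sketch of that idea, not the repo's actual code: `unpack_inputs` is a hypothetical helper, and it assumes that `rec_embeds` is needed only for the embedding loss during training, which I have not verified against model_builder.py.

```python
def unpack_inputs(input_dict, training):
    """Hypothetical helper: read the model inputs from input_dict.

    'rec_embeds' is treated as optional at inference time (assumption).
    """
    images = input_dict['images']
    rec_targets = input_dict['rec_targets']
    rec_lengths = input_dict['rec_lengths']
    # dict.get returns None instead of raising KeyError when the key is absent.
    rec_embeds = input_dict.get('rec_embeds')
    if training and rec_embeds is None:
        raise KeyError('rec_embeds')
    return images, rec_targets, rec_lengths, rec_embeds


# Placeholder values standing in for the tensors built in demo.py.
demo_inputs = {'images': 'img', 'rec_targets': [1, 1, 0], 'rec_lengths': [3]}
_, _, _, embeds = unpack_inputs(demo_inputs, training=False)
print(embeds)  # None
```

Any code that later uses `rec_embeds` would then need a matching `if rec_embeds is not None:` guard.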

lc1314555 commented 2 years ago

Thanks for your reply! I solved my problem by following your suggestions. Thank you for being so patient with my question!

------------------ Original message ------------------ From: "Pay20Y/SEED"; Sent: Thursday, November 4, 2021, 2:51 PM; Subject: Re: [Pay20Y/SEED] If I want to recognize a picture, how can I do it? Do I still want to generate LMDB format? Can you provide a predict interface? Thank you (#15)
