MhLiao / MaskTextSpotter

A PyTorch implementation of Mask TextSpotter
https://github.com/MhLiao/MaskTextSpotter

test issue with TEST.IMS_PER_BATCH greater than 1 #22

Open sshl opened 4 years ago

sshl commented 4 years ago

Hi all, I'm running text spotting on batches of images with TEST.IMS_PER_BATCH = 16. An error is raised from the function process_char_mask in text_inference.py because the length of boxes doesn't match char_masks.shape[0]:

MaskTextSpotter/maskrcnn_benchmark/engine/text_inference.py", line 232, in process_char_mask
    box = list(boxes[index])
IndexError: index 3 is out of bounds for axis 0 with size 3
def process_char_mask(char_masks, boxes, threshold=192):
    texts, rec_scores, rec_char_scores, char_polygons = [], [], [], []
    for index in range(char_masks.shape[0]):
       ->  box = list(boxes[index])
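
Taken in isolation, the failure is just a length mismatch between the two arguments: the loop runs over char_masks.shape[0], but boxes has fewer rows. A toy snippet with made-up shapes (not the real tensors) reproduces the same IndexError:

```python
import numpy as np

# Made-up shapes that mimic the mismatch: char_masks has 4 entries,
# but boxes only has 3 rows, so index 3 is out of bounds.
char_masks = np.zeros((4, 37, 32, 32))
boxes = np.zeros((3, 4))

for index in range(char_masks.shape[0]):
    box = list(boxes[index])  # raises IndexError at index 3
```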

I tried to trace it back and found that compute_on_dataset only picks out the first element of the batch, as shown below in text_inference.py:

def compute_on_dataset(model, data_loader, device):
    model.eval()
    results_dict = {}
    cpu_device = torch.device("cpu")
    for i, batch in tqdm(enumerate(data_loader)):
        images, targets, image_paths = batch
        images = images.to(device)
        with torch.no_grad():
            predictions = model(images)
            if predictions is not None:
                global_predictions = predictions[0]
                char_predictions = predictions[1]
                char_mask = char_predictions['char_mask']
                boxes = char_predictions['boxes']
                seq_words = char_predictions['seq_outputs']
                seq_scores = char_predictions['seq_scores']
                detailed_seq_scores = char_predictions['detailed_seq_scores']
                global_predictions = [o.to(cpu_device) for o in global_predictions]
                results_dict.update(
                ->  {image_paths[0]: [global_predictions[0], char_mask, boxes, seq_words, seq_scores, detailed_seq_scores]}
                )
    return results_dict

Is it possible to get all results from the model's predictions? And how could we distinguish the char_mask, boxes, and words for each image in the batch?

Thanks!

MhLiao commented 4 years ago

@sshl The current code does not support batch inference yet. I will try to fix it when I am available. A pull request is welcome if you have already made it work.
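
For anyone attempting such a pull request, here is a minimal sketch of the bookkeeping side only. It assumes, hypothetically, that the model could be changed to return one char_predictions dict per image; collect_batch_results and char_predictions_per_image are invented names, not part of the current code. compute_on_dataset would then store one entry per image path instead of only image_paths[0]:

```python
import torch

def collect_batch_results(results_dict, image_paths, global_predictions,
                          char_predictions_per_image,
                          cpu_device=torch.device("cpu")):
    """Store one results entry per image path.

    char_predictions_per_image is a hypothetical list with one dict per image
    (keys 'char_mask', 'boxes', 'seq_outputs', 'seq_scores',
    'detailed_seq_scores'); the current model returns a single merged dict,
    so producing this list would require changes on the model side as well.
    """
    for i, path in enumerate(image_paths):
        chars = char_predictions_per_image[i]
        results_dict[path] = [
            global_predictions[i].to(cpu_device),
            chars['char_mask'],
            chars['boxes'],
            chars['seq_outputs'],
            chars['seq_scores'],
            chars['detailed_seq_scores'],
        ]
    return results_dict
```

Until something like this exists, keeping TEST.IMS_PER_BATCH at 1 appears to be the safe setting for inference.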

ChChwang commented 4 years ago

@MhLiao this error is raised while training: 'AssertionError: TEST.IMS_PER_BATCH ({}) must be divisible by the number'. Should TEST.IMS_PER_BATCH be set to the number of GPUs while training?
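
For reference, that assertion comes from the data-loader construction inherited from maskrcnn_benchmark (in maskrcnn_benchmark/data/build.py); the sketch below is a rough reconstruction, and the exact code may differ in this fork. It only requires TEST.IMS_PER_BATCH to be a multiple of the GPU count, so setting it equal to the number of GPUs (one image per GPU) satisfies it:

```python
# Rough reconstruction of the divisibility check behind the assertion
# (hedged; the exact upstream code and message may differ slightly).
def check_test_batch(images_per_batch, num_gpus):
    assert images_per_batch % num_gpus == 0, (
        "TEST.IMS_PER_BATCH ({}) must be divisible by the number "
        "of GPUs ({}) used.".format(images_per_batch, num_gpus)
    )
    return images_per_batch // num_gpus  # images per GPU

check_test_batch(16, 4)  # ok: 4 images per GPU
check_test_batch(16, 3)  # raises AssertionError
```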

tangtangqin commented 4 years ago

Hello, when running the test with TEST.IMS_PER_BATCH set to 1, this problem also occurs. How did you modify it?