dbolya / yolact

A simple, fully convolutional model for real-time instance segmentation.
MIT License

How to improve? #227

Open niliuxi opened 4 years ago

niliuxi commented 4 years ago

Calculating mAP...

       |  all  |  .50  |  .55  |  .60  |  .65  |  .70  |  .75  |  .80  |  .85  |  .90  |  .95  |
-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
   box | 12.25 | 19.34 | 19.34 | 19.34 | 19.34 | 16.96 | 16.96 | 10.43 |  0.83 |  0.00 |  0.00 |
  mask | 16.04 | 19.34 | 19.34 | 19.34 | 19.34 | 16.96 | 16.96 | 16.96 | 16.96 | 15.18 |  0.00 |
-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
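(Side note for readers: the `all` column here is just the mean of the ten per-threshold columns, IoU = .50:.05:.95, which the table's own numbers confirm:)

```python
# AP values copied from the per-threshold columns of the table above.
box_ap  = [19.34, 19.34, 19.34, 19.34, 16.96, 16.96, 10.43, 0.83, 0.00, 0.00]
mask_ap = [19.34, 19.34, 19.34, 19.34, 16.96, 16.96, 16.96, 16.96, 15.18, 0.00]

# The "all" column is the mean over the ten IoU thresholds .50:.05:.95.
print(round(sum(box_ap) / len(box_ap), 2))    # 12.25
print(round(sum(mask_ap) / len(mask_ap), 2))  # 16.04
```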

This is my result. How can I improve it?

I only trained one class; shouldn't the result be higher?

dbolya commented 4 years ago

Sorry, can you go over exactly what your training procedure was, what data you're using, and what your parameters are?

niliuxi commented 4 years ago

I used a COCO-format dataset that I made myself, and the parameters were not modified.

dbolya commented 4 years ago

I mean: what kind of dataset is it, what does your dataset config look like, and what training command did you use? mAP that low might indicate that you set something up wrong.

niliuxi commented 4 years ago

I'm using a COCO-format dataset, which has a hundred images.

    my_custom_dataset = dataset_base.copy({
        'name': 'My Dataset',

        'train_images': './data/coco/images/',
        'train_info':   './data/coco/annotations/train2.json',

        'valid_images': './data/coco/images/',
        'valid_info':   './data/coco/annotations/train1.json',

        'has_gt': True,
        'class_names': ('zao'),
        'label_map': {1: 1}
    })

    yolact_base_config = coco_base_config.copy({
        'name': 'yolact_base',

        # Dataset stuff
        'dataset': my_custom_dataset,
        'num_classes': len(my_custom_dataset.class_names) + 1,

I modified these.

niliuxi commented 4 years ago

I found the mistake. Thank you. By the way, what do B, C, M, S and T mean, respectively? And how can I get the position information of the target object?

dbolya commented 4 years ago

Yeah, your mistake was 'class_names': ('zao'); it should be 'class_names': ('zao',) (just pointing that out in case you found a different mistake).
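For anyone else hitting this: the trailing comma matters because Python treats ('zao') as a plain parenthesized string, not a tuple, which silently breaks the num_classes computation in the config. A quick check:

```python
# Without the trailing comma, ('zao') is just a string, so
# len(class_names) counts characters (3) instead of classes (1),
# and 'num_classes' = len(...) + 1 ends up as 4 instead of 2.
wrong = ('zao')    # actually the string 'zao'
right = ('zao',)   # a real 1-element tuple

print(type(wrong).__name__, len(wrong) + 1)  # str 4
print(type(right).__name__, len(right) + 1)  # tuple 2
```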

The loss codes are as follows:


        # Loss Key:
        #  - B: Box Localization Loss
        #  - C: Class Confidence Loss
        #  - M: Mask Loss
        #  - P: Prototype Loss
        #  - D: Coefficient Diversity Loss
        #  - E: Class Existence Loss
        #  - S: Semantic Segmentation Loss

(The new one, I, is the Mask IoU Loss.)
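Those letters are exactly what shows up in each training log line (e.g. `[ 0] 0 || B: 5.137 | C: 5.871 | ... || T: 18.476`, where T is the total). A small, hypothetical helper (not part of the repo) for pulling the values out of such a line:

```python
import re

def parse_losses(line):
    """Extract the 'K: value' loss pairs from a YOLACT training log line.

    Matches single uppercase letters followed by ': <number>'; lowercase
    fields like 'timer:' and the 'ETA:' timestamp are not matched.
    """
    return {k: float(v) for k, v in re.findall(r'\b([A-Z]): ([\d.]+)', line)}

line = '[ 0] 0 || B: 5.137 | C: 5.871 | M: 6.755 | S: 0.713 | T: 18.476 || ETA: 0:37:18 || timer: 9.327'
print(parse_losses(line))
# {'B': 5.137, 'C': 5.871, 'M': 6.755, 'S': 0.713, 'T': 18.476}
```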

And where would you like to get this target information (you mean getting the x and y of the gt)?

niliuxi commented 4 years ago

I've found that when I use the version of your project released in September I get higher results, but when I use the updated version, the results are very low.

Yes, I want to get the coordinates of the bounding box when testing. By the way, can the test results be visualized?

niliuxi commented 4 years ago

Hello dbolya, I'd like to know what the following output means:

     Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.745
     Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.941
     Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.903
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
     Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.730
     Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.800
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.744
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.800
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.800
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
     Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.769
     Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.837

dbolya commented 4 years ago

The numbers you care about there are the top one (0.745 = 74.5 in my eval.py) and the two under it (94.1 and 90.3). Since the maximum AP is 1 (or 100 in the way I report the numbers), those scores are pretty good.

Are you sure there's that much difference between the September version and the current version? That'd be worrisome.

edwardnguyen1705 commented 4 years ago

I found the mistake. Thank you. By the way, what do B, C, M, S and T mean, respectively? And how can I get the position information of the target object?

I face a similar problem. What is the mistake you found?

pankaja0285 commented 3 years ago

I am facing a problem training the model for the first time. In my case I have one class, which I set as "class_names": ("Dauca",), and I ran with image max_size = 850 and iterations = 60.

I get the following error:

    Scaling parameters by 0.25 to account for a batch size of 2.
    Per-GPU batch size is less than the recommended limit for batch norm. Disabling batch norm.
    train image_path=../dataset/train
    loading annotations into memory...
    Done (t=0.44s)
    creating index...
    index created!
    loading annotations into memory...
    Done (t=0.17s)
    creating index...
    index created!
    val image_path=../dataset/val
    /home/ec2-user/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/torch/jit/_recursive.py:165: UserWarning: 'lat_layers' was found in ScriptModule constants, but it is a non-constant submodule. Consider removing it.
      " but it is a non-constant {}. Consider removing it.".format(name, hint))
    /home/ec2-user/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/torch/jit/_recursive.py:165: UserWarning: 'pred_layers' was found in ScriptModule constants, but it is a non-constant submodule. Consider removing it.
      " but it is a non-constant {}. Consider removing it.".format(name, hint))
    /home/ec2-user/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages/torch/jit/_recursive.py:165: UserWarning: 'downsample_layers' was found in ScriptModule constants, but it is a non-constant submodule. Consider removing it.
      " but it is a non-constant {}. Consider removing it.".format(name, hint))
    Initializing weights...
    cfg max_size: 850
    Start iteration: 0
    Max iterations: 240.0
    len(dataset): 441
    args.batch_size: 2
    epoch_size: 220
    calculated num_epochs= 2
    Begin training!

    Number of epochs: 2
    Contd. iter counter: 0
    on check cond-> (epoch+1)*epoch_size < iteration
    [W TensorIterator.cpp:918] Warning: Mixed memory format inputs detected while calling the operator. The operator will output contiguous tensor even if some of the inputs are in channels_last format. (function operator())
    At epoch:0 in iteration:0
    [ 0]  0 || B: 5.137 | C: 5.871 | M: 6.755 | S: 0.713 | T: 18.476 || ETA: 0:37:18 || timer: 9.327
    At epoch:0 in iteration:10
    [ 0] 10 || B: 4.468 | C: 4.056 | M: 5.778 | S: 0.539 | T: 14.841 || ETA: 0:07:09 || timer: 0.267
    At epoch:0 in iteration:20
    [ 0] 20 || B: 4.380 | C: 3.590 | M: 5.581 | S: 0.386 | T: 13.936 || ETA: 0:06:46 || timer: 5.309
    At epoch:0 in iteration:30
    [ 0] 30 || B: 4.411 | C: 3.377 | M: 5.226 | S: 0.308 | T: 13.321 || ETA: 0:05:16 || timer: 0.270
    At epoch:0 in iteration:40
    [ 0] 40 || B: 4.367 | C: 3.227 | M: 5.029 | S: 0.270 | T: 12.893 || ETA: 0:04:29 || timer: 0.239
    At epoch:0 in iteration:50
    [ 0] 50 || B: 4.319 | C: 3.125 | M: 4.900 | S: 0.246 | T: 12.589 || ETA: 0:04:12 || timer: 0.510
    At epoch:0 in iteration:60
    [ 0] 60 || B: 4.242 | C: 3.032 | M: 4.866 | S: 0.220 | T: 12.360 || ETA: 0:03:57 || timer: 0.247
    At epoch:0 in iteration:70
    [ 0] 70 || B: 4.262 | C: 2.977 | M: 4.783 | S: 0.209 | T: 12.231 || ETA: 0:03:28 || timer: 0.247
    At epoch:0 in iteration:80
    [ 0] 80 || B: 4.275 | C: 2.928 | M: 4.728 | S: 0.198 | T: 12.129 || ETA: 0:03:13 || timer: 4.119
    Traceback (most recent call last):
      File "/home/ec2-user/anaconda3/envs/tensorflow2_p36/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
        obj = _ForkingPickler.dumps(obj)
      File "/home/ec2-user/anaconda3/envs/tensorflow2_p36/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
        cls(buf, protocol).dump(obj)
    _pickle.PicklingError: Can't pickle <class 'numpy.core._exceptions.UFuncTypeError'>: it's not the same object as numpy.core._exceptions.UFuncTypeError

Please help. If I go below 60 for max_iterations I get output, but the mAP values are very poor.
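A note on that traceback (an educated guess, since the pickling error hides the real one): the exception that actually failed inside the DataLoader worker is numpy's UFuncTypeError, which usually means an operation mixed incompatible dtypes somewhere in the data or augmentation pipeline. Running with the DataLoader workers disabled (e.g. train.py's --num_workers flag set to 0, if your version has it) typically lets the real traceback surface in the main process. A minimal, self-contained reproduction of that exception type:

```python
import numpy as np

# UFuncTypeError is raised when a ufunc receives incompatible dtypes,
# e.g. arithmetic between a string array and a number. It subclasses
# TypeError, so catching TypeError works across numpy versions.
try:
    np.array(['1', '2']) + 1
    raised = None
except TypeError as exc:
    raised = exc

print(raised is not None)  # True
```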