avanetten / simrdwn

Rapid satellite imagery object detection

Amount of data for training #6

Closed Raj800 closed 5 years ago

Raj800 commented 5 years ago

How much data is required in general for training? Also, if possible, can you share the pretrained weights file?

alex-service-ml commented 5 years ago

I doubt you'll get a response on the weights file. I've asked before and didn't receive one. After looking into it, I think the training data might be under NDA or otherwise not publicly available, so it's unlikely the trained model will be shared. Depending on what you want to do, I recommend checking out COWC (Cars Overhead With Context), SpaceNet, or xView as possible datasets. In my experience, a couple hundred samples will start to give good results, although you'll probably spend a lot of time preprocessing the data to figure out what works well for you.

Raj800 commented 5 years ago

I have started training on 700 objects of a single category for 60,000 epochs. Let's hope it works! It took around 6 hours for 700 epochs, so it's going to take a lot of time.
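For a sense of scale, the figures quoted above can be extrapolated linearly. This is only a back-of-the-envelope sketch: it assumes the per-iteration time stays constant for the whole run, and the variable names are illustrative.

```python
# Rough extrapolation of total training time from the figures in the comment
# above (6 hours for the first 700 of 60,000 iterations). Assumes a constant
# time per iteration, which may not hold in practice.
hours_elapsed = 6.0
iterations_done = 700
iterations_total = 60_000

hours_per_iteration = hours_elapsed / iterations_done
total_hours = hours_per_iteration * iterations_total
total_days = total_hours / 24

print(f"{hours_per_iteration * 3600:.1f} s per iteration")
print(f"{total_hours:.0f} hours total, about {total_days:.1f} days")
```

Under that assumption the full run would take roughly 514 hours, or about three weeks.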

avanetten commented 5 years ago

We've been busy updating the code and writing papers, and updated examples (and weights) will be uploaded in the near future. In the meantime, Tables 2 and 3 of https://arxiv.org/abs/1805.09512 give an idea of what you can expect for performance versus training size.

alex-service-ml commented 5 years ago

That's awesome to hear!

Raj800 commented 5 years ago

That's great!

Raj800 commented 5 years ago

Just to make sure I am not doing something wrong, can you take a look at this log?

Batch Num: 3363 / 60000
3363: 0.000835, 0.004495 avg, 0.001000 rate, 49.680988 seconds, 215232 images
Loaded: 122.635376 seconds
Batch Num: 3364 / 60000
3364: 0.001208, 0.004166 avg, 0.001000 rate, 49.414196 seconds, 215296 images
Loaded: 124.553726 seconds
Batch Num: 3365 / 60000
bj: 0.000926, No Obj: 0.000978, Avg Recall: 0.000000, count: 1
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: 0.032712, Class: 0.976418, Obj: 0.000913, No Obj: 0.000979, Avg Recall: 0.000000, count: 1
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: 0.509558, Class: 0.976279, Obj: 0.000927, No Obj: 0.000979, Avg Recall: 1.000000, count: 1
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: 0.072283, Class: 0.976911, Obj: 0.000909, No Obj: 0.000979, Avg Recall: 0.000000, count: 2
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000977, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000978, Avg Recall: -nan, count: 0
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: 0.000000, Class: 0.977706, Obj: 0.000875, No Obj: 0.000978, Avg Recall: 0.000000, count: 1
Region Avg IOU: 0.000000, Class: 0.976226, Obj: 0.000937, No Obj: 0.000977, Avg Recall: 0.000000, count: 1
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count
3365: 0.001845, 0.003934 avg, 0.001000 rate, 49.724026 seconds, 215360 images
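
One way to sanity-check a log like the one above is to tally how many `Region` rows report `-nan` and whether any of them have a non-zero `count`. In darknet-style output, `-nan` next to `count: 0` is commonly read as "no labeled objects in that sub-batch" (an average over zero items) rather than a diverging loss, but treat that reading as an assumption to verify against your own setup. The sketch below is a hypothetical helper, not part of simrdwn; the regular expressions are written against the line shapes visible in this thread only.

```python
import re

# A few rows copied from the log above, used as a small sample.
SAMPLE_LOG = """\
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
Region Avg IOU: 0.032712, Class: 0.976418, Obj: 0.000913, No Obj: 0.000979, Avg Recall: 0.000000, count: 1
Region Avg IOU: 0.509558, Class: 0.976279, Obj: 0.000927, No Obj: 0.000979, Avg Recall: 1.000000, count: 1
Region Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000979, Avg Recall: -nan, count: 0
3365: 0.001845, 0.003934 avg, 0.001000 rate, 49.724026 seconds, 215360 images
"""

# Captures the IOU field and the object count of each "Region" row.
REGION_RE = re.compile(r"Region Avg IOU: ([^,\s]+),.*?count: (\d+)")
# Captures batch number, batch loss, and running average loss.
BATCH_RE = re.compile(r"^(\d+): ([\d.]+), ([\d.]+) avg", re.MULTILINE)


def summarize(log_text):
    """Count nan rows and collect per-batch losses from a training log."""
    regions = [(iou, int(count)) for iou, count in REGION_RE.findall(log_text)]
    nan_rows = [row for row in regions if "nan" in row[0]]
    # nan rows that DO contain objects would be the worrying case.
    nan_with_objects = [row for row in nan_rows if row[1] > 0]
    batches = [
        (int(num), float(loss), float(avg))
        for num, loss, avg in BATCH_RE.findall(log_text)
    ]
    return {
        "region_rows": len(regions),
        "nan_rows": len(nan_rows),
        "nan_rows_with_objects": len(nan_with_objects),
        "batches": batches,
    }


summary = summarize(SAMPLE_LOG)
print(summary)
```

If `nan_rows_with_objects` stays at zero while the running average loss keeps falling, the nans are more likely a sign of sparse labels per sub-batch than of a broken training run.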

avanetten commented 5 years ago

Give https://github.com/avanetten/simrdwn/blob/master/core/prep_data_cowc.py a try; hopefully this will clear up your NaN issue. As for how much training data you need, a few dozen examples are often enough (see https://arxiv.org/pdf/1805.09512, Table 2).