kukuruza / City-Project

Analyze traffic given a set of optical cameras in urban areas

Collecting training data for Faster-RCNN #44

Open kukuruza opened 9 years ago

kukuruza commented 9 years ago

I'm going to move the conversation from gtalk to here.
@Lotuslisa wrote:

Do you have any ideas for labeling the data? Will we still use labelme?

I have two ideas to do that automatically:

  1. Train Faster-RCNN on labelled frames. ALL cars in a frame must be labelled, so we need to save reliable frames. Run two detectors: the background detector and Faster-RCNN with the original model. Check whether both detect exactly the same cars (code is ready in commit 7144f5619e9bb2320757857bb89e8602f85bde3c). If so, save the frame. If at least one car is missed by at least one detector, then something is not so easy in this frame, so discard it.
     a) + easy to implement
     b) - will take a lot of time
     c) - maybe such easy frames don't even exist
     d) - only easy frames will be saved
  2. Modify Faster-RCNN to train on cars instead of frames. It's possible according to the paper. Here we need to save reliable cars, not frames. Run two detectors: the background detector and Faster-RCNN with the original model. Check whether some cars are detected by both. If there are any, save these cars and the frame.
     a) - need to understand and modify the Faster-RCNN code
     b) + less data to save

After either of these methods, a human should look at the saved data and remove the bad samples.
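A minimal sketch of the agreement check behind both ideas, assuming each detector returns axis-aligned boxes as (x1, y1, x2, y2). This is not the code from the commit above: the function names, the greedy matching, the 0.5 IoU threshold, and the choice to reject empty frames are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_detections(bg_boxes: List[Box], rcnn_boxes: List[Box],
                     iou_thresh: float = 0.5) -> List[Tuple[int, int]]:
    """Greedy one-to-one matching; returns (bg_index, rcnn_index) pairs."""
    pairs = []
    used = set()
    for i, a in enumerate(bg_boxes):
        best_j, best_iou = -1, iou_thresh
        for j, b in enumerate(rcnn_boxes):
            if j in used:
                continue
            value = iou(a, b)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            used.add(best_j)
            pairs.append((i, best_j))
    return pairs


def frame_is_reliable(bg_boxes: List[Box], rcnn_boxes: List[Box],
                      iou_thresh: float = 0.5) -> bool:
    """Idea 1: keep the frame only if every car is found by both detectors.

    Assumption: frames with no detections at all are rejected.
    """
    pairs = match_detections(bg_boxes, rcnn_boxes, iou_thresh)
    return len(bg_boxes) > 0 and len(pairs) == len(bg_boxes) == len(rcnn_boxes)


def reliable_cars(bg_boxes: List[Box], rcnn_boxes: List[Box],
                  iou_thresh: float = 0.5) -> List[Box]:
    """Idea 2: keep only the cars that both detectors agree on."""
    pairs = match_detections(bg_boxes, rcnn_boxes, iou_thresh)
    return [rcnn_boxes[j] for _, j in pairs]
```

With this split, idea 1 calls `frame_is_reliable` and saves or discards the whole frame, while idea 2 calls `reliable_cars` and saves whatever subset it returns, even if other cars in the frame were missed.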

Lotuslisa commented 9 years ago

Yeah, I agree with the idea that we first use Faster-RCNN to detect cars on our dataset, and then let a human correct the detections. I have talked with several people working on similar tasks. They suggest using Amazon Mechanical Turk or finding a professional company to label for us.

Lotuslisa commented 9 years ago

My friend introduced me to a company in Hong Kong that labels data for researchers. The rate is 50 rmb per hour.

kukuruza commented 9 years ago

how much is that?

Anyway, an automatic {detection -> human pruning -> training -> detection} loop is scalable and publishable. It appears in existing papers (e.g. http://cvrc.ece.utexas.edu/Publications/tamersoy_avss2009.pdf), and "miovision" uses this approach in industry.

But I think manual labelling is important for difficult conditions, such as dense traffic.

Lotuslisa commented 9 years ago

50 rmb per hour

kukuruza commented 9 years ago

what is rmb?

Lotuslisa commented 9 years ago

Yuan, the Chinese currency. 1 dollar = 6.2 rmb.

Lotuslisa commented 9 years ago

Yeah, you are right. This loop {detection -> human pruning -> training -> detection} is feasible for our project. We can check all the cameras, split them into several groups, and train a general model for each group. In this way, we do not need to train 500 models (one model per camera).

kukuruza commented 9 years ago

in this way, we do not need to train 500 models

Oh, that's totally so. We're doing that even for Viola-Jones. It may be better to split models by e.g. time of day, but for now, just 1 model for normal conditions.

Per hour of what? An hour of one person's work? We spent 8 hours labelling 100 frames. At 50 rmb per hour that would be 400 rmb, or roughly $65. Too much.

Lotuslisa commented 9 years ago

Just per hour, no matter how many people work on it. For example, say we need to label 5000 images. They will give an estimate of how many hours it will take, and then we pay them hours * 50 rmb. They said it takes 5 s to draw a bounding box around a car.

kukuruza commented 9 years ago

Then 50/6.2 dollars/hour / 3600 sec/hour * 5 sec/bbox * 10 bboxes/frame ~ 10 cents/frame. That's better, and similar to Mechanical Turk.
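The same estimate as a small script, so the rate or the number of boxes per frame can be changed easily. The function and parameter names are made up for illustration; the numbers are the ones quoted in this thread.

```python
def cost_per_frame_usd(rate_rmb_per_hour: float, rmb_per_usd: float,
                       sec_per_bbox: float, bboxes_per_frame: float) -> float:
    """Labelling cost of one frame in US dollars."""
    usd_per_sec = rate_rmb_per_hour / rmb_per_usd / 3600.0
    return usd_per_sec * sec_per_bbox * bboxes_per_frame


# 50 rmb/hour, 6.2 rmb per dollar, 5 sec per bbox, 10 bboxes per frame
cost = cost_per_frame_usd(50, 6.2, 5, 10)
print(f"{cost * 100:.1f} cents/frame")  # prints "11.2 cents/frame"
```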

kukuruza commented 8 years ago

Alternative 2) from the first comment was chosen. Again, we only need to label the reliable cars in every image, and use a mask to hide the other unlabelled foreground. I implemented the use of masks inside the network in https://github.com/kukuruza/py-faster-rcnn
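One way to picture the mask, as a rough sketch only. This is not the implementation in the fork linked above. It assumes a binary mask with 1 wherever unlabelled foreground is hidden, and it drops candidate training boxes that overlap the hidden area too much, so that an unlabelled car is never used as a negative example. The names and the 0.3 threshold are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, x2 and y2 exclusive
Mask = List[List[int]]           # mask[y][x] == 1 where the image is hidden


def masked_fraction(box: Box, mask: Mask) -> float:
    """Fraction of the box area that falls on hidden pixels."""
    x1, y1, x2, y2 = box
    area = (x2 - x1) * (y2 - y1)
    if area <= 0:
        return 0.0
    hidden = sum(mask[y][x] for y in range(y1, y2) for x in range(x1, x2))
    return hidden / area


def usable_candidates(candidates: List[Box], mask: Mask,
                      max_hidden: float = 0.3) -> List[Box]:
    """Keep candidate boxes that lie mostly outside the hidden area."""
    return [b for b in candidates if masked_fraction(b, mask) <= max_hidden]
```

Boxes that pass the filter can then be treated as ordinary positives or negatives against the labelled reliable cars, while everything under the mask is simply ignored.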