jonfroehlich opened 5 years ago
I was thinking it may be worthwhile to run this for the labeling scenario; it won't take more than 5-6 hours to run. But to do that, we should run it on the ground truth labels that Esther and I made, and I want to run those on the final model first. I'm still creating crops for those right now; once they're done, we'll be able to run. Creating crops is slow, but once you have the crops, it's pretty fast to run with a new model.
Here are the results (for pre-crop):
| Dataset Size | 500 | 1000 | 5000 | 10000 | 25000 | 50000 | 100000 |
|---|---|---|---|---|---|---|---|
| Overall | 62.24 | 63.90 | 72.80 | 74.28 | 76.41 | 77.65 | 78.21 |
| Curb Ramp | 77.00 | 83.53 | 83.95 | 87.25 | 90.26 | 89.09 | 92.08 |
| Missing Ramp | 26.46 | 27.01 | 38.23 | 41.06 | 44.89 | 47.99 | 45.69 |
| Obstruction | 43.11 | 47.13 | 63.88 | 64.94 | 66.04 | 70.04 | 71.58 |
| Sfc Problem | 8.20 | 12.38 | 27.13 | 37.46 | 40.40 | 42.45 | 46.96 |
| Null Crop | 79.12 | 81.73 | 86.36 | 86.83 | 87.64 | 87.89 | 87.96 |
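To make the learning-curve trend in the table easier to eyeball, here's a minimal sketch that transcribes the pre-crop accuracies above and prints each class's gain from the smallest to the largest training set. (Illustrative only; the dictionary is just the table retyped, not tied to the experiment code.)

```python
# Per-class accuracy (%) vs. training-set size, transcribed from the
# pre-crop results table above (validation scenario).
SIZES = [500, 1000, 5000, 10000, 25000, 50000, 100000]
ACCURACY = {
    "Overall":      [62.24, 63.90, 72.80, 74.28, 76.41, 77.65, 78.21],
    "Curb Ramp":    [77.00, 83.53, 83.95, 87.25, 90.26, 89.09, 92.08],
    "Missing Ramp": [26.46, 27.01, 38.23, 41.06, 44.89, 47.99, 45.69],
    "Obstruction":  [43.11, 47.13, 63.88, 64.94, 66.04, 70.04, 71.58],
    "Sfc Problem":  [ 8.20, 12.38, 27.13, 37.46, 40.40, 42.45, 46.96],
    "Null Crop":    [79.12, 81.73, 86.36, 86.83, 87.64, 87.89, 87.96],
}

def gain(label):
    """Accuracy improvement (percentage points) from 500 to 100k crops."""
    accs = ACCURACY[label]
    return round(accs[-1] - accs[0], 2)

for label in ACCURACY:
    print(f"{label}: +{gain(label)} pts from 500 to 100000 crops")
```

The rarer classes (Sfc Problem, Missing Ramp, Obstruction) benefit far more from added training data than Curb Ramp or Null Crop, which start high and saturate early.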
Love it! We should discuss whether we need it for the labeling scenario as well--I'm leaning yes but not if it means we have to sacrifice, for example, the cross-city analysis...
Definitely not sacrificing cross-city analysis.
This is still an open question: I believe we only did this for the validation task (not the labeling task), so I'm marking it as future work (though said future work would not be for the ASSETS'19 camera-ready).
Certainly, although I believe our results on validation should give us an excellent estimate of our performance on the labeling task. I'd say lower priority.
@galenweld ran these experiments yesterday and briefly showed us graphs. I believe this was for pre-crop performance only (validation scenario). Could we copy those results as a table (and graph) into this GitHub issue?
Also, are we planning on running this experiment for the other scenario (labeling scenario)? I imagine this experiment will take significantly more time.