jonfroehlich opened 5 years ago
I was thinking it may be worthwhile to run this for the labeling scenario; it won't take more than 5-6 hours to run. But to do that, we should run it on the ground truth labels that Esther and I made, and I want to run those on the final model first. I'm still creating crops for those right now; once they're done, we'll be able to run. Creating crops is slow, but once you have the crops, it's pretty fast to run with a new model.
Here are the results (for pre-crop):
| Dataset Size | 500 | 1000 | 5000 | 10000 | 25000 | 50000 | 100000 |
|---|---|---|---|---|---|---|---|
| Overall | 62.24 | 63.90 | 72.80 | 74.28 | 76.41 | 77.65 | 78.21 |
| Curb Ramp | 77.00 | 83.53 | 83.95 | 87.25 | 90.26 | 89.09 | 92.08 |
| Missing Ramp | 26.46 | 27.01 | 38.23 | 41.06 | 44.89 | 47.99 | 45.69 |
| Obstruction | 43.11 | 47.13 | 63.88 | 64.94 | 66.04 | 70.04 | 71.58 |
| Sfc Problem | 8.20 | 12.38 | 27.13 | 37.46 | 40.40 | 42.45 | 46.96 |
| Null Crop | 79.12 | 81.73 | 86.36 | 86.83 | 87.64 | 87.89 | 87.96 |
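To make the learning-curve trend in the table easier to eyeball, here's a minimal sketch that transcribes the pre-crop accuracies above and prints each class's gain from the smallest to the largest training set. (Illustrative only; the dictionary is just the table retyped, not tied to the experiment code.)

```python
# Per-class accuracy (%) vs. training-set size, transcribed from the
# pre-crop results table above (validation scenario).
SIZES = [500, 1000, 5000, 10000, 25000, 50000, 100000]
ACCURACY = {
    "Overall":      [62.24, 63.90, 72.80, 74.28, 76.41, 77.65, 78.21],
    "Curb Ramp":    [77.00, 83.53, 83.95, 87.25, 90.26, 89.09, 92.08],
    "Missing Ramp": [26.46, 27.01, 38.23, 41.06, 44.89, 47.99, 45.69],
    "Obstruction":  [43.11, 47.13, 63.88, 64.94, 66.04, 70.04, 71.58],
    "Sfc Problem":  [ 8.20, 12.38, 27.13, 37.46, 40.40, 42.45, 46.96],
    "Null Crop":    [79.12, 81.73, 86.36, 86.83, 87.64, 87.89, 87.96],
}

def gain(label):
    """Accuracy improvement (percentage points) from 500 to 100k crops."""
    accs = ACCURACY[label]
    return round(accs[-1] - accs[0], 2)

for label in ACCURACY:
    print(f"{label}: +{gain(label)} pts from 500 to 100000 crops")
```

The rarer classes (Sfc Problem, Missing Ramp, Obstruction) benefit far more from added training data than Curb Ramp or Null Crop, which start high and saturate early.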
Love it! We should discuss whether we need it for the labeling scenario as well--I'm leaning yes but not if it means we have to sacrifice, for example, the cross-city analysis...
Definitely not sacrificing cross-city analysis.
This is still an open question: I believe we only did this for the validation task (not the labeling task), so I'm marking it as future work (though said future work would not be for the ASSETS'19 camera-ready).
Certainly, although I believe our results on validation should give us an excellent estimate of our performance on the labeling task. I'd say lower priority.
@galenweld ran these experiments yesterday and briefly showed us graphs. I believe this was for pre-crop performance only (validation scenario). Could we copy those results as a table (and graph) into this GitHub issue?
Also, are we planning on running this experiment for the other scenario (labeling scenario)? I imagine this experiment will take significantly more time.