FIRST-Tech-Challenge / fmltc

FIRST Machine Learning Toolchain

0 steps processed in 20 minutes #233

Open ketterrm opened 2 years ago

ketterrm commented 2 years ago

We were running through a demo tonight, using a video with just 31 frames that we had practiced with yesterday, to show how the process works end to end. When we tried to generate the model, we asked for 1000 steps, which we had seen finish in less than 20 minutes before. But tonight it didn't process any steps and timed out at the 20-minute limit. Our time allotment had just been reset to 300 total minutes. Can you look into whether there is some reason that nothing would process when it worked yesterday?

ketterrm commented 2 years ago

I tried again tonight and the model ran 1000 steps in 17 min 34 sec, as originally expected. We just reran the job and started seeing steps returned after a few minutes. With a limit of 300 minutes, it was hard to see 20 minutes of our allocation go away with no results to show for it. The job with 0 successful steps was run from a student's account, whereas the successful rerun was from a coach's account, but I'm not sure if that has anything to do with it.

texasdiaz commented 2 years ago

There is no difference between a student account and a coach account in ftc-ml. Training requires at least as many frames as the batch size of the starter model you chose; for most models that's 32. If you chose a 320x320 starter model and it allowed you to train, that's a bug.
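As a rough illustration of the frame-count rule described here, the check might look something like the sketch below. This is not fmltc's actual code: the function name `has_enough_frames` and the default of 32 are assumptions made for this example, and the real batch size depends on the starter model.

```python
# Hypothetical sketch of the minimum-frames rule; not taken from fmltc.
# Assumption: a training job needs at least `batch_size` frames to start.

def has_enough_frames(num_frames: int, batch_size: int = 32) -> bool:
    """Return True if there are at least `batch_size` frames to train with."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return num_frames >= batch_size


if __name__ == "__main__":
    # The two frame counts mentioned in this thread.
    for frames in (31, 33):
        print(frames, has_enough_frames(frames))
```

Under that assumption, a 31-frame dataset would be rejected for a batch size of 32, while a 33-frame dataset would pass.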

Just FYI, if you're using the ftc-ml instance rather than a self-hosted instance, you should be using the help/feedback forums for help.

ketterrm commented 2 years ago

We had 33 frames to train with, so we didn't run into the scenario of having fewer than 32 frames.