Closed: @panichem closed this issue 2 years ago.
Hey @panichem!
Are you able to use the sample dataset from the tutorial?
If so, or if you just want to give it a quick try, does training a bottom-up multi-animal model work? (This works for single animals as well.)
Give those a spin, and if neither works, do you mind sharing the video + .slp file with talmo@salk.edu?
Talmo
Hi @panichem,
I also came across this when I created a model whose receptive field (RF) size was relatively small compared to the overall frame size. You could try lowering the input scaling to ~0.5 (which increases the effective RF size relative to the frame) and see how that affects the first-epoch training time. Please let us know if any of these solutions work.
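To make the scaling/RF relationship concrete: the model's receptive field is fixed in model-input pixels, so downscaling the input makes that same RF cover a larger fraction of the original frame. A quick sketch with made-up numbers (the 76 px RF and 1024 px frame are purely illustrative, not SLEAP defaults):

```python
# Hypothetical numbers for illustration only.
rf_model_px = 76    # receptive field size in model-input pixels (assumed)
frame_px = 1024     # original frame width in pixels (assumed)

def effective_rf(input_scale):
    """RF size expressed in original-frame pixels for a given input scaling."""
    return rf_model_px / input_scale

for scale in (1.0, 0.5):
    rf = effective_rf(scale)
    print(f"input scale {scale}: RF covers {rf:.0f}/{frame_px} px "
          f"({100 * rf / frame_px:.0f}% of frame width)")
```

Halving the input scale doubles the fraction of the frame each output pixel can "see" (7% vs. 15% of frame width here), which is why it helps when the RF is small relative to the animals.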
Thanks, Liezl
@talmo @roomrys - I switched to a bottom-up model and changed the input scaling to 0.5, and now the first ~10 epochs finish in a few minutes. Thanks for your help!!
Marking this as a TODO since there is a workaround, but we still need to find the root cause (and prevent it from happening).
The fact that there weren't any errors and that training never actually started makes me think it's a TensorFlow deadlock.
We've run into this in the past (see attempted fixes in https://github.com/talmolab/sleap/commit/613c20119e992a0a3309cf0e99a8648cc6818cb0 and https://github.com/talmolab/sleap/commit/492b67b6b0325fa0f46e6abcbf7fef5e580a5bde). I think it's related to how we use tf.py_function
-- there's a thread about it over in https://github.com/tensorflow/tensorflow/issues/32454, but no solution.
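For context, here is a minimal sketch of the general pattern in question (not SLEAP's actual code): tf.py_function escapes the TF graph to call arbitrary Python inside a tf.data pipeline, which is where the hangs reported in tensorflow/tensorflow#32454 tend to surface.

```python
import tensorflow as tf

def py_augment(x):
    # Arbitrary Python-side work; runs under the GIL, outside the TF graph.
    return x * 2

def tf_augment(x):
    # tf.py_function calls back into the Python interpreter from the input
    # pipeline's worker threads -- the pattern suspected of deadlocking.
    return tf.py_function(func=py_augment, inp=[x], Tout=tf.int64)

ds = tf.data.Dataset.range(4).map(tf_augment)
print(list(ds.as_numpy_iterator()))
```

The Python callback serializes on the GIL and mixes poorly with forked worker processes, which is consistent with a stochastic, system-dependent hang rather than a hard error.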
In the past I've had a hard time reliably reproducing this -- it seems to be stochastic and possibly system-dependent -- so let's close this for now and revisit it if more people run into the same problem.
Also moving this to Discussions so folks see it when asking questions.
Thanks for the report @panichem!
Hey @talmo !
I'm working through the SLEAP tutorial on a PC with a decent GPU, but the initial training step is taking a surprisingly long time:
Here's the dump from the terminal -- I can't see anything unusual in it. Any idea what I should do differently?