Open DeclanE101 opened 2 years ago
Generally, adding your own data should give the best results. This is a rather tiny network compared to others (say U-Net) and learns rather quickly.
This is most likely a distribution shift in visual appearance between training (our data) and inference (your data). Introducing your data into training should allow the network to adapt; this could mean training exclusively on your data or on a mix of ours and yours.
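For what it's worth, here is a rough sketch of what mixing the two sets could look like once both are loaded as numpy arrays of frames and labels. The function name, argument names, and array layout are placeholders rather than this repo's actual training API, so adapt it to however you currently build a training set.

import numpy as np

def mix_datasets(our_frames, our_labels, your_frames, your_labels, seed=0):
    # Concatenate the two labeled sets and shuffle them together so each
    # training batch sees a blend of both visual appearances.
    frames = np.concatenate([our_frames, your_frames], axis=0)
    labels = np.concatenate([our_labels, your_labels], axis=0)
    order = np.random.default_rng(seed).permutation(len(frames))
    return frames[order], labels[order]

You would then feed the mixed arrays into whatever training entry point you already use.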
There are 2 other things that could potentially be done to see if they improve performance:
One is rescaling the pixel intensities of your frames, e.g.

frame = np.uint8(next(im_iter) * 128 / 190)  # scale intensities down by 128/190, nudging the frame's brightness toward our data's appearance
The network does apply per-image standardization, so this may not do much. However, other methods of altering the visual appearance to be closer to ours could work. The 2 distinct arena appearances present in the Full Model's training data are a white floor with gray walls and a gray floor with white walls.
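In case it is useful, here is a minimal sketch of one such appearance adjustment: histogram-matching each of your frames against a reference frame grabbed from one of those arenas. It assumes grayscale uint8 numpy arrays, and reference is a placeholder for a frame you would pull from our training videos yourself; this is not something the repo ships.

import numpy as np

def match_histogram(frame, reference):
    # Remap the frame's intensity distribution onto the reference frame's,
    # so your arena's brightness/contrast looks more like the training arenas.
    src_values, src_counts = np.unique(frame.ravel(), return_counts=True)
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)
    src_cdf = np.cumsum(src_counts) / frame.size
    ref_cdf = np.cumsum(ref_counts) / reference.size
    matched_values = np.interp(src_cdf, ref_cdf, ref_values)
    idx = np.searchsorted(src_values, frame.ravel())
    return np.uint8(matched_values[idx].reshape(frame.shape))

You would call it on each frame before handing it to the network, e.g. frame = match_histogram(frame, reference).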
Thank you for the response, let me try these few things and I'll comment again when I get some results.

I've been trying to use this code for inference on top-view mouse videos, using the Full Model pretrained checkpoint at 415000 steps. I've also tried the other pretrained models to see if I can get better results, but the Full Model has given me the best ones so far. I'm looking for advice on how to improve: is it better to train the model beyond 415000 steps, or is it more worthwhile to use the provided tools to create a dataset from our own data?

Our data has a few objects in the environment with the mouse, and this program is really promising because it has worked on data that includes mice and extra objects. Currently it predicts the mouse's position about 30% of the time, but I haven't been able to figure out how to do better. I was curious whether there is a way we could discuss this over Zoom or some other medium? I would really appreciate any help/advice!