Open JBChaz opened 3 years ago
Hmm, this is interesting. 350K examples should be enough to perform well even on rare behaviors. Can you upload the config.yaml from your last model training?
To answer your questions: change `feature_extractor.final_activation` from `sigmoid` to `softmax` in your project configuration, as well as `sequence.final_activation` from `sigmoid` to `softmax`.

config.yaml from last feature extractor training: config - feature extractor.txt
config.yaml from last sequence network training: config sequence network.txt
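The suggested change, sketched as a project-config fragment (the exact key nesting is an assumption; verify against the config.yaml your project actually generates):

```yaml
# assumed layout of the project config.yaml; check your own file
feature_extractor:
  final_activation: softmax   # was: sigmoid
sequence:
  final_activation: softmax   # was: sigmoid
```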
I will try the change from sigmoid to softmax.
Thanks for your help!
I am wondering if the low F1 scores are related to a warning I get during feature extractor inference. I wrote another ticket about this warning because I am not sure whether it is related to this problem at all:
https://github.com/jbohnslav/deepethogram/issues/69#issue-938983021
Edit: I found out that the labels of some videos had accidentally been shifted by a few frames. After correcting this I get much better predictions!
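For anyone hitting the same issue: a constant frame offset in per-frame labels can be undone with a simple array roll. This is a generic sketch, not DEG code; the `shift` value and the zero-fill at the end are assumptions about how the labels were offset.

```python
import numpy as np

# hypothetical per-frame binary labels that ended up 2 frames
# later than the video they describe
labels = np.array([0, 0, 1, 1, 0, 0])
shift = 2

# move the labels 2 frames earlier; the frames that roll off the
# end are filled with the background class (0)
corrected = np.roll(labels, -shift)
corrected[-shift:] = 0
# corrected is now [1, 1, 0, 0, 0, 0]
```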
Really glad it works better! I am defending my thesis in a few weeks so I have very limited time at the moment. I'll take a closer look at this in August.
Hey Jim,
I am working with DEG to get it to score drinking bouts in behavioral experiments with birds. I first trained all 3 models on 15 manually labelled videos (~150 000 frames), then let DEG predict on 5 new videos, corrected the labels, and trained the feature extractor and the sequence network again with those additional videos. I did this 4 times (so the last training used a dataset of 35 videos / ~350 000 frames), hoping to get more precise predictions.
Adding this new data may have improved accuracy a bit, but looking at the F1 scores there isn't really any improvement between the last and the very first training (F1 hovers between 0.15 and 0.3 on the validation dataset).
Feature extractor figures first training:
Feature extractor figures fifth training:
The behavior I am looking at occurs in less than 1% of frames, and I understood from the paper that the rarer the behavior, the more difficult it is to achieve high F1 scores. Do you have any suggestions on how to further improve the F1 scores / prediction performance?
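To illustrate why rare behaviors cap F1: with ~1% positive frames, even a small number of false positives dominates precision, so F1 stays low despite good recall. The counts below are made up purely for illustration.

```python
# hypothetical counts from 1000 frames with 10 true positive frames (1% base rate)
tp, fp, fn = 8, 30, 2

precision = tp / (tp + fp)   # 8/38 ~ 0.21: false positives swamp the rare class
recall = tp / (tp + fn)      # 8/10 = 0.8: detection itself is decent
f1 = 2 * precision * recall / (precision + recall)
# f1 ~ 0.33 despite 80% recall
```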
I also have two questions regarding the functioning of DEG:
- Firstly: when I correct the labels I got from inference and then train with them, does DEG pay more attention specifically to the frames whose predictions I corrected, compared to the others?
- Secondly: is there an option somewhere to tell DEG that only one behavior can happen at a time? I know that in most experiments multiple behaviors can occur at the same time, but this is not the case in mine, so I was wondering if I could indicate this somewhere and thereby help the software a bit.
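The sigmoid-vs-softmax suggestion elsewhere in this thread is exactly this mutual-exclusivity question: per-class sigmoids score behaviors independently, while a softmax over the classes forces the probabilities to sum to 1, so exactly one behavior "wins" per frame. A minimal numpy sketch (the logits are made up):

```python
import numpy as np

logits = np.array([2.0, -1.0, 0.5])  # hypothetical per-class scores for one frame

# sigmoid: each class scored independently; probabilities need not sum to 1,
# so several behaviors can be "on" at once
sigmoid = 1 / (1 + np.exp(-logits))

# softmax: a single distribution over the classes; sums to exactly 1,
# which encodes "only one behavior at a time"
softmax = np.exp(logits) / np.exp(logits).sum()
```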
Thank you for any help!