Closed laxmimerit closed 6 years ago
I have found that you have reported accuracy of more than 80% in many cases
I'm not sure which accuracy figures you're referring to — could you point to the exact place? In general, the ESC-50 papers and EARS are separate things; the default model in this repo is very rudimentary.
It seems you have done training, testing, and validation on the same dataset; that's why it is showing 85% accuracy after 100 epochs.
As to the validation setup: it's a pretty standard procedure to validate on different folds of the same dataset. Or do you mean evaluating on exactly the same data used for training?
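For context, ESC-50 ships with a five-fold split, and fold-based validation usually looks roughly like the sketch below. The metadata entries and the `split_by_fold` helper are illustrative, not the repo's actual loader:

```python
# Toy metadata in the (filename, fold, label) shape ESC-50 uses;
# these specific entries are hypothetical.
metadata = [
    ("1-100032-A-0.wav", 1, "dog"),
    ("2-100648-A-14.wav", 2, "chirping_birds"),
    ("3-101296-A-19.wav", 3, "thunderstorm"),
    ("4-102844-A-9.wav", 4, "crow"),
    ("5-103415-A-2.wav", 5, "pig"),
]

def split_by_fold(metadata, val_fold):
    """Train on all folds except val_fold; validate on val_fold only.
    The validation clips are unseen, but come from the same dataset."""
    train = [m for m in metadata if m[1] != val_fold]
    val = [m for m in metadata if m[1] == val_fold]
    return train, val

train, val = split_by_fold(metadata, val_fold=5)
```

This is what "different folds of the same dataset" means: no clip appears in both sets, even though both come from ESC-50.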
One more thing I have noticed: there is a huge jerk in the recorded audio at periodic intervals. Perhaps this model was not tested on unseen data, i.e. data outside the training set.
If by "jerk" you mean out-of-sync issues, then yes, that is very probable. The live recording preview is tricky with the current setup, and it is more of a debug feature than a production one.
Hi, thank you for joining the loop. The initial setup on the Raspberry Pi and on the computer went fine, but the model was not able to classify audio, so I decided to retrain it on the ESC-50 data. I got around 85.6% accuracy after 100 epochs by running train.py as you suggested. After swapping in the new model, it was still not able to classify real audio with any confidence. You have really done a good job, but I am not able to classify real-time audio. The jerk could be coming from the real-time preview; I agree with that. How much accuracy did you get with this model in real-world noise? Did you test on real-time data?
Thanks for posting your work.
One thing definitely worth checking is the discrepancy between the dataset's recording conditions and what you get on your device (recording volume/gain), as these can vary a lot. The networks are trained on standardized features, so you have to make sure that AUDIO_MEAN and AUDIO_STD in config.py correspond more or less to your situation.
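One way to check this is to compare the statistics of a fresh recording against the configured values. This is only a sketch: the AUDIO_MEAN/AUDIO_STD numbers below are placeholders, not the repo's actual values, and `check_recording_stats` is a hypothetical helper:

```python
import numpy as np

# Placeholder numbers standing in for the values in config.py.
AUDIO_MEAN = -6.0   # hypothetical
AUDIO_STD = 12.0    # hypothetical

def check_recording_stats(features, mean=AUDIO_MEAN, std=AUDIO_STD, tol=2.0):
    """Return (mean, std, mismatch) for a recording's features.
    mismatch=True suggests your gain/volume differs a lot from the
    conditions the network was trained on."""
    m, s = float(np.mean(features)), float(np.std(features))
    return m, s, abs(m - mean) > tol or abs(s - std) > tol

# Simulated features that roughly match the configured statistics.
rng = np.random.default_rng(1)
m, s, mismatch = check_recording_stats(rng.normal(-6.0, 12.0, 2000))
```

If `mismatch` comes back True on typical recordings, the config constants likely need adjusting for your setup.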
Okay, I will check it. Do you have any suggestions for automating it according to the environment?
Hey,
I am thinking of modifying the classify method as follows:
x_mean = np.mean(X)
x_std = np.std(X)
X -= x_mean
X /= x_std
What do you think? Will it work, or do you have any other suggestion?
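The proposal above amounts to per-clip standardization. A self-contained sketch, with a small epsilon guard so a silent (constant) clip does not divide by zero:

```python
import numpy as np

def standardize_clip(X, eps=1e-8):
    """Per-clip standardization as proposed above.
    Note this normalizes each clip by its OWN statistics, rather than
    the fixed AUDIO_MEAN/AUDIO_STD the network was trained with, so the
    features it produces may differ from what the model expects."""
    X = np.asarray(X, dtype=np.float64)
    return (X - X.mean()) / (X.std() + eps)

x = standardize_clip([1.0, 2.0, 3.0, 4.0])
# x has mean ~0 and std ~1 after standardization
```

Whether this helps depends on whether the training features were standardized the same way; it is a different operation from applying the dataset-wide constants.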
Proper automation would require a calibration step when first run on new hardware/settings.
The best way with the current codebase would be to manually check what mean/std-dev values you're getting for some typical recording conditions ("normal" sounds, not too loud, not too quiet) and then change the values in config.py accordingly.
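That manual check could be sketched roughly as below. Everything here is hypothetical: `extract_features` stands in for whatever feature pipeline the classifier actually uses (it is not from this repo), and the recordings are simulated in place of real microphone captures:

```python
import numpy as np

def extract_features(audio):
    # Placeholder: real code would compute the same features
    # (e.g. spectrogram frames) used during training.
    return np.asarray(audio, dtype=np.float64)

def calibrate(recordings):
    """Estimate mean/std over a batch of typical recordings;
    the results are candidate values for config.py."""
    feats = np.concatenate([extract_features(r).ravel() for r in recordings])
    return float(feats.mean()), float(feats.std())

# Simulated "typical" recordings standing in for real captures.
rng = np.random.default_rng(0)
recordings = [rng.normal(-6.0, 12.0, size=1000) for _ in range(5)]
mean, std = calibrate(recordings)
print(f"AUDIO_MEAN = {mean:.2f}")
print(f"AUDIO_STD = {std:.2f}")
```

The printed values would then be pasted into config.py by hand; a true first-run calibration step would do the same thing automatically.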
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but feel free to re-open a closed issue if needed.
Hi, I have been testing your code and reading the research papers too. It's nice to get working on this code. I have found that you have reported accuracy of more than 80% in many cases, and I also saw that while retraining the model, but here is the catch: it seems you have done training, testing, and validation on the same dataset, which is why it shows 85% accuracy after 100 epochs. But in the real world, it does no better than trial and error; not even a single classification is right.
It classifies a fan as a cat, a cat as a mouse click, a keyboard as a frog, and so on. Overall, there is no relation between the input and the target classification.
One more thing I have noticed: there is a huge jerk in the recorded audio at periodic intervals. Perhaps this model was not tested on unseen data, i.e. data outside the training set.
I am hoping you will get back to me with a suggestion. Thanks for posting it here; at least now I can start modifying this model myself. Thanks.