hasithsura / Environmental-Sound-Classification

recognition in real time #1

Closed sundar19 closed 4 years ago

sundar19 commented 4 years ago

Is it possible to detect environmental sounds in real time? For example, can I attach a microphone and detect them as they happen? Thanks.

hasithsura commented 4 years ago

Yes, you can use pyaudio to record audio from the microphone at regular intervals. It is better to run one thread that keeps writing real-time audio to a buffer and another thread that reads data from the buffer and passes it to the model for output.
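
A minimal sketch of that two-thread layout, assuming Python's standard queue.Queue as the buffer. The names record, classify, run and open_microphone are hypothetical, and predict stands in for whatever turns a chunk of audio into a label. open_microphone relies on pyaudio's blocking read and is only needed for live input; the rate and chunk size are assumed values.

```python
import queue
import threading
import time


def record(read_chunk, buffer, stop_event):
    """Producer thread: keep reading audio chunks into the shared buffer."""
    while not stop_event.is_set():
        buffer.put(read_chunk())
    buffer.put(None)  # sentinel so the consumer knows recording has ended


def classify(buffer, predict, results):
    """Consumer thread: take chunks from the buffer and run the model on each."""
    while True:
        chunk = buffer.get()  # blocks until a chunk is available
        if chunk is None:
            break
        results.append(predict(chunk))


def run(read_chunk, predict, seconds):
    """Run both threads for roughly `seconds` and return the predictions."""
    buffer = queue.Queue()
    stop_event = threading.Event()
    results = []
    producer = threading.Thread(target=record, args=(read_chunk, buffer, stop_event))
    consumer = threading.Thread(target=classify, args=(buffer, predict, results))
    producer.start()
    consumer.start()
    time.sleep(seconds)
    stop_event.set()
    producer.join()
    consumer.join()
    return results


def open_microphone(rate=44100, frames_per_chunk=4096):
    """Return a read_chunk() callable backed by the default microphone.

    rate and frames_per_chunk are assumed values, not taken from the repository.
    """
    import pyaudio  # imported here so the rest of the sketch runs without it

    audio = pyaudio.PyAudio()
    stream = audio.open(format=pyaudio.paInt16, channels=1, rate=rate,
                        input=True, frames_per_buffer=frames_per_chunk)
    return lambda: stream.read(frames_per_chunk)
```

With a microphone attached, run(open_microphone(), predict, seconds=10) would collect predictions for about ten seconds. If the model is slower than the microphone, the queue keeps growing, so a real application may want a bounded queue or a policy for dropping old chunks.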

sundar19 commented 4 years ago

Is it possible to have something like an XML file, as in OpenCV for example, and reuse it every time? That is, a saved model used for inference after all the training is done? And if I have to pass the data to the network in real time, will it cause any delay?

sundar19 commented 4 years ago

To clarify, what I have understood is:

  1. You suggest defining a separate function that uses pyaudio to read real-time data from the mic into a buffer, and running it as one thread.
  2. Then another function takes the data from that buffer and feeds it to the neural network, running as a second thread. So there is no need for librosa in my case, and I can proceed with the data coming from pyaudio?

hasithsura commented 4 years ago

If you intend to use the model I trained, you can use torch.load to load it for inference, or you can train your own model, save it using torch.save, and load it later for inference. I think ResNet-50 will cause a lot of delay. You can try ResNet-34 or your own custom architecture and see how much delay it causes. pyaudio is only for getting audio data from the microphone. librosa is still needed to compute the spectrogram from the audio, and the spectrogram is then passed to the model for output.
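
As a rough illustration of how pyaudio output could reach librosa: pyaudio returns raw bytes, while librosa works on floating-point arrays. The sketch below assumes a mono stream opened with pyaudio.paInt16. pcm16_to_float and melspectrogram_db are hypothetical helpers, and the spectrogram parameters are placeholder values that would need to match whatever was used during training.

```python
import numpy as np


def pcm16_to_float(raw_bytes):
    """Convert 16-bit little-endian PCM bytes to float32 samples in [-1, 1)."""
    samples = np.frombuffer(raw_bytes, dtype="<i2")
    return samples.astype(np.float32) / 32768.0


def melspectrogram_db(wav, sr, n_fft=2048, hop_length=512, n_mels=128):
    """Mel spectrogram in decibels; parameter values are assumptions."""
    import librosa  # imported here so pcm16_to_float works without librosa

    spec = librosa.feature.melspectrogram(
        y=wav, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels
    )
    return librosa.power_to_db(spec, top_db=80)
```

A chunk read from the microphone would go through pcm16_to_float first and then melspectrogram_db, using the same sample rate the stream was opened with.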

sundar19 commented 4 years ago

OK, I think you used pickle to save the model, right? And I should get the data from pyaudio, compute a spectrogram using librosa, and pass it to the model, if I am right!

hasithsura commented 4 years ago

Yes

sundar19 commented 4 years ago

Thank you so much. Sorry for all the questions, as I am a newbie to deep learning.

hasithsura commented 4 years ago

I have learned something too. I am happy to help.

sundar19 commented 4 years ago

RuntimeError: CUDA out of memory. Tried to allocate 14.00 MiB (GPU 0; 4.00 GiB total capacity; 2.93 GiB already allocated; 5.80 MiB free; 2.94 GiB reserved in total by PyTorch)

Can you please help me with this? How can I fix this error? For now it is enough for me to train on the cough and sneeze classes. Is it possible to train on only those, since my GPU runs out of memory?

sundar19 commented 4 years ago

I have successfully trained the model, but after restarting Jupyter I am unable to reload my trained model from disk using pickle. Can you please help? I get this error: EOFError: Ran out of input

hasithsura commented 4 years ago

I think it might be a problem with the size of the pickle file. What is the size of the pickle file? Can you paste the entire error message?

sundar19 commented 4 years ago

EOFError Traceback (most recent call last)

in
      2 import pickle
      3 with open('indtocat.pkl','rb') as f:
----> 4     indtocat = pickle.load(f)
      5 filename='/content/ESC-50-master/audio/1-116765-A-41.wav'
      6 spec=spec_to_image(get_melspectrogram_db(filename))

EOFError: Ran out of input

This is the error I get, and my indtocat.pkl file size is 0 bytes. Is the file saved correctly? Even if I have to retrain, how can I pickle it and save it?

sundar19 commented 4 years ago

I even tried unpickling and loading, but it did not help.

hasithsura commented 4 years ago

indtocat.pkl should not be 0 bytes in size. It is basically a dictionary that maps softmax output indices to categories. I think you should verify the pickle file and its contents.
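
For reference, a minimal sketch of saving and re-loading such a mapping with pickle. The helper names and the example categories are placeholders. pickle.load raises EOFError: Ran out of input on an empty file, and one common way to end up with a 0-byte file is opening it in 'wb' mode (which truncates it) without a completed pickle.dump, so the loader here checks the size first.

```python
import os
import pickle


def save_mapping(mapping, path):
    """Write the index-to-category dictionary; the with-block closes the file."""
    with open(path, "wb") as f:
        pickle.dump(mapping, f)


def load_mapping(path):
    """Read the dictionary back, failing early with a clear message if empty."""
    if os.path.getsize(path) == 0:
        raise ValueError(f"{path} is empty; write it again with save_mapping()")
    with open(path, "rb") as f:
        return pickle.load(f)
```

Calling save_mapping({0: "coughing", 1: "sneezing"}, "indtocat.pkl") once after building the dictionary, and load_mapping("indtocat.pkl") in later sessions, avoids having to retrain just to recover the mapping.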

sundar19 commented 4 years ago

Thank you so much, now I have learned how to pickle and load the model. Sorry again for asking so many questions! Is there any way to train on only cough and sneeze separately? Thanks!

hasithsura commented 4 years ago

It is fine. I learn from your questions too. You have to change the last dense layer to have two units and the dataset object to have two categories.

sundar19 commented 4 years ago

Something similar to this? train = df[df['category']='coughing','sneezing'] I can't work it out from your code. Can you please point out the output layer?

hasithsura commented 4 years ago

You have to pass num_cats=2 to the model. The dataset CSV should contain the path to each audio file and the category of each file. The number of categories will be picked up by the dataset object.
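
A sketch of how the CSV could be narrowed to two categories with pandas, assuming filename and category columns as in the ESC-50 metadata. The expression df[df['category']='coughing','sneezing'] is not valid Python; isin is the usual way to filter on several values. select_categories and build_index_maps are hypothetical helper names, and the small table is made up for illustration.

```python
import pandas as pd


def select_categories(df, wanted):
    """Keep only the rows whose category is one of `wanted`."""
    return df[df["category"].isin(wanted)].reset_index(drop=True)


def build_index_maps(df):
    """Map each category to an output index and back (sorted for stability)."""
    categories = sorted(df["category"].unique())
    cattoind = {cat: i for i, cat in enumerate(categories)}
    indtocat = {i: cat for cat, i in cattoind.items()}
    return cattoind, indtocat


# Tiny made-up table standing in for the real metadata CSV.
df = pd.DataFrame({
    "filename": ["a.wav", "b.wav", "c.wav", "d.wav"],
    "category": ["dog", "coughing", "sneezing", "coughing"],
})
subset = select_categories(df, ["coughing", "sneezing"])
cattoind, indtocat = build_index_maps(subset)
num_cats = len(cattoind)  # the value to pass to the model
```

Only the output size changes here; the input to the network is still a spectrogram of the same shape.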

sundar19 commented 4 years ago

To confirm, num_cats = 2 means only two outputs, cough and sneeze! How do I split the data between train and valid? For example, 35 of cough and sneeze for the train variable and 15 of cough and sneeze for validation? And do I have to change the number of nodes in the input layer too?

hasithsura commented 4 years ago

For that you can just have two separate CSV files and create dataframes from them, with train and valid as the variable names.
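
One possible way to produce the two CSV files, sketched with pandas. It shuffles the rows and holds out a fixed number of clips per category for validation. split_per_category, write_splits, read_splits, the column names, and the file names train.csv and valid.csv are all assumptions for illustration.

```python
import pandas as pd


def split_per_category(df, n_valid, seed=0):
    """Hold out n_valid randomly chosen rows of every category for validation.

    Assumes df has a unique index and a "category" column.
    """
    shuffled = df.sample(frac=1, random_state=seed)
    valid = shuffled.groupby("category").head(n_valid)
    train = shuffled.drop(valid.index)
    return train, valid


def write_splits(df, n_valid, train_path="train.csv", valid_path="valid.csv"):
    """Split the table and save each part to its own CSV file."""
    train, valid = split_per_category(df, n_valid)
    train.to_csv(train_path, index=False)
    valid.to_csv(valid_path, index=False)


def read_splits(train_path="train.csv", valid_path="valid.csv"):
    """Load the two CSV files back as the train and valid dataframes."""
    return pd.read_csv(train_path), pd.read_csv(valid_path)
```

After write_splits has been run once, train, valid = read_splits() gives the two dataframes to build the dataset objects from.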

sundar19 commented 4 years ago

Thank you so much