sundar19 closed this issue 4 years ago
Yes, you can use pyaudio to record audio from the microphone at regular intervals. It is better to have one thread that keeps writing real-time audio to a buffer and another thread that reads data from the buffer and passes it to the model for output.
Is it not possible to have something like an XML file, as in OpenCV for example, and reuse it every time? That is, a saved model after all the training? If I have to pass the data to the network in real time, will it cause any delay?
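A minimal sketch of the two-thread idea, using only the standard library. The microphone is replaced by a hypothetical `fake_microphone` generator and the model by a hypothetical `dummy_predict` function, so none of these names come from the repository; in a real setup the recorder would read chunks from a pyaudio input stream instead.

```python
import queue
import threading

CHUNK = 1024      # samples per chunk (assumed value)
NUM_CHUNKS = 5    # number of chunks the fake recorder produces


def fake_microphone(num_chunks, chunk_size):
    """Hypothetical stand-in for a pyaudio input stream: yields silent chunks."""
    for _ in range(num_chunks):
        yield [0.0] * chunk_size


def recorder(buffer, source):
    """Producer thread: keeps writing audio chunks to the shared buffer."""
    for chunk in source:
        buffer.put(chunk)
    buffer.put(None)  # sentinel: tells the consumer there is no more audio


def predictor(buffer, predict, results):
    """Consumer thread: takes chunks from the buffer and passes them to the model."""
    while True:
        chunk = buffer.get()
        if chunk is None:
            break
        results.append(predict(chunk))


def dummy_predict(chunk):
    """Placeholder for spectrogram extraction plus model inference."""
    return len(chunk)


def run_pipeline():
    buffer = queue.Queue(maxsize=32)  # bounded, so the recorder cannot run far ahead
    results = []
    producer = threading.Thread(
        target=recorder, args=(buffer, fake_microphone(NUM_CHUNKS, CHUNK))
    )
    consumer = threading.Thread(
        target=predictor, args=(buffer, dummy_predict, results)
    )
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results


if __name__ == "__main__":
    print(run_pipeline())
```

`queue.Queue` handles the locking between the two threads, so the recorder never has to wait for the model unless the buffer is full.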
To clarify what I have understood is
If you intend to use the model I trained, you can use torch.load to load it for inference. You can also train your own model, save it with torch.save, and load it later for inference. I think ResNet-50 will cause a lot of delay. You can try ResNet-34 or your own custom architecture and see how much delay it causes. pyaudio is just for getting audio data from the microphone. Librosa is still used to compute the spectrogram from the audio, and the spectrogram is then passed to the model for output.
OK, I think you used pickle to save the model, right? And I should get the data from pyaudio, compute a spectrogram using librosa, and pass it to the model, if I am right!
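A rough sketch of the save-then-load round trip with torch.save and torch.load. `TinyNet` is a hypothetical stand-in, not the model from the repository, and the sketch saves the `state_dict` rather than the whole model object, which may differ from how the repository stores its checkpoint.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the trained classifier."""

    def __init__(self, num_cats=2):
        super().__init__()
        self.fc = nn.Linear(8, num_cats)

    def forward(self, x):
        return self.fc(x)


def save_and_reload(model, path):
    # Save only the learned weights, then rebuild the architecture and load them.
    torch.save(model.state_dict(), path)
    restored = TinyNet(num_cats=model.fc.out_features)
    restored.load_state_dict(torch.load(path, map_location="cpu"))
    restored.eval()  # inference mode: disables dropout and batch-norm updates
    return restored


model = TinyNet()
model.eval()
path = os.path.join(tempfile.mkdtemp(), "model.pt")  # illustrative file name
restored = save_and_reload(model, path)

x = torch.randn(1, 8)
with torch.no_grad():
    same = torch.allclose(model(x), restored(x))
```

Loading happens once at startup, so it adds no per-prediction delay; the delay at inference time comes from the forward pass of the chosen architecture.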
Yes
Thank you so much. Sorry for all the questions, as I am a newbie to deep learning.
I have learned something too. I am happy to help.
RuntimeError: CUDA out of memory. Tried to allocate 14.00 MiB (GPU 0; 4.00 GiB total capacity; 2.93 GiB already allocated; 5.80 MiB free; 2.94 GiB reserved in total by PyTorch)
Can you please help me with this? How do I fix this error? It is enough for me if I can train on the cough and sneeze datasets for now. Is it possible to train only on those, since my GPU runs out of memory?
I have successfully trained the model, but after restarting Jupyter I am unable to reload my trained model from disk using pickle. Can you please help? I get this error: EOFError: Ran out of input
I think it might be a problem with the size of the pickle file. What is its size? Can you paste the entire error message?
EOFError Traceback (most recent call last)
I even tried unpickling and loading, but it did not help.
indtocat.pkl should not be 0 bytes in size. It is basically a dictionary that maps softmax output indices to categories. I think you should verify the pickle file and its contents.
Thank you so much, now I have learned to just pickle and import the model. Sorry again for asking so many questions! Is there any way to separately train only cough and sneeze? Thanks!
It is fine. I learn from your questions too. You have to change the last dense layer to have two units and the dataset object to have two categories.
Something similar to this? train = df[df['category']='coughing','sneezing'] I can't understand it from your code. Can you please point out the output layer?
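A small sketch of how the pickle file could be checked before loading. The mapping contents and the helper names `save_mapping` and `load_mapping` are illustrative assumptions; only the file name indtocat.pkl comes from the thread. It also reproduces the reported error: unpickling a 0-byte file raises `EOFError: Ran out of input`.

```python
import os
import pickle
import tempfile


def save_mapping(mapping, path):
    """Write the index-to-category dictionary; the with-block flushes and closes the file."""
    with open(path, "wb") as f:
        pickle.dump(mapping, f)


def load_mapping(path):
    """Load the dictionary, failing early with a clear message if the file is empty."""
    if os.path.getsize(path) == 0:
        raise ValueError(f"{path} is 0 bytes; re-create it before loading")
    with open(path, "rb") as f:
        return pickle.load(f)


work_dir = tempfile.mkdtemp()

# Normal round trip with an illustrative two-category mapping.
indtocat = {0: "coughing", 1: "sneezing"}
good_path = os.path.join(work_dir, "indtocat.pkl")
save_mapping(indtocat, good_path)
loaded = load_mapping(good_path)

# Reproduce the error from the thread with an empty file.
empty_path = os.path.join(work_dir, "empty.pkl")
open(empty_path, "wb").close()
try:
    with open(empty_path, "rb") as f:
        pickle.load(f)
    error_message = None
except EOFError as exc:
    error_message = str(exc)
```

A file can end up empty if it was opened for writing and the dump never ran or never finished, for example when a notebook cell is interrupted.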
You have to pass num_cats=2 to the model. The dataset CSV should contain the paths to the audio files and the category of each file. The number of categories will be picked up by the dataset object.
To confirm, num_cats = 2 means only two outputs, cough and sneeze! How do I split between train and valid? For example, 35 of cough and sneeze for the train variable and 15 of cough and sneeze for validation! Then do I have to change the number of nodes in the input layer too?
For that you can just have two separate CSV files and create dataframes from them, with train and valid as references.
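The filter attempted in the question uses `=` inside the brackets, which is not valid Python. A minimal sketch of one way to keep only two categories with pandas follows; the column names and rows are made up for illustration and may not match the dataset CSV in the repository.

```python
import pandas as pd

# Illustrative rows; the real CSV would list audio file paths and their categories.
df = pd.DataFrame(
    {
        "filename": ["a.wav", "b.wav", "c.wav", "d.wav"],
        "category": ["coughing", "sneezing", "rain", "coughing"],
    }
)

wanted = ["coughing", "sneezing"]

# isin() builds a boolean mask that is True for rows in either category.
subset = df[df["category"].isin(wanted)].reset_index(drop=True)

# This count is what would be passed as num_cats to the model.
num_cats = subset["category"].nunique()
```

With the CSV reduced to two categories, num_cats is 2 and the input side of the network does not need to change, because the spectrogram shape is the same for every category.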
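One possible way to produce the two CSV files, sketched with pandas under the assumption of 50 files per category split 35/15 as in the question. The file names, column names, and the helper `split_per_category` are hypothetical.

```python
import os
import tempfile

import pandas as pd


def split_per_category(df, n_train, category_col="category"):
    """Take the first n_train rows of each category for training, the rest for validation."""
    train = df.groupby(category_col).head(n_train)
    valid = df.drop(train.index)
    return train.reset_index(drop=True), valid.reset_index(drop=True)


# Illustrative dataset: 50 files for each of the two categories.
rows = [
    {"filename": f"{cat}_{i}.wav", "category": cat}
    for cat in ("coughing", "sneezing")
    for i in range(50)
]
df = pd.DataFrame(rows)

train_df, valid_df = split_per_category(df, n_train=35)

# Write the two separate CSV files, then read them back as the train and valid dataframes.
out_dir = tempfile.mkdtemp()
train_path = os.path.join(out_dir, "train.csv")
valid_path = os.path.join(out_dir, "valid.csv")
train_df.to_csv(train_path, index=False)
valid_df.to_csv(valid_path, index=False)

train = pd.read_csv(train_path)
valid = pd.read_csv(valid_path)
```

Shuffling the rows before the split, for example with `df.sample(frac=1, random_state=0)`, avoids always putting the same files in validation.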
Thank you so much
Is it not possible to detect environmental sounds in real time? For example, can I attach a microphone to detect them in real time? Thanks.