chaosparrot / parrot.py

Computer interaction using audio and speech recognition
MIT License

Added microphone separator setting #12

Closed AndreasArvidsson closed 2 years ago

AndreasArvidsson commented 2 years ago

With the MICROPHONE_SEPARATOR setting, the user can record the same noise under several names, for example one per microphone. Recordings whose names share the same prefix before the separator are combined into a single category during training.

e.g. MICROPHONE_SEPARATOR = "--"

pop--dpa => pop
pop--akg => pop
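
A minimal sketch of the grouping described above, assuming recording folders are named `<noise><separator><microphone>`. The helper `group_by_prefix` is a hypothetical name used for illustration and is not taken from the PR:

```python
from collections import defaultdict

MICROPHONE_SEPARATOR = "--"  # example value from the description above


def group_by_prefix(folder_names, separator=MICROPHONE_SEPARATOR):
    """Map each category prefix to the recording folders that share it.

    Hypothetical helper: it only illustrates the naming rule.
    """
    groups = defaultdict(list)
    for name in folder_names:
        # With no separator configured, every folder is its own category.
        category = name.split(separator)[0] if separator else name
        groups[category].append(name)
    return dict(groups)


groups = group_by_prefix(["pop--dpa", "pop--akg", "hiss--dpa"])
print(groups)
```

The guard on an empty separator matters because `str.split("")` raises a `ValueError` in Python.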

Selecting categories to train on... ( [Y]es / [N]o / [S]kip )
 - background_dpa
 - cluck_akg
 - cluck_dpa
 - cluck_mod
 - hiss_akg
 - hiss_dpa
 - hiss_mod
 - pop_akg
 - pop_dpa
 - pop_mod
 - shush_akg
 - shush_dpa
 - shush_mod
 - speech_dpa
Loaded 5093 .wav files for category background (id: 62892429.0)
Loaded 12592 .wav files for category cluck (id: 56447280.0)
Loaded 12715 .wav files for category hiss (id: 24464102.0)
Loaded 12568 .wav files for category pop (id: 65491759.0)
Loaded 12583 .wav files for category shush (id: 45231884.0)
Loaded 5235 .wav files for category speech (id: 90180910.0)
--------------------------
Learning the data...

chaosparrot commented 2 years ago

Tested this out on Windows 10. Nothing in the change looks operating-system specific, so I didn't test on Linux.

Recording works well. Training works on Random Forest, so it should also work on the other SKLEARN models, and Audionet training works too. The one area where I found a discrepancy is analyzing the data ( A -> M menu in the settings flow ).

So I replaced line 304 through line 314 in lib/test_data.py with this:

    print( "Analysing..." )
    true_wav_file_labels = []
    predicted_wav_file_labels = []
    for index, sound in enumerate(available_sounds):
        if( sound in classifier.classes_ or ( MICROPHONE_SEPARATOR and sound.split( MICROPHONE_SEPARATOR )[0] in classifier.classes_ ) ):
            # First sort the wav files by time
            recordings_dir = os.path.join(RECORDINGS_FOLDER, sound )
            wav_files = os.listdir(recordings_dir)
            full_wav_files = []

            print( "----- " + str(sound) + " -----" )
            if MICROPHONE_SEPARATOR:
                sound = sound.split( MICROPHONE_SEPARATOR )[0]

If you would be so kind as to add this to the PR, I will merge it :)
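
For reference, the membership check in the snippet above can be isolated into a small helper, sketched here. `category_for` is a hypothetical name, and the plain list stands in for `classifier.classes_`:

```python
def category_for(sound, known_classes, separator):
    """Return the trained category for a recording folder, or None.

    Hypothetical helper mirroring the condition in the snippet above:
    a folder qualifies if its full name or its prefix is a known class.
    """
    # An empty separator means the setting is off, so the name is used as-is.
    prefix = sound.split(separator)[0] if separator else sound
    if sound in known_classes or prefix in known_classes:
        return prefix
    return None


classes = ["cluck", "pop"]  # stand-in for classifier.classes_
print(category_for("pop--dpa", classes, "--"))
print(category_for("hiss--dpa", classes, "--"))
```

Returning the prefix rather than the folder name keeps the true label comparable with the labels the classifier predicts.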

AndreasArvidsson commented 2 years ago

Updated test data as requested. Please try it out to make sure everything works before merging :)

chaosparrot commented 2 years ago

Tested it out, seems to work just fine :)