groakat / AudioTagger


Changing the way how to save annotations #37

Closed groakat closed 10 years ago

groakat commented 10 years ago

If we are going to play around with the parameters for generating the spectrogram (for #33), we will need to store those parameters in the label files. Otherwise we will not be able to trace back how the pixel values of the rectangles map to time and frequency.

To do this I propose to change the current list structure for each label in the json files into dictionary structures. I.e. rather than having something like

[[2025.0, 164.0, 27.0, 45.0], "bat"]

I would like to have (incorporating the current values for generating the spectrogram):

{'rect': [2025.0, 164.0, 27.0, 45.0],
 'class': "bat",
 'spectogram-nstep': 0.01,
 'spectogram-nwin': 0.03}

This also allows us to add more information later, while scripts that are already written would keep working on files that contain the extra information.
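
A minimal sketch of the proposed structure, assuming the old files hold a JSON list of [rect, class] pairs. The helper name upgrade_label and its default parameter values are illustrative only and are not part of AudioTagger:

```python
import json

# Hypothetical helper: wrap an old-style [rect, class] label in the
# proposed dictionary layout. Key names follow the proposal above.
def upgrade_label(old_label, nstep=0.01, nwin=0.03):
    rect, cls = old_label
    return {
        "rect": rect,
        "class": cls,
        "spectogram-nstep": nstep,
        "spectogram-nwin": nwin,
    }

old_labels = json.loads('[[[2025.0, 164.0, 27.0, 45.0], "bat"]]')
new_labels = [upgrade_label(label) for label in old_labels]

# A reader that only looks up the keys it knows keeps working
# when further keys are added later.
for label in new_labels:
    print(label["class"], label["rect"])
```

Because a script looks values up by key rather than by position, adding a further key later does not shift anything the script already reads.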

@ali-fairbrass Have you already written scripts that parse the saved labels? I could help you change them to work with the new file format.

I will also provide a script that converts old saved files into new ones, so that the old data is not lost and we maintain a pipeline. Please let me know whether you have any doubts about changing how the files are saved.

groakat commented 10 years ago

Maybe I can also pull @macaodha into the discussion. Do you have anything to add?

macaodha commented 10 years ago

I have always found Pandas great for reading and writing comma-delimited files. Maybe it would be easier to have each annotation entry as a separate line containing all the necessary info. This way it might also be easier to import it into R, e.g.

class, x1, x2, y1, y2, spectrogram-nstep, spectrogram-nwin
bat, 1, 2, 3, 4, 512, 1024
...
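
A small sketch of reading such a one-annotation-per-line file. It uses only the standard library csv module so that it runs without extra dependencies; pandas.read_csv would load the same layout into a DataFrame. The data is the illustrative row from above, not real label data:

```python
import csv
import io

# Illustrative data only, mirroring the layout suggested above.
text = """class,x1,x2,y1,y2,spectrogram-nstep,spectrogram-nwin
bat,1,2,3,4,512,1024
"""

# DictReader uses the header line as keys, so columns are read by name.
rows = list(csv.DictReader(io.StringIO(text)))
for row in rows:
    print(row["class"], float(row["x1"]), float(row["x2"]))
```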

But I trust your wisdom, O


ali-fairbrass commented 10 years ago

Hey Peter and Oisin,

I'm happy to do whatever you think is best.

My only reservation, as Peter predicted, is that I already have written code which parses the label data in json files to csv files: https://github.com/ali-fairbrass/dataProcessing/blob/master/jsonTocsv.py .

I'm much less familiar with dictionaries, so if you're Ok to help me adapt my code to the dictionary I'd be really grateful.

groakat commented 10 years ago

If you are working on CSV anyway, I think the best option is to save the labels in CSV. That will hopefully make your life a bit easier from there.

I had a look at your conversion file. It already seems to do some feature extraction. If it helps you, I can generate the CSV straight away.

So rather than saving the corners of the bounding box, I could save the startTime, endTime, minFrequency and maxFrequency.

Because they are cheap to compute, I could also save the other features you compute now as well.

I propose the following fields:

(wav)Filename, Label, LabelTimeStamp, Spec_NStep, Spec_NWin, Spec_x1, Spec_y1, Spec_x2, Spec_y2, LabelStartTime_Seconds, LabelEndTime_Seconds, MinimumFreq_Hz, MaximumFreq_Hz, MaxAmp, MinAmp, MeanAmp, AmpSD, LabelArea_DataPoints

Compared to your fields, the following are missing: SiteCode-EquipmentCode, Data, RecordingStartTime

But they are specific to your project. I think it would make sense to insert these fields in a post-processing step, with a simple script like the one you already have.

There is a bit of redundancy in saving both the rectangle coordinates on the spectrogram and the definition of the bounding box in frequency/time. But since the rectangle coordinates are integers and you might need the spectrogram to compute some more complex features on it later, they might come in handy. Otherwise you would have to convert the bounding box back from seconds/Hz into pixels for later feature extraction.
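
A sketch of the pixel to seconds/Hz mapping discussed here. It is not taken from the AudioTagger code: it assumes that x maps to time through the spectrogram step size and that y maps linearly from 0 Hz to the Nyquist frequency over the image height. The function name and the input values are made up for illustration:

```python
# Sketch only: assumes x * nstep gives seconds and that spectrogram rows
# are spaced evenly between 0 Hz and half the sampling rate.
def rect_to_time_freq(x1, y1, x2, y2, nstep, sampling_rate, spec_height):
    hz_per_row = (sampling_rate / 2.0) / spec_height
    return {
        "start_s": x1 * nstep,
        "end_s": x2 * nstep,
        "min_hz": y1 * hz_per_row,
        "max_hz": y2 * hz_per_row,
    }

# Hypothetical rectangle and spectrogram settings.
box = rect_to_time_freq(100, 10, 200, 20,
                        nstep=0.01, sampling_rate=24000, spec_height=360)
print(box)
```

Going the other way (seconds/Hz back to pixels) means dividing by the same factors and rounding, which is where saving the integer pixel coordinates avoids rounding ambiguity.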


groakat commented 10 years ago

While working on the CSV export, I noticed that your script does not seem to have converted all json files in my test folder to CSV. From the code it looks as if it should work, so I am not sure where the problem is, and I do not have time to test it now. But check your results anyway.

Another thing I noticed is that you hard-code the sampling rate of the sound file and the SpecRows. I read the SpecRows from the height of the spectrogram that was created from the sound file, and you can get the sampling rate with scipy.io.wavfile.read('file.wav')[0].
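
As a dependency-free alternative to the scipy call, the sampling rate can also be read from the WAV header with the standard library wave module (which only handles uncompressed PCM files). The helper name sampling_rate_of and the throwaway demo file are illustrative assumptions:

```python
import os
import tempfile
import wave

# Read the sampling rate from the WAV header instead of hard-coding it.
def sampling_rate_of(wav_path):
    with wave.open(wav_path, "rb") as wav:
        return wav.getframerate()

# Demo on a one-second silent file so the sketch runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)        # 16-bit samples
    out.setframerate(44100)
    out.writeframes(b"\x00\x00" * 44100)

print(sampling_rate_of(path))  # prints 44100
```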

Anyway, I am now able to export the labels to CSV. A CSV that I create now looks like this:

Filename,Label,LabelTimeStamp,Spec_NStep,Spec_NWin,Spec_x1,Spec_y1,Spec_x2,Spec_y2,LabelStartTime_Seconds,LabelEndTime_Seconds,MinimumFreq_Hz,MaximumFreq_Hz,MaxAmp,MinAmp,MeanAmp,AmpSD,LabelArea_DataPoints
HA86RB-13527_20130727_050000 Part 02 of 30.wav,bat,2014-08-21T00:22:10.402955,0.01,0.03,2747.0,131.0,2828.0,246.0,27.47,28.28,4323.0,8118.0,7.5673327786221485,2.3334119771490704,6.1504382045784372,0.64522362349205464,9315.0
HA86RB-13527_20130727_050000 Part 02 of 30.wav,bird,2014-08-21T00:22:10.403685,0.01,0.03,3040.0,76.0,3134.0,175.0,30.400000000000002,31.34,2508.0,5775.0,7.6160757736440328,1.836631059426014,6.1122468400592771,0.65067451430407131,9306.0
HA86RB-13527_20130727_050000 Part 02 of 30.wav,bird,2014-08-21T00:22:10.404279,0.01,0.03,2906.0,158.0,2994.0,230.0,29.060000000000002,29.94,5214.0,7590.0,7.5826063034555045,2.2299628567621785,6.1692762841461191,0.64353785729189483,6336.0
HA86RB-13527_20130727_050000 Part 02 of 30.wav,plane,2014-08-21T00:22:10.404916,0.01,0.03,3224.0,161.0,3302.0,275.0,32.24,33.02,5313.0,9075.0,7.6533440520542122,2.1114714980540539,6.1885222247406775,0.6429450018565569,8892.0

The values it generates are identical to the ones your script generates for the same labels from json files, except for the Freq_Hz fields, where I think I used a more accurate sampling rate and/or a more accurate spectrogram height (or I have a bug in my code).
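
A comparison like this can be automated. The sketch below lists the columns whose values differ between two CSV exports, comparing numbers with a tolerance and everything else as text. The function name and the two example rows are made up and are not real output from either script:

```python
import csv
import io

def differing_columns(csv_a, csv_b, tol=1e-6):
    """Return the names of columns whose values differ between two CSV texts."""
    rows_a = list(csv.DictReader(io.StringIO(csv_a)))
    rows_b = list(csv.DictReader(io.StringIO(csv_b)))
    differing = set()
    for a, b in zip(rows_a, rows_b):
        for name in a:
            if name not in b:
                differing.add(name)
                continue
            try:
                same = abs(float(a[name]) - float(b[name])) <= tol
            except (TypeError, ValueError):
                same = a[name] == b[name]   # non-numeric: compare as text
            if not same:
                differing.add(name)
    return sorted(differing)

# Made-up example rows for illustration.
old = "Label,MinimumFreq_Hz,MeanAmp\nbat,4366.0,6.15\n"
new = "Label,MinimumFreq_Hz,MeanAmp\nbat,4323.0,6.15\n"
print(differing_columns(old, new))
```

The sketch pairs rows by position, so it assumes both files list the labels in the same order.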

I will write the import routine tomorrow and push the changes so that you can start using the ultrasonic mode in spectrograms. A script to convert the old json files into new CSV files will follow next week.

ali-fairbrass commented 10 years ago

That all sounds wonderful, thanks very much Peter.

I'll wait for you to tell me when to pull the changes.

Thanks.

groakat commented 10 years ago

I changed the audioTagger to only work on .csv files. It can save and load csv files, but not the old json files anymore.

@ali-fairbrass To convert your old json files into csv files, have a look at the new converter.py. At the end there is an example of how to use it, but basically you just do

import AudioTagger.converter as convert
convert.convertJSON2CSV('oldjsonfile.json', 'wavfile.wav')

You now just need to call it in a loop over your old json files. It will save the new csv in the same folder as your json files. I would advise you not to delete your old json files, in case something is broken in the converter and we only notice later.

ali-fairbrass commented 10 years ago

Hi Peter,

That looks wonderful, thank you.

Before I pull I have a few questions.

In line 61 of converter.py, what does the value 360 refer to? Is this the number of rows in the 24kHz spectrograms? If so should I change this value to 660 when converting the 44.1kHz labels?

I'm not sure I understand what you mean with the second line in the above code. Do I point to the folders containing the json and wav files? And should I be changing the final lines in the converter.py file to my own filepaths? Sorry I think this is a stupid question, I just don't understand.

groakat commented 10 years ago

The value 360 refers to the height of the spectrogram. I changed that line so that the height is now read automatically from the spectrogram itself.

The second line converts only a single json label file. If you want to convert an entire folder, you need to loop over the files in your folder. This is because you will need to provide the path to the original wav file for each json file as well. I thought it would be easier if you provide the json-wav pair rather than me trying to guess.
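
One possible shape for such a loop, as a sketch. It assumes each json file has a wav file with the same base name in a second folder, which may not match your actual naming. convert_folder is a hypothetical helper, and the converter is passed in as a function so that the pairing logic can be tried on its own:

```python
import glob
import os

def convert_folder(json_folder, wav_folder, convert_fn):
    """Call convert_fn(json_path, wav_path) for every json file that has a
    wav file of the same base name; return the json files left unpaired."""
    unpaired = []
    for json_path in sorted(glob.glob(os.path.join(json_folder, "*.json"))):
        base = os.path.splitext(os.path.basename(json_path))[0]
        wav_path = os.path.join(wav_folder, base + ".wav")
        if os.path.exists(wav_path):
            convert_fn(json_path, wav_path)
        else:
            unpaired.append(json_path)   # left for manual pairing
    return unpaired
```

With the converter from above this would be called as convert_folder('path/to/json', 'path/to/wav', convert.convertJSON2CSV), where both folder paths are placeholders. The returned list shows which json files still need a wav file assigned by hand.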