gprMax / gprMax

gprMax is open source software that simulates electromagnetic wave propagation using the Finite-Difference Time-Domain (FDTD) method for numerical modelling of Ground Penetrating Radar (GPR).
https://www.gprmax.com
GNU General Public License v3.0

Taguchi optimisation - xcorr #62

Closed: mzunguman closed this issue 6 years ago

mzunguman commented 8 years ago

@craig-warren,

I have a question regarding the cross-correlation fitness metric in the Taguchi optimisation process. I'm working with it at the moment and I have trouble entering correct 'stop' values. Since the absolute amplitudes are almost arbitrary anyway, it makes more sense to me to use a normalized cross-correlation; otherwise one needs to know exactly what the stopping criterion should be. With a normalized xcorr I just need to enter a value between -1 and 1, in my case ~0.95. I updated my local copy of the repository with a few lines to enable this and checked outside of gprMax that the Python code works. When I manually calculate the normalized xcorr with the same reference file and an output file from gprMax, I get a sensible value of ~0.8. However, for the same output file, gprMax previously gave me a fitness value above 1. Do you have any idea why? I've tried my best but can't find an answer. If you're interested in my normalized cross-correlation addition, I'd be happy to send it.

Cheers, Sam
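
[Editor's note] The bounded fitness metric described above can be sketched in a few lines. This is a minimal illustration of a normalised cross-correlation, not the exact code from the thread; the helper name and the test signal are invented. Standardising both signals and dividing one of them by the length bounds the correlation peak to [-1, 1]:

```python
import numpy as np

def normalised_xcorr_max(a, b):
    """Peak of the normalised cross-correlation of two equal-length signals.

    Subtracting the mean and dividing by the standard deviation (with one
    signal additionally divided by the length) bounds the result to
    [-1, 1], so a stopping criterion such as 0.95 can be used directly.
    """
    a = (a - np.mean(a)) / (np.std(a) * len(a))
    b = (b - np.mean(b)) / np.std(b)
    return np.amax(np.correlate(a, b, mode='full'))

# A signal correlated with a scaled copy of itself peaks at exactly 1,
# because the normalisation removes the amplitude difference
t = np.linspace(0, 1, 200)
s = np.sin(2 * np.pi * 5 * t)
print(round(normalised_xcorr_max(s, 3 * s), 3))  # → 1.0
```

Because the peak can never exceed 1, a stopping criterion such as 0.95 is meaningful regardless of the absolute amplitudes of the two responses.
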

craig-warren commented 8 years ago

@mzunguman if you can show me your normalized cross-correlation code, maybe that will help find where the problem is occurring.

mzunguman commented 8 years ago

@craig-warren

Alright, here it goes. This belongs in the xcorr part of the "fitness_functions.py" file. Please excuse the rough coding. I commented out your part and added mine, and also added the last part for the normalization.

I think there may be an error in the part that shortens the longer signal when making both responses the same length in time. I could have missed something, but my code seems to work for now.

I was also looking again at the part starting on line 142 of "optimisation_taguchi.py" in case there's an issue there, but it seems the fitness value is not changed in that file.

    # Make both responses the same length in time
    if reftime[-1] > modeltime[-1]:
        endtime = modeltime[-1]
        # Find the index in reftime of the time closest to endtime
        a = np.amax(np.where(reftime - endtime + reftime[1] <= reftime[1] / 2))
        reftime = reftime[0:a + 1]
        refresp = refresp[0:a + 1]
        # Original approach, commented out:
        # reftime = np.arange(0, f.attrs['dt'] * f.attrs['Iterations'], reftime[-1] / len(reftime))
        # refresp = refresp[0:len(reftime)]
    elif modeltime[-1] > reftime[-1]:
        endtime = reftime[-1]
        # Find the index in modeltime of the time closest to endtime
        a = np.amax(np.where(modeltime - endtime + modeltime[1] <= modeltime[1] / 2))
        modeltime = modeltime[0:a + 1]
        modelresp = modelresp[0:a + 1]
        # Original approach, commented out:
        # modeltime = np.arange(0, reftime[-1], f.attrs['dt'])
        # modelresp = modelresp[0:len(modeltime)]

    # Downsample the response with the higher sampling rate
    if len(modeltime) < len(reftime):
        refresp = signal.resample(refresp, len(modelresp))
    elif len(reftime) < len(modeltime):
        modelresp = signal.resample(modelresp, len(refresp))

    # Standardise both responses for the normalized cross-correlation:
    # subtract the mean, divide by the standard deviation, and divide one
    # signal by its length so the correlation peak is bounded by 1
    refresp = (refresp - np.mean(refresp)) / (np.std(refresp) * len(refresp))
    modelresp = (modelresp - np.mean(modelresp)) / np.std(modelresp)
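
[Editor's note] The truncate-then-resample step above can be exercised in isolation. A small sketch, using invented signals rather than gprMax data, showing how `scipy.signal.resample` brings the higher-rate response down to the length of the other:

```python
import numpy as np
from scipy import signal

# Two responses covering the same 10 ns window at different sampling rates
t_hi = np.linspace(0, 1e-8, 400, endpoint=False)  # "model": 400 samples
t_lo = np.linspace(0, 1e-8, 100, endpoint=False)  # "reference": 100 samples
model = np.sin(2 * np.pi * 1e9 * t_hi)
ref = np.sin(2 * np.pi * 1e9 * t_lo)

# Downsample the response with the higher sampling rate, mirroring the
# len(modeltime)/len(reftime) comparison in the fitness function
if len(t_lo) < len(t_hi):
    model = signal.resample(model, len(ref))

print(len(model), len(ref))  # → 100 100
```

After this step both arrays have equal length, which the subsequent standardisation and `signal.correlate` call rely on.
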

craig-warren commented 8 years ago

@mzunguman I can't see an error in the original code for shortening the signals, and you are correct the fitness value is not affected by the optimisation_taguchi.py module.

After you have prepared the data for a normalised xcorr, are you then executing the following?

    xcorr = signal.correlate(refresp, modelresp)
    xcorrmax = np.amax(xcorr)

I'm not sure why there was a / 100 in the xcorrmax line.

mzunguman commented 8 years ago

Yes, those lines execute, of course. I also wasn't sure about the / 100, so I erased that too. The strange thing, as I said, is that running the xcorr code snippet manually outside of gprMax, with the same reference file and the "optimised" output file from gprMax, gives me a different and more sensible correlation value. I only altered what I previously described to you.

craig-warren commented 8 years ago

Have you tried plotting and printing the xcorr value from within the function?

mzunguman commented 8 years ago

Sorry about the late reply. Yes, I'm doing that at the moment. I'll get back to you when I know more.

mzunguman commented 8 years ago

@craig-warren I found my error: the reference file was simply flipped. Sorry about the fuss. Maybe you'll find the normalized cross-correlation code useful, though. Other users might not make this mistake, but perhaps you could add a remark in the docs that the reference file must be in the same units as the modelled output (e.g. Volts from real measurements, or Ey from gprMax receivers).
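
[Editor's note] If "flipped" here means time-reversed, the effect on the fitness value is easy to reproduce. A small sketch; the helper function and the toy pulse are illustrative, not from gprMax:

```python
import numpy as np

def normalised_xcorr_max(a, b):
    """Peak of the normalised cross-correlation, bounded to [-1, 1]."""
    a = (a - np.mean(a)) / (np.std(a) * len(a))
    b = (b - np.mean(b)) / np.std(b)
    return np.amax(np.correlate(a, b, mode='full'))

# An asymmetric pulse correlates perfectly with itself, but much more
# weakly with its time-reversed copy
pulse = np.array([0., 1., 2., 3., 4.])
print(round(normalised_xcorr_max(pulse, pulse), 3))        # → 1.0
print(round(normalised_xcorr_max(pulse, pulse[::-1]), 3))  # → 0.4
```

A reversed reference therefore makes an otherwise good model look like a poor fit, which matches the inconsistent fitness values reported above.
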

craig-warren commented 8 years ago

@mzunguman OK, glad you found your issue. I'm doing some more investigation of the cross-correlation anyway, as I am not satisfied that the current implementation is robust enough.

mzunguman commented 8 years ago

@craig-warren I've done some more testing and research. I think the normalization approach is correct: I tried one or two other ways of calculating the normalized xcorr and I always get the same result as from my modified gprMax.

I have a further question. The image below shows the result of an optimisation with the same code I sent you. The xcorrmax value is ~0.975, which is quite high. Just looking at the curves, though, the correlation doesn't seem that high to me. This is only a visual inspection, but I have looked at curves with similar xcorr values and they "look" much more similar than mine. I don't necessarily need a better fit, but I would expect the xcorr value to be lower. Is this plausible to you?

[image: signal_similarity]
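
[Editor's note] One possible explanation for the question above: a normalised cross-correlation is insensitive to amplitude scaling and to a constant offset, so two traces can look quite different on a plot yet still score close to 1. A small sketch using the zero-lag normalised correlation (Pearson r); the signals are invented for illustration:

```python
import numpy as np

t = np.linspace(0, 1, 500)
ref = np.sin(2 * np.pi * 3 * t) * np.exp(-3 * t)

# A visually different trace: half the amplitude, a DC offset, some noise
model = 0.5 * ref + 0.2 + 0.02 * np.random.default_rng(0).normal(size=t.size)

# Zero-lag normalised correlation (Pearson r): amplitude scaling and the
# mean offset are removed by the normalisation, so the value stays high
r = np.corrcoef(ref, model)[0, 1]
print(round(r, 3))  # close to 1 despite the visual mismatch
```

Here the model trace would be obviously different from the reference on a plot, but the normalisation removes both the scale and the offset, so a value like 0.975 alongside visibly mismatched curves is plausible.
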