NSGeophysics / GPRPy

Ground Penetrating Radar processing and visualization software for python
MIT License
174 stars 85 forks source link

GSSI data issues #3

Closed iannesbitt closed 5 years ago

iannesbitt commented 5 years ago

Hi Alain,

I tried to open some of my own GSSI data (example file TEST__001.DZT attached) with GPRPy and ran into some trouble. My own import function reads it just fine.

I'm relatively sure that your import script thinks the header of my DZT is 8 bytes larger than it really is. After some head scratching, I was able to at least rewrite gprIO_DZT.py to at least be able to reshape the array correctly. But it seems to still get the datatype wrong (or perhaps there's still a byteshift somewhere), so the profile looks kind of weird. I still can't quite figure out why this is the case. What I've done to temporarily lazily patch it is rewrite gprpy.importdata to use a function I wrote called readgssi.dzt.readdzt_gprpy that loads DZT header data and array in the same format as gprpy.toolbox.gprIO_DZT does.

Line 17 in gprpy.py:

from readgssi import dzt

Lines 87 and 88:

#self.data, self.info = gprIO_DZT.readdzt(filename)
self.data, self.info = dzt.readdzt_gprpy(filename)

However since my file did not use a DMI (this was a lake profile so it uses scans per second instead), it has no distance data attached to it. So I've also rewritten line 90 to handle a ZeroDivisionError in case scans per meter is zero in order to calculate profile position:

try:
    self.profilePos = self.info["startposition"]+np.linspace(0.0,
                                                             self.data.shape[1]/self.info["scpmeter"],
                                                             self.data.shape[1])
except ZeroDivisionError:
    self.profilePos = np.linspace(0.0, self.data.shape[1]/self.info["scpsec"],
                                  self.data.shape[1])

This seems to work reasonably well but since I haven't changed anything in the gui it still displays the plot position in meters.

I have the changes in a fork and will open a pull request if you'd like. Otherwise, take a look at how gprpy and readgssi import DZTs and see if you can figure out how they read the array differently.

Best, Ian

iannesbitt commented 5 years ago

P.S. The fork is here: https://github.com/iannesbitt/GPRPy

iannesbitt commented 5 years ago

Also, this file format description GSSI sent me may be of help to you: https://github.com/NSGeophysics/GPRPy/files/2810655/DZT.File.Format.6-14-16.pdf

AlainPlattner commented 5 years ago

Hi Ian,

Thanks so much for pointing this out and for providing a solution! I wonder why the header in your DZT file is different from the headers in the DZT files I had been working with. Have you tested your import with the provided example data (gprpy/exampledata/GSSI/FILE____032.DZT)?

Could it be that perhaps GSSI updated their headers at some point and you have a newer system? In that case I should probably write gprIO_dzt such that it can discern between the two versions.

Do you have an example data file that you are ok sharing that I can use to test? What year was your system manufactured? Or do you have a special version?

Thanks and cheers, Alain

iannesbitt commented 5 years ago

Alain,

From what I understand, GSSI headers have changed many times over the years. The PDF I linked to describes some calculations you can do to determine which type of header a certain file has.

Yes, I can open your DZT with readgssi: file____032_400mhzg1

I'm still not sure if I can open all of the old versions of DZT though. I've tested and been successful with probably three generations of the file structure (including some new ones with more than one radar channel) but I think there are more I haven't come across yet.

If you'd like to test with one of my files, I've provided one here: https://github.com/NSGeophysics/GPRPy/files/2810391/TEST__001.DZT.zip

I love your software and can't wait to use it with my data!

Ian

iannesbitt commented 5 years ago

Here's some more potentially useful info I have kicking around:

GSSI user manual from 2015. Pages 55-57 have similar information to that given in the 2016 DZT File Format pdf linked above: GSSI - SIR-3000 Operation Manual.pdf

A 2002 USGS document which describes the DZT file format from that era (p.85-86): usgs-fileformat.pdf

AlainPlattner commented 5 years ago

Thank so much Ian, It means a lot to me that you like GPRPy and I'm very thankful for your help with improving it. I'll try to figure out how to best work with the different DZT file formats this Saturday. I think with all the work that you have already done this should be solvable within a day or so.

iannesbitt commented 5 years ago

This may be a bigger can of worms than you're willing to open, but I also have a dual-channel example file here if you'd like to test with that as well. Antenna 0 frequency is 800 MHz, and antenna 1 frequency is 300 MHz. Previous versions of my software could read dual channel files but I'm working on fixing a bug in the current version that causes it to crash when reading.

TEST__002.DZT.zip

AlainPlattner commented 5 years ago

I think I solved the original issue. I hadn't implemented extended headers, as it says in the file format documentation that you had attached: // if rh_data < MINHEADSIZE then // offset is MINHEADSIZE rh_data // else offset is MINHEADSIZE rh_nchan I added this distinction and also corrected that while 8 bit and 16 bit samples are unsigned int, 32 bit samples are signed int. The documentation sheet says: "Eight byte and sixteen byte samples are unsigned integers. Thirty-two bit samples are signed integers." (they of course mean bit, not byte).

This should now work. I haven't implemented the automatic detection if profile is distance or recording time as if you know the speed, you can just calculate the length of the profile by hand and then use "adj profile" to set the distance, or if you want to keep it in recording time but want to change the x axis label in a figure, you can export your script and then change the label (see attached script). The data looks as follows: test1

The following script: # Automatically generated by GPRPy import gprpy.gprpy as gp mygpr = gp.gprpyProfile() mygpr.importdata('TEST__001.DZT') #This creates the standard plot (automatically generated by GPRPy) mygpr.printProfile('Test1.pdf', color='bwr', contrast=3, yrng=[0,2000], xrng=[-200,35.125], dpi=300) # This changes the profile axis label import matplotlib.pyplot as plt mygpr.showProfile( color='bwr', contrast=3, yrng=[0,2000], xrng=[-200,35.125]) plt.gca().set_xlabel('recording time [s]') # This if to make a pdf plt.savefig('Test1p.pdf',dpi=300) # This to just view it plt.show()

Creates this figure: test1p

Please let me know if this is what it should be or if there is still something I'm missing.

AlainPlattner commented 5 years ago

About the multichannel: I haven't implemented it but if there is a way that you could build into your readgssi code an option to read multichannel and then export them individually (even just as a binary file with the raw data and an ASCII header file with information), then they could be read into GPRPy afterward.

iannesbitt commented 5 years ago

It works! Thank you!

I will add to my readgssi to-do list a way to export multichannel array data to separate files with the same basename (numpy binary and JSON, for example). Would something like this work?

FILE__001_800MHz.json
FILE__001_800MHz.npy
FILE__001_300MHz.json
FILE__001_300MHz.npy

Alternatively, I could try writing to multiple separate DZTs.

AlainPlattner commented 5 years ago

Awesome! Thanks again for pointing out the issue and providing help with solving it!

For exporting multichannel data as individual files for each channel: I think something like json and npy could work. Or perhaps just a binary file of the raw data per channel and a separate ASCII text file header containing some information with keywords such as "ntraces", "traces per meter", "trace length [ns]". I'll leave it up to you what you find the most intuitive. You could also create DZT files but that could be a bit tricky as you'd need to write the header bits exactly like they do it...