OSOceanAcoustics / echopype

Enabling interoperability and scalability in ocean sonar data analysis
https://echopype.readthedocs.io/
Apache License 2.0
98 stars, 73 forks

Empty filter coeffs in EK80 .raw file #720

Closed martin-wegmann closed 2 years ago

martin-wegmann commented 2 years ago

Hi everyone,

I am super new to the world of fish monitoring and acoustics, so treat me as an absolute amateur please. I need your help regarding an issue reading in EK60 data.

I am using echopype==0.5.6 with Python 3.7.9 and was recently able to read in EK60 example data in .raw format. The example file I used to play around with was 68 MB.

I then switched to a larger EK60 input file, still in .raw format, and got an error message without changing any of the commands.

I know for a fact that the model is an EK60. The file size is 525 MB.

import echopype as ep
from echopype import open_raw
lexfish_folder = "/Volumes/Lexplore/LéXFISH/"
ed = open_raw(lexfish_folder + "D20210727-T000128.raw", sonar_model='EK60')

I got the following message that I didn't get before:

11:15:45 parsing file D20210727-T000128.raw, time of first ping: 2021-Jul-27 00:01:28

KeyError                                  Traceback (most recent call last)
in
----> 1 ed = open_raw(lexfish_folder+"D20210727-T000128.raw", sonar_model='EK60')

~/miniconda2/envs/temprec/lib/python3.7/site-packages/echopype/convert/api.py in open_raw(raw_file, sonar_model, xml_path, convert_params, storage_options)
    428         output_path=None,
    429         sonar_model=sonar_model,
--> 430         params=_set_convert_params(convert_params),
    431     )
    432     # Set up echodata object

~/miniconda2/envs/temprec/lib/python3.7/site-packages/echopype/convert/set_groups_ek60.py in __init__(self, *args, **kwargs)
     28         self.old_ping_time = None
     29         # correct duplicate ping_time
---> 30         for ch in self.parser_obj.config_datagram["transceivers"].keys():
     31             ping_time = self.parser_obj.ping_time[ch]
     32             _, unique_idx = np.unique(ping_time, return_index=True)

KeyError: 'transceivers'

Super thankful for any help regarding this issue.

Cheers, Martin

leewujung commented 2 years ago

@martin-wegmann : Thanks for reporting this -- this is interesting, we have not seen a file with this field being empty. The file size should not be the problem; rather, the transceivers field is somehow missing from the configuration datagram. This does mean that we should make the code more robust to this type of omission. Give us a week or so to look into this. Could you check whether there is a smaller file that produces a similar error and that you could share with us?
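To illustrate the kind of robustness mentioned above, here is a minimal sketch of a guarded lookup that replaces the bare KeyError with a more descriptive message. The helper name get_transceivers and the sample dictionaries are hypothetical and are not part of echopype; only the "transceivers" key comes from the traceback.

```python
def get_transceivers(config_datagram):
    """Return the 'transceivers' mapping, or raise a descriptive error."""
    try:
        return config_datagram["transceivers"]
    except KeyError:
        # Report which fields are present so a model mismatch is easier to spot.
        raise ValueError(
            "Configuration datagram has no 'transceivers' field; "
            "the file may not match the requested sonar_model. "
            f"Fields present: {sorted(config_datagram)}"
        ) from None


# Hypothetical stand-ins for parsed configuration datagrams.
ek60_like = {"transceivers": {1: {"frequency": 38000}}}
other = {"configuration": {}}

print(list(get_transceivers(ek60_like)))
try:
    get_transceivers(other)
except ValueError as err:
    print(err)
```

With a message like this, a file parsed with the wrong sonar_model would point toward the mismatch rather than toward the file size.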

martin-wegmann commented 2 years ago

Thanks for the quick update. I appreciate it. Unfortunately, all the files in our 3 TB database are that size. Is there a good open-source way to cut .raw files into shorter time segments? In the meantime, I have asked the author of the data to provide me with a 5 MB file. He has the commercial software to do so.

I will get back to you ASAP.

cornejotux commented 2 years ago

@martin-wegmann you can use the software EK80 or maybe EK60 to re-play the file and save/record it into a new file. You just record a few seconds and your final file will be very small.

martin-wegmann commented 2 years ago

Hi everyone,

so I got the following additional information from the author of the data:

I attached a smaller example file from the same sample pool.

Thanks for your help, I appreciate it greatly!

example_EK60.zip

leewujung commented 2 years ago

@martin-wegmann : Could you give #724 a try? Your file is indeed from EK80 and there's a small bug that I just fixed when saving the data into appropriate netCDF groups (so I changed the title of this issue). Thanks!

@emiliom @b-reyes : We should probably implement the mechanism for echopype to distinguish EK60 and EK80 files automatically #494 based on the presence of XML0 or CON0 datagrams to make it easier to debug the actual error in file conversion.
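A minimal sketch of the idea in the comment above: guess the sonar model from the type of the first datagram in the file. It assumes that a .raw file starts with a 4-byte little-endian length followed by a 4-byte ASCII datagram type, and that looking at the first datagram is enough. The function names are hypothetical and this is not echopype's implementation.

```python
import struct


def sonar_model_from_header(header):
    """Guess 'EK60' or 'EK80' from the first 8 bytes of a .raw file."""
    if len(header) < 8:
        raise ValueError("need at least 8 bytes to read a datagram header")
    # Assumed layout: little-endian int32 length, then 4-byte ASCII type.
    _length, dgram_type = struct.unpack("<l4s", header[:8])
    if dgram_type == b"CON0":
        return "EK60"
    if dgram_type == b"XML0":
        return "EK80"
    raise ValueError(f"unrecognised first datagram type: {dgram_type!r}")


def sniff_sonar_model(path):
    """Read the first datagram header from a file and guess the model."""
    with open(path, "rb") as f:
        return sonar_model_from_header(f.read(8))


# Synthetic headers for illustration; the lengths are arbitrary.
print(sonar_model_from_header(struct.pack("<l4s", 848, b"CON0")))
print(sonar_model_from_header(struct.pack("<l4s", 5000, b"XML0")))
```

The returned string could then be passed as sonar_model to open_raw, so a mislabeled file fails early with a clear message.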

emiliom commented 2 years ago

> @emiliom @b-reyes : We should probably implement the mechanism for echopype to distinguish EK60 and EK80 files automatically https://github.com/OSOceanAcoustics/echopype/issues/494 based on the presence of XML0 or CON0 datagrams to make it easier to debug the actual error in file conversion.

I've added #494 to Milestone 0.6.1. It doesn't mean we have to include it in that milestone, but it'll keep it in our near-term attention.

leewujung commented 2 years ago

@emiliom : I'll see if I can get a fix for #494 in within the next few days, on my streak of evening echopype effort. ;)

martin-wegmann commented 2 years ago

@leewujung could you give me a short guide on how to set up the new fix on my machine? I am a bit lost right now...

leewujung commented 2 years ago

@martin-wegmann : yes! You can use the following to set up a conda environment with all the echopype requirements:

conda create -c conda-forge -n echopype-test --yes python=3.9 --file requirements.txt

Activate that environment:

conda activate echopype-test

And then do the following to pip install my branch that contains the changes in #724

pip install git+https://github.com/leewujung/echopype@fix-ek80-fil

You can check all packages installed in this environment by doing

conda list

Let me know if these work!
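As a quick check after the steps above, the following sketch reports which echopype version (if any) the active environment sees. It only uses the standard library (importlib.metadata, available from Python 3.8); the helper name installed_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints None if echopype is not installed in the active environment.
print("echopype:", installed_version("echopype"))
```

Run it with the python from the echopype-test environment; a None result suggests the pip install went into a different environment.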

leewujung commented 2 years ago

Closing this as the fix is merged into dev.

martin-wegmann commented 2 years ago

Hi guys,

sorry for the late reply. I had a tough schedule the last two weeks.

I tried out your fix with one of the original 500MB samples. I think I got good news and bad news.

So with the information of the author I could read in the data and I didn't get an error message! Yeah! I could get the backscattering info out of it. I attached a plot of 3 depths.

But the post processing into Sv and MVBS is not working correctly I think (second plot attached).

With waveform_mode="CW", encode_mode="power", which are the correct settings, my Sv data for one depth show a single value at the first time step and NaN everywhere else.

MVBS then just takes this single value and projects it onto the time binning I selected. So if I select 60 s, I get one value for the first minute and NaN for the rest.
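The symptom described above can be reproduced with a toy time-binning sketch. This is not echopype's compute_MVBS; it is a pure-Python illustration that assumes Sv in dB is averaged in the linear domain within each time bin and that NaN samples are skipped. The function name and the sample values are made up.

```python
import math


def mean_sv_by_time_bin(times_s, sv_db, bin_s):
    """Average Sv (dB) per time bin in the linear domain, skipping NaN."""
    n_bins = int(max(times_s) // bin_s) + 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, sv in zip(times_s, sv_db):
        if math.isnan(sv):
            continue  # NaN samples contribute nothing to their bin
        idx = int(t // bin_s)
        sums[idx] += 10 ** (sv / 10)  # dB -> linear
        counts[idx] += 1
    # Bins without any valid sample stay NaN.
    return [
        10 * math.log10(s / c) if c else math.nan
        for s, c in zip(sums, counts)
    ]


# One valid Sv value at the first ping, NaN afterwards (made-up numbers).
times = [0, 10, 20, 70, 80, 130]
sv = [-60.0] + [math.nan] * 5
print(mean_sv_by_time_bin(times, sv, 60))
```

With 60 s bins, only the first bin gets a value (about -60 dB) and the later bins stay NaN, which matches the behaviour reported here: the binning step is only passing on the NaN values already present in Sv.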

Any clue what could be the issue?

Edit: I added a 2D plot of backscattering in the attachment.

[Attached plots: sv_plot, backscattering, backscatter_2d]

emiliom commented 2 years ago

@leewujung could this issue with MVBS be the same one you've addressed in PR #736?

leewujung commented 2 years ago

Hmm, no, this looks like something else. I didn't look into the actual data content of the file @martin-wegmann provided while working on #724. I'll look into this later today.

leewujung commented 2 years ago

@martin-wegmann : Thanks for reporting this! The source of the error you ran into was a bug introduced in v0.6.0 due to a change of coordinate variable name in the echodata[Environment] group. Stay tuned -- I'll ping you on the PR when it is ready to be tested.

martin-wegmann commented 2 years ago

Thank you so much, I appreciate your help!

leewujung commented 2 years ago

@martin-wegmann : The issue with empty Sv is the one #755 addresses (I just tested with the example file you provided above), so Sv generation should be fine now.

There's some problem with compute_MVBS but if your data is very regular (as in no alternating ping sequence, and all channels are of the same number of samples -- which seems to be the case for your example file), the current version should work (i.e. if you just pull the #755 branch). If not, #753 should address that.

We are working to get v0.6.1 released by the end of this week so that users don't continue to be affected by the bugs introduced in v0.6.0.

martin-wegmann commented 2 years ago

Works like a freaking charm (pulled the #755 branch)!

You guys are amazing! Thank you so much for your hard work! Can I donate a coffee or beer somehow?

leewujung commented 2 years ago

Thanks @martin-wegmann, glad to know that fix worked for you! We have just released v0.6.1, so you can pip install that now, or once the new version gets into conda-forge (typically within a day) you could create a new environment altogether.

Don't worry about donating, but please keep bug reports or brainstorming coming! If we get a chance to meet in person somewhere, it'll be great to meet up and chat over a drink! 😀

Update: it is now on conda-forge!