OSOceanAcoustics / echopype

Enabling interoperability and scalability in ocean sonar data analysis
https://echopype.readthedocs.io/
Apache License 2.0

Cannot open simrad ES60 raw files #1195

Closed: merin-joseph closed this issue 9 months ago

merin-joseph commented 10 months ago

General description of problem

Cannot open simrad ES60 raw files

Computing environment

The following code reproduces the errors I encountered:

from echopype import open_raw

ed = open_raw('L0012-D20190319-T192316-ES60.raw', sonar_model='EK60')
ed

Error message printouts

Below are the error messages I received when running the above code:

---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
Cell In[5], line 1
----> 1 ed = open_raw('L0012-D20190319-T192316-ES60.raw',sonar_model='EK60') # EK80
      2 ed

File /srv/conda/envs/notebook/lib/python3.10/site-packages/echopype/utils/prov.py:237, in add_processing_level.<locals>.wrapper.<locals>.inner(*args, **kwargs)
    235 @functools.wraps(func)
    236 def inner(*args, **kwargs):
--> 237     dataobj = func(*args, **kwargs)
    238     if is_echodata:
    239         ed = dataobj

File /srv/conda/envs/notebook/lib/python3.10/site-packages/echopype/convert/api.py:429, in open_raw(raw_file, sonar_model, xml_path, convert_params, storage_options, use_swap, max_mb)
    424 # Parse raw file and organize data into groups
    425 parser = SONAR_MODELS[sonar_model]["parser"](
    426     file_chk, params=params, storage_options=storage_options, dgram_zarr_vars=dgram_zarr_vars
    427 )
--> 429 parser.parse_raw()
    431 # Direct offload to zarr and rectangularization only available for some sonar models
    432 if sonar_model in ["EK60", "ES70", "EK80", "ES80", "EA640"]:
    433     # Create sonar_model-specific p2z object

File /srv/conda/envs/notebook/lib/python3.10/site-packages/echopype/convert/parse_base.py:154, in ParseEK.parse_raw(self)
    148         self.CON1_datagram = None
    150     # IDs of the channels found in the dataset
    151     # self.ch_ids = list(self.config_datagram['configuration'].keys())
    152 
    153     # Read the rest of datagrams
--> 154     self._read_datagrams(fid)
    156 if "ALL" in self.data_type:
    157     # Convert ping time to 1D numpy array, stored in dict indexed by channel,
    158     #  this will help merge data from all channels into a cube
    159     for ch, val in self.ping_time.items():

File /srv/conda/envs/notebook/lib/python3.10/site-packages/echopype/convert/parse_base.py:240, in ParseEK._read_datagrams(self, fid)
    235 while True:
    236     try:
    237         # TODO: @ngkvain: what I need in the code to not PARSE the raw0/3 datagram
    238         #  when users only want CONFIG or ENV, but the way this is implemented
    239         #  the raw0/3 datagrams are still parsed, you are just not saving them
--> 240         new_datagram = fid.read(1)
    242     except SimradEOF:
    243         break

File /srv/conda/envs/notebook/lib/python3.10/site-packages/echopype/convert/utils/ek_raw_io.py:436, in RawSimradFile.read(self, k)
    434 if k == 1:
    435     try:
--> 436         return self._read_next_dgram()
    437     except Exception:
    438         if self.at_eof():

File /srv/conda/envs/notebook/lib/python3.10/site-packages/echopype/convert/utils/ek_raw_io.py:346, in RawSimradFile._read_next_dgram(self)
    344     return raw_dgram
    345 else:
--> 346     nice_dgram = self._convert_raw_datagram(raw_dgram, bytes_read)
    347     self._current_dgram_offset += 1
    348     return nice_dgram

File /srv/conda/envs/notebook/lib/python3.10/site-packages/echopype/convert/utils/ek_raw_io.py:375, in RawSimradFile._convert_raw_datagram(self, raw_datagram_string, bytes_read)
    369 except KeyError:
    370     # raise KeyError('Unknown datagram type %s,
    371     # valid types: %s' % (str(dgram_type),
    372     # str(self.DGRAM_TYPE_KEY.keys())))
    373     return raw_datagram_string
--> 375 nice_dgram = parser.from_string(raw_datagram_string, bytes_read)
    376 return nice_dgram

File /srv/conda/envs/notebook/lib/python3.10/site-packages/echopype/convert/utils/ek_raw_parsers.py:80, in _SimradDatagramParser.from_string(self, raw_string, bytes_read)
     78     header = header.decode()
     79 id_, version = self.validate_data_header(header)
---> 80 return self._unpack_contents(raw_string, bytes_read, version=version)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/echopype/convert/utils/ek_raw_parsers.py:1559, in SimradRawParser._unpack_contents(self, raw_string, bytes_read, version)
   1557     data[field] = header_values[indx]
   1558     if isinstance(data[field], bytes):
-> 1559         data[field] = data[field].decode()
   1561 data["timestamp"] = nt_to_unix((data["low_date"], data["high_date"]))
   1562 data["bytes_read"] = bytes_read

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x90 in position 4: invalid start byte

leewujung commented 10 months ago

@merin-joseph : Thanks for reporting this! Could you point us to an example file so that we can look into this more closely?

leewujung commented 10 months ago

@merin-joseph : Following up to see if you can provide an example file?

merin-joseph commented 10 months ago

Thank you for following up. Here is the download link for the example file

Download link

leewujung commented 9 months ago

@praneethratna : Could you please take a look and see what might be the cause? Thank you!

praneethratna commented 9 months ago

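@leewujung This looks like a text-encoding issue in Python: the parser decodes a header field as UTF-8, but the field contains a byte (0x90) that is not valid UTF-8. I have opened a PR with a fix for this issue, please take a look!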
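
For readers following along, here is a minimal sketch of this kind of failure and one possible workaround. It is not necessarily what the PR does: the sample bytes and the `decode_field` helper are hypothetical, and falling back to Latin-1 is an assumption chosen only because it maps every byte to a character and so never raises.

```python
# Hypothetical header bytes with 0x90 at position 4, mimicking the traceback.
# 0x90 is a UTF-8 continuation byte, so it cannot start a character.
SAMPLE_FIELD = b"ES60\x90"


def decode_field(raw_bytes: bytes) -> str:
    """Decode a header field, falling back to Latin-1 if it is not valid UTF-8.

    Hypothetical helper for illustration; not part of echopype.
    """
    try:
        return raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 assigns a code point to every byte value, so this cannot fail.
        return raw_bytes.decode("latin-1")


if __name__ == "__main__":
    try:
        SAMPLE_FIELD.decode()  # the default codec is UTF-8, as in the traceback
    except UnicodeDecodeError as err:
        print(f"plain decode failed: {err}")
    print(repr(decode_field(SAMPLE_FIELD)))
```

The trade-off with a fallback like this is that the non-UTF-8 bytes come through as arbitrary Latin-1 characters, which keeps parsing going but may leave odd characters in the decoded field.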

leewujung commented 9 months ago

Hey @merin-joseph : @praneethratna's #1215 should fix this issue, and it will be in the upcoming release (targeting this weekend to early next week). We are trying to decide whether to include such a file in our test data, and would like to ask:

Thanks!

merin-joseph commented 9 months ago

@leewujung Thanks for the update.

leewujung commented 9 months ago

Thanks @merin-joseph ! We'll get the fix in #1215 merged and updated in the upcoming release very soon!