uwmadison-chm / bioread

Utilities to work with files from BIOPAC's AcqKnowledge software
MIT License
Python int too large to convert to C int #37

Closed by Jean-Diddy 2 years ago

Jean-Diddy commented 2 years ago

Hello. When I run this code on one of the data files from my new study:

import bioread
signals = bioread.read_file("filename.acq")

I get this issue:

OverflowError                             Traceback (most recent call last)
<ipython-input-25-4599cf1bc1b9> in <module>
----> 2 signals = bioread.read_file("filename.acq")

~\anaconda3\envs\stage_3A\lib\site-packages\bioread\__init__.py in read(filelike, channel_indexes)
     24     target_chunk_size:  A guide for the number of bytes to read at a time.
     25     """
---> 26     return reader.Reader.read(filelike, channel_indexes).datafile
     27 
     28 

~\anaconda3\envs\stage_3A\lib\site-packages\bioread\reader.py in read(cls, fo, channel_indexes, target_chunk_size)
     83         with open_or_yield(fo, 'rb') as io:
     84             reader = cls(io)
---> 85             reader._read_headers()
     86             reader._read_data(channel_indexes, target_chunk_size)
     87         return reader

~\anaconda3\envs\stage_3A\lib\site-packages\bioread\reader.py in _read_headers(self)
    173         # data_length is 0 for compressed files.
    174         self.marker_start_offset = (self.data_start_offset + self.data_length)
--> 175         self._read_markers()
    176         try:
    177             self._read_journal()

~\anaconda3\envs\stage_3A\lib\site-packages\bioread\reader.py in _read_markers(self)
    302             self.marker_start_offset, mh_class)
    303         self.datafile.marker_header = self.marker_header
--> 304         self.__read_marker_items(mih_class)
    305 
    306     def __read_marker_items(self, marker_item_header_class):

~\anaconda3\envs\stage_3A\lib\site-packages\bioread\reader.py in __read_marker_items(self, marker_item_header_class)
    325                 channel=marker_channel,
    326                 date_created_ms=mih.date_created_ms,
--> 327                 type_code=mih.type_code))
    328         self.marker_item_headers = marker_item_headers
    329         self.datafile.marker_item_headers = marker_item_headers

~\anaconda3\envs\stage_3A\lib\site-packages\bioread\biopac.py in __init__(self, sample_index, text, channel_number, channel, date_created_ms, type_code)
    278         if date_created_ms is not None:
    279             self.date_created_utc = (
--> 280                 REF_DATE + timedelta(milliseconds=date_created_ms)
    281             )
    282         self.type_code = type_code

OverflowError: Python int too large to convert to C int

Versions: bioread 2.1.3, Jupyter Notebook 6.2.0, Python 3.6.12, Windows 10

What should I do to read this file correctly? This is the first time this has happened; I have read other .acq files without problems before. Thank you.
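
The last frame of the traceback points at REF_DATE + timedelta(milliseconds=date_created_ms), so the marker's creation timestamp is apparently too large for timedelta to represent. Below is a minimal sketch of that failure and one way to guard against it. It is not bioread's actual code: the REF_DATE value (the Unix epoch here) and the safe_marker_date helper are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a stand-in reference date; bioread's real REF_DATE may differ.
REF_DATE = datetime(1970, 1, 1, tzinfo=timezone.utc)


def safe_marker_date(date_created_ms):
    """Return REF_DATE + date_created_ms, or None if it cannot be represented."""
    if date_created_ms is None:
        return None
    try:
        return REF_DATE + timedelta(milliseconds=date_created_ms)
    except OverflowError:
        # Raised when the offset (or the resulting date) is out of range,
        # e.g. when the field was decoded from bytes that are not a timestamp.
        return None


print(safe_marker_date(1_500_000_000_000))  # a plausible millisecond timestamp
print(safe_marker_date(2**63))              # garbage-sized value: prints None
```

A guard like this only hides the symptom. An implausibly large date_created_ms most likely means the marker header was decoded with the wrong layout for this file version.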

njvack commented 2 years ago

I'm gonna guess there's something about that file that bioread doesn't understand. Can you share the file?

Jean-Diddy commented 2 years ago

@njvack thank you very much for your reply. I cannot share the files for confidentiality reasons. However, I can tell you that the only difference between the recordings I can open and the one I cannot is the version of AcqKnowledge used for the recording: 4.0.0 for the file that bioread fails to open, and 4.1.0 for the files that it opens. Is that helpful?

njvack commented 2 years ago

Not really. There are files that come along and have some surprise or other in them. Unless someone can sit down for a few hours with the source code and the data file and a hex editor, it'll be hard to figure out what's wrong.

One thing you can do if you have a copy of AcqKnowledge handy is to open the file and save it in 3.8 format.

njvack commented 2 years ago

Huh, though, it's choking on the markers... We should be able to read the file without reading the markers. That seems like a reasonable thing to do, doesn't it?

Try this, maybe:

from bioread import reader
signals = reader.Reader("filename.acq")
try:
    signals.read_headers()  # This will throw an overflowerror
except OverflowError:
    print("there was an error but maybe we can go on")
signals._read_data(None)

At this point, signals might have what you're looking for.

It's super gross, but it might work: the markers come after the data, so if it's just messing up on the stuff that comes after the data, it might be okay.

I should really add a skip_extras option or something that will stop reading headers as soon as it has enough information to find and decode the channel data...

Jean-Diddy commented 2 years ago

Thank you for your help even though I can't share the file. Yes, that seems reasonable, so I tried it:

from bioread import reader
signals = reader.Reader("D:/Documents/Travail Scolaire/THESE/Data/TEST/avadis_01.acq")
try:
    signals.read_headers("D:/Documents/Travail Scolaire/THESE/Data/TEST/avadis_01.acq")  # This will throw an overflowerror
except OverflowError:
    print("there was an error but maybe we can go on")
signals._read_data(None)

But I got:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-20-2f7c6c7b8e8b> in <module>
      8 except OverflowError:
      9     print("there was an error but maybe we can go on")
---> 10 signals._read_data(None)

~\anaconda3\envs\stage_3A\lib\site-packages\bioread\reader.py in _read_data(self, channel_indexes, target_chunk_size)
    284 
    285     def _read_data(self, channel_indexes, target_chunk_size=CHUNK_SIZE):
--> 286         if self.is_compressed:
    287             self.__read_data_compressed(channel_indexes)
    288         else:

~\anaconda3\envs\stage_3A\lib\site-packages\bioread\reader.py in is_compressed(self)
    112     @property
    113     def is_compressed(self):
--> 114         return self.graph_header.compressed
    115 
    116     def _read_headers(self):

AttributeError: 'NoneType' object has no attribute 'compressed'

Could this be caused by how the .acq file was saved, for example whether it is in Graph format or something else? Thanks a lot.
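
The AttributeError is consistent with the tracebacks above. Reader.read builds the reader from an open file object (cls(io)) and then calls the instance method _read_headers(). The snippet instead passes a path to the constructor and calls read_headers, which appears to create a separate reader, so signals.graph_header is still None when _read_data runs. A hedged sketch of the intended workaround follows. The helper name read_ignoring_marker_errors and its reader_cls parameter are hypothetical, the code relies on private methods that may change between bioread versions, and it has not been run against the problem file.

```python
def read_ignoring_marker_errors(path, reader_cls):
    """Read channel data even if the marker section fails to parse.

    reader_cls is expected to behave like bioread.reader.Reader: it is
    constructed from an open binary file and exposes _read_headers(),
    _read_data(channel_indexes) and a datafile attribute.
    """
    with open(path, "rb") as acq_file:
        rdr = reader_cls(acq_file)
        try:
            rdr._read_headers()
        except OverflowError:
            # Per the traceback, markers are parsed after the graph and
            # channel headers, so those may already be populated here.
            print("marker parsing failed; trying to read data anyway")
        rdr._read_data(None)  # None, as in the snippet posted in the thread
    return rdr.datafile
```

With bioread installed, this would be called as read_ignoring_marker_errors("filename.acq", bioread.reader.Reader). Compressed files may need header information that is read after the markers, so this is more likely to help with uncompressed recordings.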

njvack commented 2 years ago

Maybe? But... I'm probably at the end of the debugging I can do without seeing the file. And even then, I don't have much time to dedicate to support... 🙃

You can get more logging, I think, if you were to:

import logging
from bioread import reader
reader.logger.setLevel(logging.DEBUG)

# Now try to read

but it's not going to help unless you really want to delve into the file reading code.

Your best bet, by far, is going to be feeding the file into a copy of AcqKnowledge and resaving it in an earlier format.

Jean-Diddy commented 2 years ago

Thank you very much for your help. The issue was indeed the file format. I opened the file in another version of the BIOPAC software and saved it again, and bioread can read the re-saved file. Thanks @njvack for your kind help!

njvack commented 2 years ago

Glad to help, and sorry we couldn't read the original!