open-ephys / analysis-tools

Archived code for reading data saved by the Open Ephys GUI

load_open_ephys_data_faster.m memory error #88

Open nmtimme opened 3 years ago

nmtimme commented 3 years ago

Hello

I've noticed that load_open_ephys_data_faster.m requires a lot more memory to load .continuous files in Matlab 2020a. Specifically, I'm getting a memory error on line 84:

hdr = fread(fid, NUM_HEADER_BYTES, 'char*1');

when I have about 4 GB of memory left for a .continuous file that takes up only about 1 GB of space when it is loaded (and obviously the header is much smaller than the full 1 GB for all the data). In Matlab 2016, this was not a problem.

I found this post on the mathworks site that indicates something changed with fread in Matlab2020a that might explain this behavior:

https://www.mathworks.com/matlabcentral/answers/84385-out-of-memory-error-while-reading-binary-file

I'd appreciate it if any necessary updates could be made to load_open_ephys_data_faster.m so it will work properly with Matlab2020a, or if someone could provide suggestions on how to modify the code. I don't know much about the structure of the .continuous files, so I don't feel confident making this change myself. Thank you for all your help!

~Nick

saman-abbaspoor commented 3 years ago

I have the same problem with load_open_ephys_binary. We had a recording of 128 electrodes for less than 30 minutes today and I am trying to load it into Matlab 2018b but was not able to due to memory issues. I also checked it with Matlab 2017b and got the same error.

Error using fread
Out of memory. Type HELP MEMORY for your options.

Error in load_open_ephys_binary (line 96)
            D.Data=fread(file,[header.num_channels Inf],'int16');

I have no issues viewing the data offline in the Open Ephys GUI, and I could easily open a shorter chunk of it in Matlab. But Matlab cannot open the whole file, which is 7 GB. Is there any solution to this?

aacuevas commented 3 years ago

Hi,

@nmtimme which specific error are you getting? It is odd that the error happens while reading the short header. However, by default the script scales the data by bitVolts to return it in uV instead of raw values. This makes Matlab convert it to double (its native format), which takes four times as much memory as int16. You can avoid this by passing 'unscaledInt16' as the last argument to the function (run help load_open_ephys_data_faster for more information).

@saman-abbaspoor Although the binary reader does not perform any scaling and returns raw values, it does convert them to double, so your 7 GB file would become 28 GB in memory. We might fix this, after ensuring that it would not break any existing scripts. In the meantime, you could apply this fix yourself by changing 'int16' to 'int16=>int16' in load_open_ephys_binary.m, line 96.
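To illustrate the difference between the two precision strings, here is a minimal, self-contained sketch. It does not use load_open_ephys_binary at all: the file name, channel count, and sample count are made up for the example, and the only call being demonstrated is fread with 'int16' versus 'int16=>int16'.

```matlab
% Write a small int16 test file (hypothetical stand-in for a recording).
numChannels = 4;
numSamples  = 1000;
raw = int16(randi([-1000 1000], numChannels, numSamples));
fname = [tempname '.dat'];
fid = fopen(fname, 'w');
fwrite(fid, raw, 'int16');
fclose(fid);

% Plain 'int16': values are read as int16 but returned as double.
fid = fopen(fname, 'r');
asDouble = fread(fid, [numChannels Inf], 'int16');
fclose(fid);

% 'int16=>int16': values are read as int16 and stay int16 in memory.
fid = fopen(fname, 'r');
asInt16 = fread(fid, [numChannels Inf], 'int16=>int16');
fclose(fid);
delete(fname);

% Compare the class and memory footprint of the two results.
infoDouble = whos('asDouble');
infoInt16  = whos('asInt16');
fprintf('%s: %d bytes\n', class(asDouble), infoDouble.bytes);
fprintf('%s: %d bytes\n', class(asInt16),  infoInt16.bytes);
```

The double array takes 8 bytes per sample against 2 bytes for int16, which is where the factor of four comes from.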

However, the binary format offers a better solution for large data sets, which is highly recommended. You can add an optional 'mmap' argument to the call to load the file in memory-mapped mode. In this mode, the file is not loaded whole into MATLAB but can be accessed in chunks, with MATLAB loading only the parts it needs at each step. Please consult the documentation with help load_open_ephys_binary for more information.
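As a rough sketch of what memory-mapped access looks like, the example below uses MATLAB's built-in memmapfile directly on a small made-up file. It is not the loader's own code: the file, its dimensions, and the field name 'mapped' are assumptions chosen for illustration.

```matlab
% Hypothetical stand-in for a flat binary recording: int16, channels x samples.
numChannels = 4;
numSamples  = 5000;
raw = int16(randi([-1000 1000], numChannels, numSamples));
fname = [tempname '.dat'];
fid = fopen(fname, 'w');
fwrite(fid, raw, 'int16');
fclose(fid);

% Map the file instead of reading it; data is fetched only when indexed.
m = memmapfile(fname, 'Format', {'int16', [numChannels numSamples], 'mapped'});

% Only the requested slice is copied into a MATLAB array.
slice = m.Data.mapped(1, 1:1024);
disp(class(slice));
disp(size(slice));

clear m;        % release the mapping before deleting the file
delete(fname);
```

The slice comes back as int16, so any floating-point work on it needs an explicit conversion of that slice.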

Aarón.

saman-abbaspoor commented 3 years ago

Thank you for your response Aaron,

When I load data in memory-mapped mode, the data structure is int16 instead of double, which is what load_open_ephys_binary normally returns. Should I convert the data to double? And if so, can I simply do x = double(x)?

SAM

aacuevas commented 3 years ago

Yes, you should convert the data, to ensure that any floating-point operations you do are performed correctly. Keep in mind that you cannot overwrite the data structure, as it is an object that dynamically accesses the file. Instead, you should do y=double(x), where x is the slice of the data you need. This makes a copy in memory, so it is best to convert only the slice you are operating on and store the already transformed result, like y=double(D.Data.Data.mapped(1,1:1024))*bitVolts; (multiplying by bitVolts gives values in uV).
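A minimal sketch of that chunk-by-chunk pattern, assuming a memory-mapped int16 file like the one the 'mmap' mode exposes. The file, the chunk size, and the bitVolts value of 0.195 are hypothetical; in practice the scale factor comes from the recording's metadata. Scaling to uV here multiplies by bitVolts, the same direction as the `.* info.header.bitVolts` step in load_open_ephys_data_faster.

```matlab
% Hypothetical stand-in for a flat binary recording: int16, channels x samples.
numChannels = 2;
numSamples  = 10000;
raw = int16(randi([-1000 1000], numChannels, numSamples));
fname = [tempname '.dat'];
fid = fopen(fname, 'w');
fwrite(fid, raw, 'int16');
fclose(fid);

m = memmapfile(fname, 'Format', {'int16', [numChannels numSamples], 'mapped'});

bitVolts  = 0.195;   % hypothetical scale factor (uV per count)
chunkSize = 1024;    % samples converted to double at a time
channel   = 1;

% Accumulate the sum of squares so only one chunk is held as double at once.
sumSq = 0;
for startIdx = 1:chunkSize:numSamples
    stopIdx = min(startIdx + chunkSize - 1, numSamples);
    y = double(m.Data.mapped(channel, startIdx:stopIdx)) * bitVolts;
    sumSq = sumSq + sum(y .^ 2);
end
rmsUv = sqrt(sumSq / numSamples);
fprintf('Channel %d RMS: %.2f uV\n', channel, rmsUv);

clear m;        % release the mapping before deleting the file
delete(fname);
```

Only the small y array is ever stored as double, so peak memory stays near chunkSize * 8 bytes regardless of the file size.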

Best, Aarón

nmtimme commented 3 years ago

@aacuevas Sorry for the slow response! I was getting a standard "Out of memory." error at that line. I tried to reproduce it today, but now I'm getting the error at a different line:

Out of memory.

Error in load_open_ephys_data_faster/segRead (line 182)
        seg = fread(fid, numIdx*dblock(segNum).Repeat, sprintf('%d*%s', ...

Error in load_open_ephys_data_faster (line 141)
        data = segRead('data', 'b') .* info.header.bitVolts;

So, perhaps it isn't an issue with the specific line I referenced in my original question.

I've looked at my total memory usage and it looks like I should have about 4 GB of free memory when I call load_open_ephys_data_faster. I'm trying to load a single channel that was recorded at 30 kHz for a little over an hour (115678208 samples to be precise). That whole channel should only take up about 1 GB when stored even as a double, so I should have enough space to load it as a double. So, it seems like load_open_ephys_data_faster is requesting a lot more space than it needs, for some reason. Perhaps it is related to the issue discussed in the link I had in the original post?
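For reference, the arithmetic behind that estimate, as a quick sketch. It counts only the samples themselves and ignores timestamps, record headers, and any temporary copies made during loading.

```matlab
numSamples = 115678208;   % sample count quoted above
sampleRate = 30000;       % Hz

bytesInt16  = numSamples * 2;   % 2 bytes per int16 sample
bytesDouble = numSamples * 8;   % 8 bytes per double sample

fprintf('Duration:  %.1f min\n', numSamples / sampleRate / 60);
fprintf('As int16:  %.2f GB\n', bytesInt16  / 1e9);
fprintf('As double: %.2f GB\n', bytesDouble / 1e9);
```

That works out to about 64.3 minutes of data, 0.23 GB as int16 and 0.93 GB as double.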

Thanks for all your help!

~Nick