Open mousphere opened 3 months ago
@mousphere,
What's the dtype of the binary file? Could you provide a bit more info about what the .dat
file is? Are you sure it's headerless (i.e., despite the huge memory spike, does the reader seem to work)?
One thing you could try would be to do the same at the rawio level. Have you used that before? I'm wondering whether the reshape is causing the huge memory spike.
If you test at the rawio level and it still has the memory spike, then I think I know how to fix it: we would have to slow the RawIO level down to protect memory.
@zm711 The dtype is int. Even though memory usage increased, the reader was working. When I set lazy=True in read_block, the rate of increase in memory usage slowed, but it still used about 16 GB (I stopped the process midway because the plot had not finished rendering after more than an hour).
Additionally, plotting in stages using chunks, or using plt.subplots() instead of plt.plot(), did not solve the issue. It may be that matplotlib itself is using a lot of memory during rendering.
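One way to keep matplotlib's memory down, independent of how the file is read, is to decimate the signal to roughly screen resolution before plotting: a per-bin min/max envelope preserves the visual shape of the trace with far fewer points. A generic numpy sketch (not from this thread; the function name is made up):

```python
import numpy as np

def minmax_decimate(x, n_bins):
    """Reduce a 1-D signal to interleaved per-bin (min, max) pairs.

    Returns 2 * n_bins points that draw the same envelope as the full
    trace, so matplotlib only has to build a tiny path object.
    """
    n = (len(x) // n_bins) * n_bins          # drop the ragged tail
    bins = x[:n].reshape(n_bins, -1)
    out = np.empty(2 * n_bins, dtype=x.dtype)
    out[0::2] = bins.min(axis=1)
    out[1::2] = bins.max(axis=1)
    return out

# Example: 3 million samples -> 4000 plotted points.
signal = np.sin(np.linspace(0, 100 * np.pi, 3_000_000))
envelope = minmax_decimate(signal, 2000)
# plt.plot(envelope) is now cheap; the drawn y-range is unchanged.
```

The same idea applies per chunk, so decimation composes with chunked reading.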
How do you know that it is using that much memory? How are you measuring RSS?
@h-mayorquin I checked it in macOS Activity Monitor.
My guess is that matplotlib is consuming a lot of memory.
What is the memory consumption when doing only this?
```python
import neo
import numpy as np

data_file = 'example.dat'
nb_channel = 40
analog_digital_input_channels = 8
sampling_rate = 30000

reader = neo.io.RawBinarySignalIO(filename=data_file, nb_channel=nb_channel, sampling_rate=sampling_rate)
block = reader.read_block()
analog_signals = block.segments[0].analogsignals[0]
numpy_signal = analog_signals.magnitude[:, :nb_channel - analog_digital_input_channels]
```
I want to read data from a .dat file using RawBinarySignalIO and plot it. When I ran the program above, it consumed more than 32 GB of memory for a 1 GB .dat file. I want to run this process on AWS Lambda, so it needs to execute within 10 GB of memory. Is there a way to achieve this?
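Since the .dat file is a flat headerless array, `numpy.memmap` is a neo-free way to get the low-memory access pattern directly: nothing is read from disk until a slice is actually touched, so RSS stays near the size of the window you materialize. A sketch using this thread's example parameters, with `dtype='int16'` as an assumption and a small synthetic file standing in for the real one:

```python
import numpy as np

nb_channel = 40
analog_digital_input_channels = 8

# Stand-in for the real file: write a small headerless int16 array
# ('example.dat' would already exist in the real case).
fake = (np.arange(nb_channel * 30000) % 32000).astype('int16')
fake.reshape(-1, nb_channel).tofile('example.dat')

# Memory-map instead of loading: the OS pages data in on demand,
# and the reshape is a zero-copy view of the mapped file.
data = np.memmap('example.dat', dtype='int16', mode='r').reshape(-1, nb_channel)

# Drop the digital-input columns and pull out one second of signal;
# only this 30000 x 32 window is materialized in RAM.
window = np.asarray(data[:30000, :nb_channel - analog_digital_input_channels])
```

Processing the mapped array window by window (and decimating before plotting) should keep peak memory well under the Lambda limit, since no step ever holds the whole file.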