kylemath / mne-python

MNE : Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
http://mne.tools
BSD 3-Clause "New" or "Revised" License

Read data numpy #2

Closed kylemath closed 4 years ago

kylemath commented 4 years ago

Remove pandas from jon

kylemath commented 4 years ago

OK, I don't think this solution is correct. You shouldn't have to pass the data into the object through raw_extras, since the other function should open the file and load the data (that is the point of the other function).

So the main init function should open the file once and get all the info: the detector names, the number of points, which lines to start and end on, etc. Then it should put whatever info it needs into raw_extras.

Then _read_segment_file should open the same file again, and this time ignore all that other stuff and just parse the data.
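A minimal sketch of that two-pass split, not the actual nirx.py code. The names `scan_file` and `read_segment` and the keys of the returned dict are hypothetical, and the file is assumed to be plain space-delimited numbers with one sample per line:

```python
def scan_file(fname):
    """First pass: collect metadata only; no samples are kept."""
    n_samples = 0
    n_cols = 0
    with open(fname, "r") as fid:
        for line in fid:
            if n_samples == 0:
                n_cols = len(line.split())
            n_samples += 1
    # This dict plays the role of raw_extras: it records where the data
    # lives and how it is laid out, but holds none of the data itself.
    return {"fname": fname, "n_cols": n_cols, "n_samples": n_samples}


def read_segment(extras, start, stop):
    """Second pass: reopen the file and parse only rows start..stop-1."""
    rows = []
    with open(extras["fname"], "r") as fid:
        for li, line in enumerate(fid):
            if li >= stop:
                break
            if li >= start:
                rows.append([float(v) for v in line.split()])
    return rows
```

The object only has to carry the small dict from the first pass, and the samples are parsed on demand from the file.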

Note below that _read_segment_file in the nirx.py code gets called internally with a start and stop. It gets these from when the object is created, based on which lines it should start and stop on:

    super(RawNIRX, self).__init__(
        info, preload, filenames=[fname], last_samps=[last_sample],
        raw_extras=[raw_extras], verbose=verbose)

This code only gets run when the data is loaded. It takes the raw object and finds the (2) file names in the folder it should load, then loads both of them into wls. All it gets is the data, not the markers or channel info, and it stores that in data. It doesn't get this data from a variable in the Raw object, since the data shouldn't be loaded yet when the raw object is created:

    def _read_segment_file(self, data, idx, fi, start, stop, cals, mult):
        """Read a segment of data from a file.

        The NIRX machine records raw data as two different wavelengths.
        The returned data interleaves the wavelengths.
        """
        sdindex = self._raw_extras[fi]['sd_index']
        nchan = self._raw_extras[fi]['orig_nchan']

        wls = [
            _read_csv_rows_cols(
                self._raw_extras[fi]['files'][key],
                start, stop, sdindex, nchan // 2).T
            for key in ('wl1', 'wl2')
        ]

        # TODO: Make this more efficient by only indexing above what we need.
        # For now let's just construct the full data matrix and index.
        # Interleave wavelength 1 and 2 to match channel names:
        this_data = np.zeros((len(wls[0]) * 2, stop - start))
        this_data[0::2, :] = wls[0]
        this_data[1::2, :] = wls[1]
        data[:] = this_data[idx]

        return data

def _read_csv_rows_cols(fname, start, stop, cols, n_cols):
    # The following is equivalent to:
    # x = pandas.read_csv(fname, header=None, usecols=cols, skiprows=start,
    #                     nrows=stop - start, delimiter=' ')
    # But does not require Pandas, and is hopefully fast enough, as the
    # reading should be done in C (CPython), as should the conversion to float
    # (NumPy).
    x = np.zeros((stop - start, n_cols))
    with _open(fname) as fid:
        for li, line in enumerate(fid):
            if li >= start:
                if li >= stop:
                    break
                x[li - start] = np.array(line.split(), float)[cols]
    return x
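To sanity-check the pandas-free reader in isolation, here is a self-contained sketch. It restates the reader as `read_rows_cols`, using the builtin `open` in place of `_open` (an assumption), writes made-up values to two temporary files, and interleaves the wavelengths the same way `_read_segment_file` does. The helper `write_rows`, the column selection, and the data are all hypothetical.

```python
import os
import tempfile

import numpy as np


def read_rows_cols(fname, start, stop, cols, n_cols):
    # Stand-in for _read_csv_rows_cols, using the builtin open()
    x = np.zeros((stop - start, n_cols))
    with open(fname, "r") as fid:
        for li, line in enumerate(fid):
            if li >= stop:
                break
            if li >= start:
                x[li - start] = np.array(line.split(), float)[cols]
    return x


def write_rows(rows):
    # Write space-delimited rows to a temporary file and return its path
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as fid:
        for row in rows:
            fid.write(" ".join(str(v) for v in row) + "\n")
    return path


# Made-up data: 4 samples x 3 columns per wavelength file
wl1_path = write_rows([[10 * r + c for c in range(3)] for r in range(4)])
wl2_path = write_rows([[100 + 10 * r + c for c in range(3)] for r in range(4)])

cols = [0, 2]        # hypothetical stand-in for sd_index
start, stop = 1, 3   # read samples 1 and 2 only

wls = [read_rows_cols(path, start, stop, cols, len(cols)).T
       for path in (wl1_path, wl2_path)]

# Interleave: even rows are wavelength 1, odd rows are wavelength 2
interleaved = np.zeros((len(wls[0]) * 2, stop - start))
interleaved[0::2, :] = wls[0]
interleaved[1::2, :] = wls[1]

os.remove(wl1_path)
os.remove(wl2_path)
```

Only the rows between start and stop are converted to floats, so a short segment never requires parsing the whole file, although lines before start are still iterated over.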