wfield preprocess gives different results after lossless compression

mayofaulkner commented 2 years ago

Hi Joao,

To save space in the IBL pipeline, we compress the widefield .dat file to a .mov file using the labcams.io.stack_to_mj2_lossless function. I have found that running the wfield preprocess pipeline before and after compression yields different results.

I think I have found the source of the problem: it is the way the video stack is read when multiple interleaved channels are used (nchannels=2). The data for the first and second channels come back identical when they shouldn't be.

Below is an example snippet. Note that both channels of dat_mov[0] are identical, while the two channels of dat_dat[0] differ:

In [1]: from pathlib import Path

In [2]: from wfield import load_stack

In [3]: session_path = Path('/mnt/s0/Data/Subjects/FD_01/2022-08-04/001/raw_widefield_data')

In [4]: dat_mov = load_stack(str(session_path), nchannels=2)

In [5]: type(dat_mov)
Out[5]: wfield.io.VideoStack

In [6]: dat_dat = load_stack(str(session_path.joinpath('pco.edge_run000_00000000_2_540_640_uint16.dat')), nchannels=2)

In [7]: type(dat_dat)
Out[7]: numpy.memmap

In [8]: dat_mov[0]
Out[8]:
array([[[[   0,    0,    0, ..., 1490, 1476, 1457],
         [1402, 1370, 1354, ..., 1387, 1390, 1357],
         [1406, 1409, 1360, ..., 1397, 1401, 1409],
         ...,
         [1433, 1426, 1513, ..., 1489, 1510, 1484],
         [1443, 1493, 1449, ..., 1479, 1484, 1492],
         [1615, 1513, 1558, ..., 1651, 1604, 1578]],

        [[   0,    0,    0, ..., 1490, 1476, 1457],
         [1402, 1370, 1354, ..., 1387, 1390, 1357],
         [1406, 1409, 1360, ..., 1397, 1401, 1409],
         ...,
         [1433, 1426, 1513, ..., 1489, 1510, 1484],
         [1443, 1493, 1449, ..., 1479, 1484, 1492],
         [1615, 1513, 1558, ..., 1651, 1604, 1578]]]], dtype=uint16)

In [9]: dat_dat[0]
Out[9]:
memmap([[[   0,    0,    0, ..., 1490, 1476, 1457],
         [1402, 1370, 1354, ..., 1387, 1390, 1357],
         [1406, 1409, 1360, ..., 1397, 1401, 1409],
         ...,
         [1433, 1426, 1513, ..., 1489, 1510, 1484],
         [1443, 1493, 1449, ..., 1479, 1484, 1492],
         [1615, 1513, 1558, ..., 1651, 1604, 1578]],

        [[   0,    0,    0, ..., 1530, 1514, 1473],
         [1410, 1366, 1374, ..., 1504, 1457, 1409],
         [1362, 1401, 1404, ..., 1466, 1444, 1464],
         ...,
         [1715, 1668, 1722, ..., 1774, 1713, 1776],
         [1599, 1726, 1749, ..., 1683, 1734, 1692],
         [1903, 1749, 1766, ..., 1865, 1779, 1766]]], dtype=uint16)

Would this be something that could be fixed? As it stands, it isn't possible to replicate the same preprocessing pipeline once the video is compressed and the original data deleted.
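For anyone wanting to check their own data for the same symptom, here is a minimal sketch of the comparison I did by eye above. It uses small synthetic arrays rather than real wfield stacks, and duplicated_channels is a hypothetical helper name, not part of wfield:

```python
import numpy as np

def duplicated_channels(frame):
    """Return True if every channel of a frame equals channel 0.

    Assumes the frame has at least two channels and is shaped
    (nchannels, H, W), optionally with a leading singleton axis
    as in the dat_mov[0] output above.
    """
    frame = np.squeeze(np.asarray(frame))
    return all(np.array_equal(frame[0], ch) for ch in frame[1:])

# Synthetic stand-ins (hypothetical values, not real recordings).
base = np.arange(20, dtype=np.uint16).reshape(4, 5)
good = np.stack([base, base + 1])      # two distinct channels, like dat_dat[0]
bad = np.stack([base, base])[None]     # channel 0 repeated, like dat_mov[0]

print(duplicated_channels(good))
print(duplicated_channels(bad))
```

On real data the same helper could be applied to a handful of frames from the object returned by load_stack for the .mov and for the .dat.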

Many thanks!

jcouto commented 2 years ago

Thanks for this @mayofaulkner!! It is fixed in the dev branch, commit 96e43fc. Let me know if you have issues with it. I'll do some more testing on it and create a pip release.

mayofaulkner commented 2 years ago

Hey, thanks for looking at this so quickly.

I'm now getting this error when indexing the stack:

Input In [5], in <cell line: 1>()
----> 1 dat[0]

File ~/Documents/PYTHON/envs/iblenv/lib/python3.8/site-packages/wfield/io.py:431, in GenericStack.__getitem__(self, squeeze, *args)
    429 img = np.empty((len(idx1),*self.dims),dtype = self.dtype)
    430 for i,ind in enumerate(idx1):
--> 431     img[i] = self._get_frame(ind)
    432 if not idx2 is None:
    433     if squeeze:

File ~/Documents/PYTHON/envs/iblenv/lib/python3.8/site-packages/wfield/io.py:786, in VideoStack._get_frame(self, frame)
    784 fileidx,frameidx = self._get_frame_index(frame)
    785 if not fileidx == self.current_fileidx:
--> 786     self._load_substack(fileidx,frameidx)
    787 elif not frameidx == self.current_frameidx:
    788     self._load_substack(fileidx,frameidx)

File ~/Documents/PYTHON/envs/iblenv/lib/python3.8/site-packages/wfield/io.py:768, in VideoStack._load_substack(self, fileidx, frameidx)
    766     if not self.pix_fmt in ['yuv420p']:
    767         outputdict = inputdict
--> 768 if not -'pix_fmt' in outputdict:
    769     outputdict = {'-pix_fmt':'gray'}
    770         # can't handle 3 channel color right now.

TypeError: bad operand type for unary -: 'str'

Looks like it needs to be '-pix_fmt' rather than -'pix_fmt'.
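A small standalone reproduction, in case it is useful. This is only a sketch with an empty stand-in dictionary, not the actual code in wfield/io.py. With the minus sign outside the quotes, Python applies unary minus to the string before the membership test is ever evaluated:

```python
outputdict = {}  # stand-in for the ffmpeg output options

# Misplaced minus: -'pix_fmt' is evaluated first and raises TypeError.
try:
    if not -'pix_fmt' in outputdict:
        pass
    raised = False
    message = ''
except TypeError as err:
    raised = True
    message = str(err)

print(raised, message)

# Minus inside the quotes: a plain membership test on the key '-pix_fmt'.
if '-pix_fmt' not in outputdict:
    outputdict = {'-pix_fmt': 'gray'}

print(outputdict)
```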

mayofaulkner commented 2 years ago

There is also one more thing that I noticed, if you don't mind fixing it (sorry, I tried to create a branch but I don't have permissions). In the hemocorrect function, on this line https://github.com/jcouto/wfield/blob/master/wfield/cli.py#L546, I think it should be changed to t[np.mod(functional_channel,2)::2] so that the correct timepoints are interpolated when the functional channel is not 0.
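To illustrate what the proposed slice selects, here is a toy example. It assumes t holds one timestamp per acquired frame with the two channels interleaved (0, 1, 0, 1, ...); the values are made up:

```python
import numpy as np

# Hypothetical timestamps for 10 interleaved frames: channel 0 on even
# indices, channel 1 on odd indices.
t = np.arange(10)

def functional_timepoints(t, functional_channel):
    # Start at the functional channel's offset and take every second frame.
    return t[np.mod(functional_channel, 2)::2]

t_func0 = functional_timepoints(t, 0)
t_func1 = functional_timepoints(t, 1)

print(t_func0)
print(t_func1)
```

With functional_channel = 1 the slice starts at index 1, so the timepoints belong to the odd frames rather than the even ones.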

Many thanks!

jcouto commented 2 years ago

Thank you! I just changed those in dev. You should have permissions now though.

mayofaulkner commented 2 years ago

Just one last small fix, which I have made in #10, and now it is all working perfectly. Thank you so much for your help!

jcouto commented 2 years ago

Thank you so much @mayofaulkner!!

jcouto / wfield

wfield preprocess gives different results after lossless compression #9