JinghaoLu / MIN1PIPE

A MINiscope 1-photon-based Calcium Imaging Signal Extraction PIPEline.
GNU General Public License v3.0

Analysing data saved through UCLA Miniscope nuget in Bonsai #22

Closed aledifil closed 5 years ago

aledifil commented 5 years ago

Hi, I'm using the following configuration:

Anyway, when I try to analyse them with MIN1PIPE, I get the following error:

min1pipe
Begin collecting datasets info
Done collecting datasets info, time: 2.9933
Begin data cat
Unable to perform assignment because the left and right sides have a different number of elements.

Error in data_cat (line 174)
stt(1) = stt1;

Error in min1pipe (line 83)
[m, filename_raw, imaxn, imeanf, pixh, pixw, nf] = data_cat(path_name, file_base{i}, file_fmt{i}, Fsi, Fsi_new, spatialr);

I was able to reconstruct that the root of the issue is in these lines of data_cat function:

168 h1 = strfind(headert, 'movi00db');
169 dlen = headert(h1 + 8: h1 + 11);
170 ndframe = double(typecast(dlen, 'uint32'));
171 stt1 = h1 + 11;
172 idt = find(strcmp(dir_uset, dir_use{ib}{i}));
173 stt = zeros(nft(idt), 1);
174 stt(1) = stt1;

Specifically, the h1 variable is empty, because strfind does not find 'movi00db' in the headert vector of my files. As a result stt1 is empty as well, and the assignment in line 174 fails. Can you please explain what line 168 is actually doing? Thanks!
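The failure described above can be reproduced in isolation. The following is a minimal sketch, not MIN1PIPE code: the byte string standing in for headert is made up, and the variable failed is only there to record the outcome. It shows that an empty strfind result propagates through h1 + 11 and makes the scalar assignment fail.

```matlab
% Stand-in for the header bytes of a file that lacks the marker (made-up content).
headert = uint8('RIFF....AVI LIST....hdrl');

h1 = strfind(char(headert), 'movi00db');   % marker absent, so h1 is empty
stt1 = h1 + 11;                            % empty plus a scalar is still empty
stt = zeros(3, 1);

try
    stt(1) = stt1;                         % one element on the left, zero on the right
    failed = false;
catch err
    failed = true;
    disp(err.message);
end
```

A guard such as `if isempty(h1), error('movi00db not found in header'); end` placed right after the strfind call would turn the cryptic assignment error into a readable one; that guard is a suggestion here, not something the package is known to contain.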

JinghaoLu commented 5 years ago

Hi, this is deeply related to the data structure of the avi file: the string is the header of the video frame LIST stored in the avi. There are just too many different ways of composing an avi file, so the headers, structures and everything else can change from one file to another. MIN1PIPE just takes the default format used by some main platforms (inscopix, ucla and .mat files), so it is not surprising that MIN1PIPE does not function robustly on "nonstandard" file structures. You are welcome to either switch the recording paradigm or write your own data-reading interface.

aledifil commented 5 years ago

Hi, thanks for your answer! Switching the recording paradigm could be a problem at this point. Writing our own interface is feasible but takes time. Do you have any specific info about what type of avi file the UCLA program saves, so that I can make mine as similar as possible? I'm asking because I'm already converting my files to be the same as the UCLA ones when viewed in Matlab (in terms of cdata and colormap). Moreover, you are saying it is possible to run your pipeline directly on .mat files instead, am I correct? That may be another solution. Again, thanks!

JinghaoLu commented 5 years ago

I am not exactly sure about all the details regarding the format. I can only say that the format is some kind of vanilla uncompressed avi. One possibility is that the header chunk read in is not long enough to include the marker "movi00db", or that the file is not arranged in the same structure. You can try increasing hstep2 in line 166 of data_cat.m to see whether "movi00db" shows up. The data saved in the avi file are the same as long as they are retrieved correctly, and the issue here is to retrieve them correctly.
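To check how far into a file the marker sits before editing data_cat.m, something like the sketch below may help. It is not taken from MIN1PIPE: the file is a synthetic stand-in written to tempdir (5000 padding bytes, the marker, then a 4-byte length), and the names avi_path, limits and h1_all are made up for illustration. To inspect a real recording, point avi_path at it and skip the file-writing step.

```matlab
% Build a synthetic stand-in file: padding, the marker, then a 4-byte frame length.
avi_path = fullfile(tempdir, 'marker_demo.avi');
marker = 'movi00db';
bytes = [zeros(1, 5000, 'uint8'), uint8(marker), typecast(uint32(480 * 752), 'uint8')];
fid = fopen(avi_path, 'w');
fwrite(fid, bytes, 'uint8');
fclose(fid);

% Read the first N bytes for two candidate values of N and look for the marker.
limits = [1000, 100000];
h1_all = cell(size(limits));
for k = 1:numel(limits)
    fid = fopen(avi_path, 'r');
    head = fread(fid, limits(k), '*uint8')';   % row vector, at most limits(k) bytes
    fclose(fid);
    h1_all{k} = strfind(char(head), marker);   % empty if the marker lies beyond the read
    if isempty(h1_all{k})
        fprintf('first %d bytes: marker not found\n', limits(k));
    else
        fprintf('first %d bytes: marker at byte %d\n', limits(k), h1_all{k}(1));
    end
end

% Same decoding as the quoted lines 169-170: the 4 bytes after the marker.
h1 = h1_all{2}(1);
ndframe = double(typecast(head(h1 + 8 : h1 + 11), 'uint32'));
```

With the synthetic file, the 1000-byte read misses the marker and the 100000-byte read finds it, which mirrors the effect of a too-small hstep2.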

If you want to use .mat format, you can rename the .mat file according to the same naming rule listed in the readme file on the git repo page, but since usually we have only one .mat file containing all the data, you can just name it whatever you want as long as there is no hyphen in the name. The only variable in the .mat file should be frame_all: height X width X number of frames.

aledifil commented 5 years ago

I followed your suggestion and increased hstep2 to 100,000. That seemed to work, but I have to confess I'm not really sure why: why should the header appear later? I looked for an explanation and checked the file type, and noticed that while UCLA files have "grayscale" as imageType, mine have "indexed". I'm creating the video with MATLAB's VideoWriter and the "Grayscale AVI" profile, but apparently something is not working. Anyway, I suppose this may explain the difference.

I couldn't follow your suggestion about using the .mat format, because my file is too big to be analyzed all at once. I looked at the readme, but it is not clear whether a recording can be split into several files that are then concatenated, or what values those files should contain (I suppose grayscale values).

JinghaoLu commented 5 years ago

This is what I mean by different data structures: software can add whatever it wants to the header. In Matlab, you pay a price for reading in the data, so for header parsing it is reasonable to start from a smaller hstep2, which in your case is too small.

For .mat file, yes you can separate the files, but if you already have a single .mat file, with only variable "frame_all" of type single (or double), of course grayscale (height X width per frame instead of height X width X 3), you can also skip the data_cat step if following these:

1. put the .avi file and .mat file in the same folder;
2. name the .mat file: (name_of_avi)_frame_all.mat;
3. run the package from the beginning and input "n" when you see from command line 'Overwrite raw .mat file (data)? (y/n)'.
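A minimal sketch of writing such a .mat file, under stated assumptions: avi_name and out_dir are hypothetical placeholders, the frames are random numbers standing in for real data, and the '-v7.3' flag is chosen here only because older MAT-file versions cannot hold variables larger than 2 GB. Whether MIN1PIPE needs a particular MAT-file version is not stated in this thread.

```matlab
avi_name = 'session1';   % hypothetical base name of session1.avi (no hyphen in the name)
out_dir = tempdir;       % in practice, the folder that holds the .avi file

% Synthetic grayscale stack: height x width x number of frames, type single.
frame_all = rand(48, 64, 20, 'single');

% Naming convention from the steps above: (name_of_avi)_frame_all.mat
mat_path = fullfile(out_dir, [avi_name, '_frame_all.mat']);
save(mat_path, 'frame_all', '-v7.3');   % frame_all is the only variable saved

% Confirm what ended up in the file.
info = whos('-file', mat_path);
fprintf('%s: %s, class %s\n', info.name, mat2str(info.size), info.class);
```

Saving only the named variable keeps any other workspace variables out of the file, in line with the requirement that frame_all be the only variable.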

aledifil commented 5 years ago

Great! thanks for your support. Now everything seems to work even with my videos, I was able to see some cells just now. Anyway, I will remember the .mat approach, just in case.

I would like to send you a picture of the final summary to hear your opinion about a strange result (single MC score with 100 value), but I think this is not the place. Where can I reach you?

JinghaoLu commented 5 years ago

You can email me at min1pipe2018@gmail.com.

The issue you mentioned is due to insufficient features in two neighboring frames, on which the KLT tracker cannot perform reliably. In such cases an extreme value is assigned to the score, but usually this does not imply a failure of movement correction: it only happens during MC score estimation, and the frames may in fact have been movement corrected by log-demons.

ckemere commented 3 years ago

For future readers who might reach this issue: I wanted to generate a .mat file from my preprocessed data, and found that there's some stuff in data_info that changes the name of the file before sending it to data_cat. I believe it's looking for a date or some such, but it broke for me. If I just gave my .mat file a simple name (e.g., scope.mat), and it had the variable frames_all in it, things worked.