NeuralAnalysis / PyalData

Repository for the Python implementation of the TrialData analysis library.
GNU General Public License v3.0
7 stars 9 forks source link

[Question] Why is restrict_to_interval inclusive on both ends of the specified slice? #129

Open raeedcho opened 2 years ago

raeedcho commented 2 years ago

I'm a little bit confused as to why the restrict_to_interval function goes against Python convention to make the slice it pulls out inclusive on both the beginning and the end.

Currently, it's throwing an error for me when I try to slice from my idx_goCueTime to idx_endTime (which for my data structure points to the end of the trial--e.g. td.loc[1,'M1_spikes'].shape[0] returns 11359 and td.loc[1,'idx_endTime'] returns 11359). Right now, restrict_to_interval tells me that the idx_endTime is outside the trial range, unless I add a rel_end=-1 to the function arguments (which I can do, but doesn't really seem to make much sense to me).

Similarly, I always feel like I'm off by 1 on the number of bins I ask for when I use the function.

I'm sure there was a compelling reason to make it inclusive on the end--does anyone know what it was?

bagibence commented 2 years ago

I'm not sure when and why we made that decision, it might have just been my personal preference. You're right that in a way it goes against the convention, but it feels more natural to me that if I want an interval that stops just before a given time point, I indicate that with a -1. Regarding your example: I would assume idx_endtime to be a valid index into the underlying array, which is not true if it's the same as its length. (Are you adjusting the indices when loading the .mat file?) If it points to the last index, then including it when using restrict_to_interval feels right to me.

For future reference if we decide to change this: the +1 at the end is "hidden" in slice_around_index, slice_around_point, and slice_between_points.

raeedcho commented 2 years ago

I see--that logic does make sense to me in a way. Possibly the explanation should go in the docstring of the function, just to explain the rationale for going against convention?

I think the places where it becomes less intuitive for me is when I use rel_start and rel_end, e.g. when I write

pyaldata.restrict_to_interval(...,start_point_name='idx_goCueTime',rel_start=-40,rel_end=40)

Intuitively, I would kind of expect that to give me 80 bins, but possibly I just need to get used to counting the extra bin it gives me.

As for my previous example, you make a good point--idx_endTime should refer to a valid index in the array, and I'm not entirely sure why it doesn't... I definitely shift the indices back (using True as the second argument of mat2dataframe). I'll have to investigate more.