MaimonLab / SiffPy

Code for fast analysis of ScanImage .tiffs and .siffs
GNU General Public License v3.0

What's the most efficient way to get metadata, e.g. recording start and stop, from a file? #13

Open MaximilianHoffmann opened 3 months ago

MaximilianHoffmann commented 3 months ago

Right now I am getting the recording start and end like:

sr = SiffReader(os.path.join(p, f))
try:
    sr.time_lims = [datetime.fromtimestamp(float(x) / 1e9) for x in list(sr.get_time([sr.all_frames[0], sr.all_frames[-1]], 'epoch'))]
    ...

but it's kind of slow. Is there a faster way?

MaximilianHoffmann commented 3 months ago

Actually... I think it's alright. It seems there was again an issue with a file transfer locking up some files.

%%timeit
p='/mnt/fast/Data_Imaging/img/20240626_fly2_1.siff'
sr=SiffReader(p)
sr.time_lims=[datetime.fromtimestamp(float(x)/1e9) for x in list(sr.get_time([sr.all_frames[0],sr.all_frames[-1]],'epoch'))]
760 ms ± 9.16 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

It's not super fast, though: this is a 50 GB file on a local SSD, and if you had 100 files like this it would take a substantial amount of time, right?

StephenThornquist commented 3 months ago

Hmm, I think that's the fastest way with what's currently implemented in SiffPy. The slowest part is opening the file, so maybe we could consider a quick scan that does less importing and only finds the first and last timestamps. The problem is that you have to scan each IFD sequentially to find out where the end is. Here are some lazy benchmarks, from a file I happened to have open, to show where the latencies come from. Maybe I can memoize the all_frames property to speed things up too.
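To illustrate the memoization idea, here is a minimal sketch using the standard library's functools.cached_property. FrameIndex, _scan_frames and scan_count are hypothetical stand-ins invented for this example; they are not SiffPy's SiffReader or its internals, and how all_frames is actually computed there is not shown in this thread.

```python
from functools import cached_property
from typing import List


class FrameIndex:
    """Hypothetical stand-in for a reader object (not SiffPy's SiffReader)."""

    def __init__(self, n_frames: int) -> None:
        self._n_frames = n_frames
        self.scan_count = 0  # counts how often the expensive scan runs

    def _scan_frames(self) -> List[int]:
        # Placeholder for an expensive pass over the file.
        self.scan_count += 1
        return list(range(self._n_frames))

    @cached_property
    def all_frames(self) -> List[int]:
        # Computed on first access, then stored on the instance,
        # so later accesses do not repeat the scan.
        return self._scan_frames()


idx = FrameIndex(5)
first = idx.all_frames[0]
last = idx.all_frames[-1]  # second access reuses the cached list
```

With this pattern, accessing all_frames twice (as in the snippet from the question) triggers only one scan. If the underlying file can change, `del idx.all_frames` clears the cached value so that the next access recomputes it.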

[Image: Screenshot 2024-07-04 at 12 54 05 PM]
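The sequential IFD scan described above could look roughly like the sketch below. It assumes the file follows the standard TIFF or BigTIFF layout, where each IFD ends with the offset of the next one; whether .siff files match this layout exactly is an assumption, and iter_ifd_offsets and first_and_last_ifd are hypothetical names, not part of SiffPy. Where the timestamps sit inside each frame's metadata is not described in this thread, so the sketch stops at locating the first and last IFD.

```python
import struct
from typing import BinaryIO, Iterator, Optional, Tuple


def iter_ifd_offsets(fh: BinaryIO) -> Iterator[int]:
    """Yield the byte offset of each IFD in a TIFF or BigTIFF stream."""
    fh.seek(0)
    header = fh.read(16)
    if header[:2] == b"II":
        bo = "<"  # little-endian
    elif header[:2] == b"MM":
        bo = ">"  # big-endian
    else:
        raise ValueError("not a TIFF-style file")
    (version,) = struct.unpack(bo + "H", header[2:4])
    if version == 42:  # classic TIFF: 2-byte count, 12-byte entries, 4-byte offsets
        count_fmt, entry_size, offset_fmt = "H", 12, "I"
        (offset,) = struct.unpack(bo + "I", header[4:8])
    elif version == 43:  # BigTIFF: 8-byte count, 20-byte entries, 8-byte offsets
        count_fmt, entry_size, offset_fmt = "Q", 20, "Q"
        (offset,) = struct.unpack(bo + "Q", header[8:16])
    else:
        raise ValueError(f"unexpected TIFF version {version}")
    count_size = struct.calcsize(bo + count_fmt)
    offset_size = struct.calcsize(bo + offset_fmt)
    while offset != 0:  # a next-IFD offset of 0 marks the end of the chain
        yield offset
        fh.seek(offset)
        (n_entries,) = struct.unpack(bo + count_fmt, fh.read(count_size))
        fh.seek(n_entries * entry_size, 1)  # skip the entries without parsing them
        (offset,) = struct.unpack(bo + offset_fmt, fh.read(offset_size))


def first_and_last_ifd(fh: BinaryIO) -> Tuple[Optional[int], Optional[int]]:
    """Return the offsets of the first and last IFD (None, None if there are none)."""
    first = last = None
    for off in iter_ifd_offsets(fh):
        if first is None:
            first = off
        last = off
    return first, last
```

Usage would be along the lines of `with open(path, "rb") as fh: first, last = first_and_last_ifd(fh)`. The walk still touches every IFD, so its cost grows with the number of frames, but it reads only a few bytes per IFD and never loads image data.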