cositools / cosipy

The COSI high-level data analysis tools
Apache License 2.0

Update FullDetectorResponse.py for reading rsp file #69

Closed GallegoSav closed 8 months ago

GallegoSav commented 1 year ago

I modified the open class method to check whether the response filename ends in .h5 or .rsp. It then calls either _open_h5 (the method currently in use) or _open_rsp, which reads the rsp file and creates a .h5 file with the correct structure.
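A minimal sketch of the extension check described above, not the code from this PR. The helper name response_format is hypothetical, and ignoring a trailing .gz is an assumption based on the gzip.open call that appears in the profile of _open_rsp.

```python
from pathlib import Path


def response_format(filename):
    """Classify a response file as 'h5' or 'rsp' from its name (hypothetical helper)."""
    suffixes = [s.lower() for s in Path(filename).suffixes]

    # Assumption: a compressed file such as "x.rsp.gz" is classified by the
    # suffix that precedes ".gz".
    if suffixes and suffixes[-1] == ".gz":
        suffixes = suffixes[:-1]

    if suffixes and suffixes[-1] == ".h5":
        return "h5"
    if suffixes and suffixes[-1] == ".rsp":
        return "rsp"

    raise ValueError(f"Unsupported response file extension: {filename}")


# The caller would then dispatch to _open_h5 or _open_rsp based on the result.
kind = response_format("PowerlawContinuum_index2.Medium.binnedimaging.imagingresponse.rsp")
```

Raising on an unknown extension, rather than falling back to one reader, keeps a mistyped filename from reaching the slow conversion path.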

israelmcmc commented 1 year ago

@GallegoSav Do you have an example of how to run this? Maybe your copy of the DetectorResponse.ipynb that you used to open PowerlawContinuum_index2.Medium.binnedimaging.imagingresponse.rsp?

israelmcmc commented 10 months ago

I know that we're still working on this, but I ran the line profiler to find out what takes so long. 85% of the time is spent on these lines, as expected:

File: /Users/imartin5/software/cosipy/cosipy/response/FullDetectorResponse.py
Function: _open_rsp at line 128

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   259                                                   # read the rsp file and get the bin number and counts
   260         1     441000.0 441000.0      0.0          with gzip.open(filename, "rt") as file:
   261                                                        
   262         1    3637000.0 3637000.0      0.0              progress_bar = tqdm(file, total=nlines, desc="Progress", unit="line")
   263                                           
   264  26362000 45771167000.0   1736.3     15.7              for line in progress_bar:
   265                                           
   266                                                           
   267  26362000 15865711000.0    601.8      5.4                  line = line.split()
   268                                           
   269  26361990 12038971000.0    456.7      4.1                  if len(line) == 0:
   270        10       1000.0    100.0      0.0                      continue
   271                                           
   272  26361990 10202978000.0    387.0      3.5                  key = line[0]
   273                                           
   274  26361933 9335179000.0    354.1      3.2                  if key == 'RD':
   275                                           
   276  26361933 53721280000.0   2037.8     18.4                      b = np.array(line[1:-1], dtype=int)
   277  26361933 14832714000.0    562.7      5.1                      c = int(line[-1])
   278                                           
   279  26361933 35410388000.0   1343.2     12.1                      coords[:, sbin] = b
   280  26361933 12380386000.0    469.6      4.2                      data[sbin] = c
   281                                           
   282  26361933 10806356000.0    409.9      3.7                      sbin += 1
   283                                                               
   284  26361990 41117343000.0   1559.7     14.1                  progress_bar.update(1)

However, note that the progress bar takes about 1/3 of the time.
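One way to cut that overhead is to report progress once every N lines instead of once per line. The following is a standard-library-only sketch under stated assumptions: the function name read_rd_lines and the report_every and on_progress parameters are hypothetical, and plain lists stand in for the preallocated NumPy arrays used in _open_rsp. Note also that iterating over a tqdm object already advances the bar, so the explicit update(1) in the profiled loop is likely redundant.

```python
import io


def read_rd_lines(lines, report_every=100_000, on_progress=None):
    """Collect bin indices and counts from 'RD' lines (hypothetical helper).

    Each RD line is assumed to look like: RD <bin> <bin> ... <counts>,
    matching the line[1:-1] / line[-1] split in the profiled loop.
    """
    coords = []
    counts = []

    for nread, line in enumerate(lines, start=1):
        fields = line.split()

        # Skip empty lines and anything that is not an RD record.
        if fields and fields[0] == "RD":
            coords.append([int(x) for x in fields[1:-1]])
            counts.append(int(fields[-1]))

        # Report progress only every `report_every` lines, not on each line.
        if on_progress is not None and nread % report_every == 0:
            on_progress(nread)

    return coords, counts


# Small in-memory example standing in for the gzip-compressed rsp file.
sample = io.StringIO("# header\n\nRD 0 1 2 5\nRD 3 4 5 7\n")
reports = []
coords, counts = read_rd_lines(sample, report_every=2, on_progress=reports.append)
```

The callback could just as well advance a progress bar by report_every steps at a time, which keeps the visual feedback while paying the update cost far less often.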

GallegoSav commented 9 months ago

Thanks for the profiling study @israelmcmc! It was indeed expected that this part would be slow, but I did not expect the progress bar to take 1/3 of the time. I added it just to see the reading progress, but if it takes that much time I may replace it with a plain iteration counter. Concerning the RAM, however, the critical part is in the lines dr = Histogram(axes, contents=data) and dr_area = dr * dr.expand_dims(counts2area, 'Ei').

israelmcmc commented 8 months ago

Thanks @GallegoSav. I think this is working, so I'll merge it so people can start using it during the workshop.

My main comment is to use the logging library instead of print() statements, but we can resolve that later. I'll open an issue about it.
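As a rough illustration of that suggestion, here is a minimal sketch using the standard logging module. The logger name, the report_conversion function, and the message text are all hypothetical and are not taken from the PR. The handler writing to a StringIO is only there to make the example self-contained.

```python
import io
import logging

# Hypothetical logger name; a module would normally use logging.getLogger(__name__).
logger = logging.getLogger("response_reader_example")


def report_conversion(rsp_name, h5_name):
    """Log a status message where a print() call might otherwise be used."""
    # Lazy %-style arguments: the string is only formatted if the record is emitted.
    logger.info("Converting %s to %s", rsp_name, h5_name)


# Example configuration: send records to an in-memory stream.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

report_conversion("example.rsp", "example.h5")
log_text = stream.getvalue()
```

Unlike print(), this lets the user raise the level to WARNING to silence status messages, or redirect them to a file, without touching the library code.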