UCBerkeleySETI / blimpy

Breakthrough Listen I/O Methods for Python
https://blimpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

MeerKAT filterbank cannot be processed by this function #201

Closed ZaynAmell closed 3 years ago

ZaynAmell commented 3 years ago

When running the FindDoppler() function on MeerKAT filterbank files, the following function raises an error:

https://github.com/UCBerkeleySETI/blimpy/blob/27cf97601bbe744c9efb3da24e1bfeb874b3295a/blimpy/io/base_reader.py#L232


ValueError                                Traceback (most recent call last)
<ipython-input-7-3cab86e7a19e> in <module>
----> 1 fdop = FindDoppler('/datax/scratch/jzhang/20200917_guppi_59143_54557_000456_J1939-6342_offset_0001.rawspec.0000.fil')
      2 fdop.search()

~/.local/lib/python3.7/site-packages/turbo_seti/find_doppler/find_doppler.py in __init__(self, datafile, max_drift, min_drift, snr, out_dir, coarse_chans, obs_info, flagging, n_coarse_chan, kernels, gpu_backend, precision, append_output, log_level_int)
     97                                       n_coarse_chan=n_coarse_chan,
     98                                       coarse_chans=coarse_chans,
---> 99                                       kernels=self.kernels)
    100         if (self.data_handle is None) or (self.data_handle.status is False):
    101             raise IOError("File error, aborting...")

~/.local/lib/python3.7/site-packages/turbo_seti/find_doppler/data_handler.py in __init__(self, filename, out_dir, n_coarse_chan, coarse_chans, kernels, gpu_backend, precision)
     81 
     82             # Split the file
---> 83             self.data_list = self.__split_h5()
     84             self.status = True
     85 

~/.local/lib/python3.7/site-packages/turbo_seti/find_doppler/data_handler.py in __split_h5(self)
    147             n_coarse_chan = fil_file.header['n_coarse_chan']
    148         else:
--> 149             n_coarse_chan = int(fil_file.calc_n_coarse_chan())
    150 
    151         # Only load coarse chans of interest -- or do all if not specified

/opt/conda/lib/python3.7/site-packages/blimpy/waterfall.py in calc_n_coarse_chan(self, chan_bw)
    272         """
    273 
--> 274         n_coarse_chan = self.container.calc_n_coarse_chan(chan_bw)
    275 
    276         return n_coarse_chan

/opt/conda/lib/python3.7/site-packages/blimpy/io/base_reader.py in calc_n_coarse_chan(self, chan_bw)
    281             errmsg2 = "In turbo_seti, you can specify n_course_chan explicitly."
    282             logger.error(errmsg2)
--> 283             raise ValueError(errmsg1)
    284 
    285     def calc_n_blobs(self, blob_dim):

ValueError: blimpy:io:base_reader:calc_n_coarse_chan: not hires and not GBT.

This function needs to be modified to handle MeerKAT files.

david-macmahon commented 3 years ago

I guess it is possible to specify the value you want turbo_seti to use for n_coarse_chan (as errmsg2 from line 281 shows). You just need to figure out: A) how to actually specify n_coarse_chan to turbo_seti and B) what value to actually use for the files you are working with.

texadactyl commented 3 years ago

Waterfall.info():

--- File Info ---
      machine_id :                               20
    telescope_id :                               -1
         src_raj :                      19:41:17.65
         src_dej :                      -57:40:05.2
        az_start :                              0.0
        za_start :                              0.0
       data_type :                                1
            fch1 :                     1511.375 MHz
            foff :                  0.208984375 MHz
          nchans :                             3712
          nbeams :                                1
           ibeam :                               -1
           nbits :                               32
   tstart (ISOT) :          2020-10-21T15:09:17.000
    tstart (MJD) :                59143.63144675926
           tsamp :              0.00489988785046729
            nifs :                                1
     source_name :                J1939-6342_offset
     rawdatafile : guppi_59143_54557_000456_J1939-6342_offset_0001.0000.raw

Num ints in file :                             2032
      File shape :                  (2032, 1, 3712)
--- Selection Info ---
Data selection shape :                  (2032, 1, 3712)
Minimum freq (MHz) :                         1511.375
Maximum freq (MHz) :                   2286.916015625
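
As a quick sanity check on the header above, the reported frequency limits are consistent with the first channel sitting at fch1 and the last at fch1 + (nchans - 1) * foff. A minimal sketch using only the values printed by Waterfall.info() (the variable names are illustrative, not blimpy attributes):

```python
# Values copied from the Waterfall.info() output above.
fch1 = 1511.375       # MHz, frequency of the first channel
foff = 0.208984375    # MHz, channel width (positive, so frequency ascends)
nchans = 3712         # number of fine channels in the file

f_min = fch1                          # first channel
f_max = fch1 + (nchans - 1) * foff    # last channel
total_bw = nchans * foff              # bandwidth spanned by all channels

print(f_min, f_max, total_bw)
```

With only 3712 channels in total, this file is far below the roughly one million fine channels per coarse channel mentioned later in this thread.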

texadactyl commented 3 years ago

I believe that telescope ID = -1 is invalid. What is the correct telescope ID for MeerKAT?

Currently, the function blimpy:io:base_reader:calc_n_coarse_chan() is quite limited and out of date. I have flagged this twice in the past as needing review by a radio astronomer.

I agree with @david-macmahon: just pass the number of coarse channels to FindDoppler(n_coarse_chan=N, ...) to bypass this.
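
A minimal sketch of that workaround, based on the FindDoppler signature shown in the traceback above. The wrapper name run_search and its sanity check are illustrative and not part of turbo_seti. The right value of N for a given MeerKAT file still has to come from someone who knows how the data were channelized.

```python
def run_search(fil_path, n_coarse_chan):
    """Run a turbo_seti Doppler search with an explicit coarse channel count.

    Hypothetical helper: per the comments above, supplying n_coarse_chan
    should avoid the call to blimpy's calc_n_coarse_chan() heuristic.
    """
    if not isinstance(n_coarse_chan, int) or n_coarse_chan < 1:
        raise ValueError("n_coarse_chan must be a positive integer")

    # Imported lazily so the argument check works without turbo_seti installed.
    from turbo_seti.find_doppler.find_doppler import FindDoppler

    fdop = FindDoppler(fil_path, n_coarse_chan=n_coarse_chan)
    fdop.search()
    return fdop
```

Usage would look like run_search('/path/to/file.fil', n_coarse_chan=N), with the path and N supplied by the user.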

telegraphic commented 3 years ago

Unfortunately, the concept of 'coarse channels' is not supported by the filterbank file format. The format was set in stone several decades before MeerKAT existed, when an integer ID for a telescope (with no official database mapping IDs to telescopes) seemed like a good idea to save space...

I am sure someone doing pulsars with MeerKAT has assigned a numerical ID for filterbank files. We should try and match. I'll delegate this fact-finding mission to @danielczech or @david-macmahon, with a todo:

I agree, @texadactyl, that the heuristics in calc_n_coarse_chan are rubbish. I can think of two passable options: 1) Default to 1 coarse channel and print a warning. 2) Default to as close as possible to ~1M fine channels per coarse channel, calculate, tell the user, and raise a warning that it's pretty much arbitrary.
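
The two options could be sketched roughly as follows. This is not blimpy code: the function names are hypothetical, and "~1M" is assumed here to mean 2**20 fine channels.

```python
import logging

logger = logging.getLogger(__name__)

# Assumption: "~1M fine channels per coarse channel" is taken as 2**20.
FINE_PER_COARSE = 2**20


def fallback_one(nchans):
    """Option 1: assume a single coarse channel and warn."""
    logger.warning(
        "Cannot infer n_coarse_chan for nchans=%d; defaulting to 1.", nchans)
    return 1


def fallback_nearest_1m(nchans):
    """Option 2: choose the count giving roughly 2**20 fine channels each."""
    # Round to the nearest whole number of coarse channels, never below 1.
    n_coarse_chan = max(1, round(nchans / FINE_PER_COARSE))
    logger.warning(
        "Guessed n_coarse_chan=%d (about %d fine channels each); "
        "this choice is arbitrary.",
        n_coarse_chan, nchans // n_coarse_chan)
    return n_coarse_chan
```

For the 3712-channel MeerKAT file in this issue, both sketches return 1. They only differ for files with well over a million fine channels.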

texadactyl commented 3 years ago

Note:

Closing this one and opening an issue in turbo_seti.

texadactyl commented 3 years ago

@telegraphic Agreed to "Default to 1 coarse channel and print a warning". Updating blimpy io/base_reader.py.