CHIMEFRB / fitburst

An open-source package of utilities for direct modeling of radio dynamic spectra.
https://chimefrb.github.io/fitburst/
MIT License
11 stars 2 forks source link

Extracting width per frequency channel (VLBI) #64

Open tcassanelli opened 2 years ago

tcassanelli commented 2 years ago

The purpose of this feature is to extract the width per frequency channel, giving an start and stop timestamp. Then this information will be used within the VLBI pipeline and applied to the outrigger sites. Applying this custom mask to each data set will reduce the noise and improve the correlation strength. The idea is to run fitburst instantly once an event has been triggered and received by the outrigger sites, optimize DM and share these set of parameters to the outriggers sites (or perhaps a common site?). Then apply mask and correlate.

SebastianManosalva commented 1 year ago

Right now I am using the crab pulsar to test fitburst. The code for the individual plots is: `import matplotlib.pyplot as plt %matplotlib inline

plt.imshow( spectrum_downsampled[idx_good_freq], origin = "lower", aspect="auto", interpolation="nearest", extent=[min_time, max_time, min_freq, max_freq]) plt.xlabel("Time [ms]") plt.ylabel("frequency [MHz]")`

The original signal: crab pulse original

The bestfit model: bestfit model crab pulse

SebastianManosalva commented 1 year ago

To extract the initial time from the h5 file I am using:

`from astropy.time import Time data = BBData.from_file(path) data_time = data["time0"]

start_time = Time( val=data_time[:]['ctime'], val2=data_time[:]['ctime_offset'], format="unix", precision=9 )`

For the final time an I_cut value is used to know where the pulse ends.

`from sklearn import preprocessing import numpy as np

fit_copy = np.copy(bestfit) #where bestfit is the model obtained from fitburst

normalize

fit_copy = preprocessing.normalize(fit_copy)

I_cut = 1e-4 #this value was used as an example pulse_time =[] pulso = np.greater(fit_copy, I_cut) final_time = [] for j in range(bestfit.shape[0]):

for i in range(len(bestfit[j])):
    if pulso[j][i] == True:
        pulse_time.append(times_windowed[i])
final_time.append(pulse_time[-1])

now with the start and final time the width can be obtained `

SebastianManosalva commented 1 year ago

Here I show the functions that allow us to calculate the width, start time of the pulse and final time of the pulse per frequency channel:

`def t_dd(DM, t_0, freq_0, freq_1): k = 4.148e3 t_delta = k DM (1/(freq_0)-1/(freq_1)) t_1 = t_delta + t_0 return t_1

def get_initial_and_final_time(bestfit, fname, path, ratio): """ input:

def get_start_time(h5_file, initial_time, final_time, DM): """ input:

def get_widths(bestfit, fname, path, ratio, h5_file, DM): initial_time, final_time = get_initial_and_final_time(bestfit, fname, path, ratio) width, startTime, finishTime = get_start_time(h5_file, initial_time, final_time, DM)

#maybe I can save some files in this function or some plots
return width, startTime, finishTime`
SebastianManosalva commented 1 year ago

Here is how I use the get_widths function:

`import numpy as np import h5py import matplotlib.pyplot as plt %matplotlib inline import h5py from astropy.time import Time from astropy import units as apu from sklearn import preprocessing from fitburst.backend.generic import DataReader

fit = np.load("bestfit_fitburst_crabpulse.npz")

fit.files

bestfit = fit["arr_0"] fname = "fitburst_crabpulse.npz" path = "/home/smanosal/documentos" ratio = 5*10e-3 h5_file = "/data/user-data/cassanelli/Cs/chime/singlebeam_172516423_gain_20210603T130623.h5" DM = 56.755

width, startTime, finishTime = get_widths(bestfit, fname, path, ratio, h5_file, DM)`

SebastianManosalva commented 1 year ago

And this is a plot using the initial_time and final_time from get_initial_and_final_time

Width crab pulse

tcassanelli commented 1 year ago

@SebastianManosalva good result. First thing is, I'd like to know the start_time of the file and the number of frames for the position of the red and orange lines, if we have that then we are good. Can you test a bit more with the algorithm to select more of the pulse at lower frequencies? I see that I good portion of the signal is being lost at the end. Also can you plot those lines in the actual data? I think that will tell us whether your method is working or not.

SebastianManosalva commented 1 year ago

Here is what I have done since the last update: I created a function that encapsulates all the processes of fitburst and my own code, the function receive the .npz file, a ratio and a save path. It returns the right and left limits of the pulse, the start_time per frequency channel (astropy.Time), the DM and the TOA. But I am not sure if the TOA 's are correct, because they are far away from the pulse: image In green is the TOA's and orange/red are the limits. I am looking what could be the issue.

tcassanelli commented 1 year ago

hmm I doubt that all the top section was taken out. I think it is a plotting error, since those missing channels should be spread out across the entire band and not just at the top!

SebastianManosalva commented 1 year ago

The first funtion of make_input is called cut_profile it returns the frequency channels with the name freq and this frequencies goes from 400.390625 to 656.25, the other channels are then filled with zeros. So it isn't a plotting error.

SebastianManosalva commented 1 year ago

I had been testing the code with a new event singlebeam_173568547 and with different arrival times:

  1. First the TOA that I get from make_input image
  2. Second the average time between the two limits: image
  3. Third the TOA from baseband_analysis/analysis/toa/get_TOA image
  4. Fourth the fixed TOA from run_fitburst (this one has two values) image image

As I confirmed with Ziggy and Ketan the arrival time is always calculated for the 400.1953125 MHz band and between all the arrival times the last one looks really good! But the code changed just a bit compared to the last time, so maybe the singlebeam_173098290 fit had something strange ...

SebastianManosalva commented 1 year ago

I have been trying different codes to get make_input to work as intended. The next function improves the data (coherent and incoherent dedispersion) and refines the DM.

` from baseband_analysis.core.bbdata import BBData from baseband_analysis.core.signal import _get_profile, get_floor, get_main_peak_lim, tiedbeam_baseband_to_power from baseband_analysis.analysis.snr import get_snr from fitburst.routines.profile import get_signal, count_components

def simplified_cut_profile(path, DM, t_res, DM_range = 1, w = None, downsample = 1, refine_RFI = True, fill_missing_time = None, fill_nan = True, spectrum_lim = False, diagnostic_plots = False, valid_channels=None):

data = BBData.from_file(path)
#This function gets the power data in case the original data doesn't have it.
tiedbeam_baseband_to_power(data=data, dm=DM, time_downsample_factor=1, dedisperse=True, time_shift=False) 
#Optimize the DM, incoherent dedisp
freq_id, freq, power, offset, weight, valid_channels, time_range, DM, downsampling_factor = get_snr(data=data, DM=DM, diagnostic_plots=diagnostic_plots, w=w, valid_channels=valid_channels, spectrum_lim=spectrum_lim, DM_range=DM_range, downsample=downsample, refine_RFI=refine_RFI, fill_missing_time=fill_missing_time, fill_nan=fill_nan,return_full=True)
# get the two dimensional profile and cut the data 
profile = _get_profile(power)
start, end = get_signal(profile, ds = downsampling_factor)
# Find peaks in the profile
peaks = count_components(profile, t_res, ds = downsampling_factor, diagnostic_plots=diagnostic_plots)
peak_times = peaks * t_res
DM_err = 0.05/2

return data, freq_id, freq, power[...,start:end], valid_channels, DM, DM_err, downsampling_factor, profile, t_res, peak_times, start, power

`

tcassanelli commented 1 year ago

ok, can you please post and update with a figure as well.