Closed mef51 closed 3 years ago
As a workaround I've been extracting data from the pdf figure source that is on the arxiv listing with some success but this is not ideal..
Hi! Thank you for catching those typos! I will fix them in the README.
The bursts are too dim to see in individual narrow channels, that are 400/16,384 MHz wide. In the paper we downsampled the data in frequency by a factor 256 to only 64 frequency channels with 400/64 MHz bandwidth each so that the bursts are visible by eye in the dynamic spectra. We uploaded the waterfalls at the full resolution so that you can decide yourself what resolution works best for the analysis that you are interested in. I hope that helps!
-- Ziggy
Hi Ziggy, thanks! I had tried downsampling but I realized after your message I was just decimating, so I tried it again and got it to work. Would it be possible to add an example of the downsampling to the README? I had to do some other minor gymnastics to handle the masked values when downsampling so I'll share what I did in case someone else has similar trouble:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colors as mcolors
import os
import skimage.measure
folder = 'data/CHIME_180916.J0158+65'
burstfiles = [s for s in os.listdir(folder) if '16k' in s]
plt.figure(figsize=(24,24))
cmap = plt.get_cmap('gray_r')
cmap.set_bad(color = 'w', alpha = 1.)
for fname in burstfiles:
data = np.load("{}/{}".format(folder, fname))
wfall = data["wfall"]
downsample = 256
# nan_to_num before downsampling. note that this produces some streaks in the bursts..
wfall_down = skimage.measure.block_reduce(np.nan_to_num(wfall), block_size=(downsample, 1), func=np.mean)
wfall_masked = np.ma.masked_where(wfall_down == 0, wfall_down)
# print(wfall_down.shape, wfall.shape)
dt_s = data["dt_s"]
center_freq_mhz = data["center_freq_mhz"]
df_mhz = center_freq_mhz[1] - center_freq_mhz[0]
# print(dt_s, center_freq_mhz[0], center_freq_mhz[-1], df_mhz)
plt.subplot(5,5,burstfiles.index(fname)+1)
plt.title(fname)
plt.imshow(wfall_masked, origin="lower", aspect="auto", interpolation="nearest", cmap=cmap,
extent=(0, 1000*dt_s*wfall.shape[1], center_freq_mhz[0]-df_mhz/2., center_freq_mhz[-1]+df_mhz/2.))
plt.colorbar()
plt.xlabel("Time [ms]")
plt.ylabel("Frequency [MHz]")
plt.tight_layout()
which produces
@mef51
I would encourage you to add your script to the cfod
package here via a Pull Request.
Hello! Sorry to open this issue again, but I'm having trouble seeing the bursts from FRB180814.J0422+73 (from "A Second Repeating Fast Radio Burst", Figure 2), even after downsampling and using the weights read from cfod
.
I've tried saturating the intensity scale, zooming in the time-axis, averaging the beams, etc..
This is what I'm seeing:
import numpy as np
import matplotlib.pyplot as plt
import os
import skimage.measure
folder = 'data/CHIME_FRB180814.J0422+73'
burstdirs = ['burst180814', 'burst180911', 'burst180919']
for burst in burstdirs:
burstfiles = [s for s in os.listdir(folder +'/'+ burst) if '.npy' in s and 'weights' not in s]
weightfiles = [s for s in os.listdir(folder +'/'+ burst) if '.npy' in s and 'weights' in s]
plt.figure(figsize=(20,24))
plt.suptitle(burst)
cmap = plt.get_cmap('gray_r')
cmap.set_bad(color = 'w', alpha = 1.)
dataAveraged = []
for fname, wfname in zip(burstfiles, weightfiles):
try:
del data
except Exception:
pass
data = np.load("{}/{}/{}".format(folder, burst, fname))
weights = np.load("{}/{}/{}".format(folder, burst, wfname))
tl, tr = 425, 575
data = data[:, :]
weights = weights[:, :]
np.putmask(data, ~weights.astype(bool), np.nan)
data = data - 1*data[:, 0:25].mean(axis=1)[:,None]
downsample = 256
data = skimage.measure.block_reduce(data, block_size=(downsample, 1), func=np.nanmean)
if dataAveraged == []:
dataAveraged = data
else:
dataAveraged += data
plt.subplot(4, 4, burstfiles.index(fname)+1)
plt.title(burst + ": beam" + fname.split('.')[0].split('beam')[1])
plt.imshow(data[:,:], origin="lower", aspect="auto", interpolation="nearest", cmap=cmap)
plt.clim(-10, 10)
plt.colorbar()
plt.xlabel("Time [channel #]")
plt.ylabel("Frequency [channel #]")
plt.tight_layout()
if burst == 'burst180814': plt.savefig('bursts180814.png')
Thanks, Mohammed
Hi Mohammed, it looks to me like there are a lot of frequency channels that are contaminated by radio frequency interference (RFI) still in your image. Those noisy channels also seem to dominate the color scale. The CHIME telescope for example sees the LTE cellular network between about 730 and 755 MHz very prominently (roughly your channels 52--56) and there was evidently also a significant source of RFI at the time of the observation around your channels 8--11. Can you try to mask all of those channels? In general, you get the best results (i.e., you save the most data) if you do the masking at the full 16,384-channel resolution, before averaging over channels. I hope that helps! Best, -- Ziggy
Hi Ziggy, I've zapped a few more channels but this hasn't brought the burst forward. The data cutout spans 1 second, so I thought maybe I'm too zoomed out on the time axis to see it? But even zooming in on 128ms chunks of the data I can't find it, whether I look at the beams individually or a beam-averaged waterfall. The figure below is for the Aug. 14 burst but its the same problem with the other bursts. Ultimately I want to take autocorrelations of the bursts (like in Josephy et al. 2019 and Hessels et al. 2019), which will need a waterfall that has high enough SNR.. I think I'm out of ideas
Cheers, Moh
edit: i also have more noisy channels around 450MHz than shown in fig 1 of the paper, which is also where the burst lives
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colors as mcolors
import os
import skimage.measure
folder = 'data/CHIME_FRB180814.J0422+73'
# burstdirs = ['burst180814', 'burst180911', 'burst180917', 'burst180919']
# burstdirs = ['burst180814', 'burst180911', 'burst180919']
burstdirs = ['burst180814']
timestep = 128
res = { #(tres, fres)
'burst180814': (0.983, 16384/400),
'burst180911': (0.983, 16384/400),
'burst180917': (1.966, 16384/400),
'burst180919': (0.983, 16384/400),
}
for twindow in range(0, 7):
for burst in burstdirs:
burstfiles = [s for s in os.listdir(folder +'/'+ burst) if '.npy' in s and 'weights' not in s]
weightfiles = [s for s in os.listdir(folder +'/'+ burst) if '.npy' in s and 'weights' in s]
tl, tr = timestep*twindow, timestep*(twindow+1) - 1
plt.figure(figsize=(20,24))
plt.figure(figsize=(10,10))
#plt.suptitle(burst)
cmap = plt.get_cmap('gray_r')
cmap.set_bad(color = 'w', alpha = 1.)
dataAveraged = []
for fname, wfname in zip(burstfiles, weightfiles):
try:
del data
except Exception:
pass
data = np.load("{}/{}/{}".format(folder, burst, fname))
weights = np.load("{}/{}/{}".format(folder, burst, wfname))
#tl, tr = 425, 575
data = data[:, :]
weights = weights[:, :]
weights[13312:14592] = 0
weights[11776:12032] = 0
weights[2060:2850] = 0
weights[768:1024] = 0
np.putmask(data, ~weights.astype(bool), np.nan)
data = data - 1*data[:, 0:200].mean(axis=1)[:,None]
downsample = 256
data = skimage.measure.block_reduce(data, block_size=(downsample, 1), func=np.nanmean)
if dataAveraged == []:
dataAveraged = data
else:
dataAveraged += data
plt.subplot(4, 4, burstfiles.index(fname)+1)
plt.title(burst + ": beam" + fname.split('.')[0].split('beam')[1])
plt.imshow(data[:,:], origin="lower", aspect="auto", interpolation="nearest", cmap=cmap, extent=(0, res[burst][0], 400, 800))
plt.colorbar()
plt.xlabel("Time [s]")
plt.ylabel("Frequency [MHz]")
# Beam averaged
# plt.title(burst + " beam averaged," + " time chunk {}:{}".format(tl, tr))
# plt.imshow(dataAveraged[:,tl:tr]/len(burstfiles), origin="lower", aspect="auto", interpolation="nearest", cmap=cmap, extent=(0, res[burst][0], 400, 800))
# plt.colorbar()
plt.tight_layout()
Hi Mohammed,
Sorry for not picking up on this earlier, but unlike the data from "Periodic activity from a fast radio burst source" the bursts from this paper (stored in the .msgpack files) are not yet dedispersed (they are at DM=0 pc cm-3) and every .msgpack file is a separate chunk of 1024 time samples. To get the full chunk that encompasses the bursts you need to open all of the files for that burst at once, e.g.:
import glob
from cfod import chime_intensity as ci
fns = glob.glob("*.msgpack")
intensity, weights, fpga0s, fpgaNs, binning, rfi_mask, frame0_nanos = ci.unpack_datafiles(fns)
Then you need to dedisperse the intensity data to the DM of the burst/source (ideally at full frequency resolution before averaging frequency channels). Only in the dedispersed dynamic spectrum will the burst stand out.
Nice that you are planning to run an autocorrelation on the bursts. What information are you ultimately trying to get out of them, if I may ask?
Best, -- Ziggy
Hi Ziggy, okay I see. So if I understand correctly the signal is dispersed over a 16 second interval? I was initially using unpack_datafiles
will all the files together but I was getting the MemoryError below so I started going one by one. If I can't get around this memory error somehow with cfod
then I'll try stacking them horizontally myself before dedispersing.
Traceback (most recent call last):
File "numpyifier.py", line 15, in <module>
intensity, weights, fpga0s, fpgaNs, binning, rfi_mask, frame0_nanos = ci.unpack_datafiles(fns)
File "b:\dev\chime-frb-open-data\cfod\chime_intensity.py", line 138, in unpack_datafiles
fn
File "b:\dev\chime-frb-open-data\cfod\chime_intensity.py", line 81, in unpack_data
intensity, weights = chunk.decode()
File "b:\dev\chime-frb-open-data\cfod\assemble_chunk.py", line 102, in decode
self.nt_per_packet, axis=1
MemoryError: Unable to allocate 64.0 MiB for an array with shape (16384, 1024) and data type float32
We were intrigued by the linear relationship between the burst drift rate and frequency that was shown in fig. 6 of Josephy et al. 2019 and have been exploring some ideas. Its easier to say something meaningful about the drift rates of bursts from repeater sources since the DM is better constrained, pulse trains are more common, etc..
Thanks, Moh
Hi Mohammed,
The DM delay from 800 to 400 MHz for DM=190 pc cm-3 (this source) is about 3.7 seconds, so my suspicion is that for all of the bursts from this source the full burst is in the last four msgpack files of the sets; so you could try reading in only the last four files for each burst. I hope that works!
I agree that the ~linear evolution of drift rate with frequency is very interesting! Feel free to email me if you want to discuss drift rates or drift rate measurements in more detail.
Best, -- Ziggy
Thanks for your help! I'm still getting that memory error but it must be something weird with my system. And thanks for the offer, I will probably take you up on that :)
cheers, Mohammed
Hi, I've downloaded the data for FRB 180916.J0158+65 from here but I'm unable to see the bursts as shown in Extended Data figure 1 of the paper following the code that is provided in the README of the data release. In addition the zapped channels due to RFI appear differently than those published.
Code provided by README:
There are some small typos in the code provided (
df
should bedf_mhz
, and the time axis extent should be0, dt_s*wfall.shape[1]
notwfall.shape[0]
) but with those fixed, the 25 waterfall plots i see are shown below. As you can see for example, Bursts 18 and 19, which are some of the more obvious bursts in the paper are completely obscured by zapped channels:For reference the code for the above figure is:
Any help would be appreciated, thanks.