gwastro / pycbc

Core package to analyze gravitational-wave data, find signals, and study their parameters. This package was used in the first direct detection of gravitational waves (GW150914), and is used in the ongoing analysis of LIGO/Virgo data.
http://pycbc.org
GNU General Public License v3.0

How do you input your own data into pycbc_inference? #4647

Open Tobias-Reike opened 4 months ago

Tobias-Reike commented 4 months ago

Hello, I want to run pycbc_inference on some events I generated myself with PyCBC. What do I need to put in the [data] section to load my own data? I want to compare the performance of my neural network with that of an MCMC, so the input to the NN and the MCMC must be exactly the same. I assume I need some combination of frame-files/hdf-store and channel-name options corresponding to my data. Currently I have a data.hdf file with a 'strain' dataset, but I can expand this if necessary. I can't find any clear directions or documentation explaining this.

ahnitz commented 4 months ago

@Tobias-Reike Save your data as a gwf file, which you can then give to pycbc_inference as frame-file input. You can use the pycbc.frame.write_frame function to write a frame file from a PyCBC TimeSeries (the data may need to be padded up to a second boundary).

When you write the frame file (see the help message of the function) you'll assign the channel name, etc.
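A minimal sketch of that suggestion, under stated assumptions: the strain is already in memory as a 1-D array, zero padding is an acceptable way to reach the second boundaries, and the start time lies on the sample grid. The helper names `second_boundary_padding` and `write_strain_frame` are hypothetical. Only `pycbc.types.TimeSeries` and `pycbc.frame.write_frame` come from PyCBC.

```python
import math


def second_boundary_padding(start_time, n_samples, sample_rate):
    """Return (n_before, n_after): the number of zero samples needed so that
    the series starts and ends on integer seconds.

    Assumes start_time lies on the sample grid (an integer number of samples
    after the preceding whole second).
    """
    end_time = start_time + n_samples / sample_rate
    n_before = round((start_time - math.floor(start_time)) * sample_rate)
    n_after = round((math.ceil(end_time) - end_time) * sample_rate)
    return n_before, n_after


def write_strain_frame(path, channel, strain, start_time, sample_rate):
    """Zero-pad `strain` to whole-second boundaries and write it to a gwf file."""
    # Imported lazily so the helper above can be used without PyCBC installed.
    import numpy as np
    from pycbc.frame import write_frame
    from pycbc.types import TimeSeries

    n_before, n_after = second_boundary_padding(
        start_time, len(strain), sample_rate)
    padded = np.concatenate([
        np.zeros(n_before),
        np.asarray(strain, dtype=np.float64),
        np.zeros(n_after),
    ])
    # The padded series now begins on the whole second before start_time.
    series = TimeSeries(padded, delta_t=1.0 / sample_rate,
                        epoch=math.floor(start_time))
    # The channel name given here is the one to repeat in the [data] section.
    write_frame(path, channel, series)
```

A call such as `write_strain_frame("H1-SIM.gwf", "H1:SIM_STRAIN", strain, start_time, 2048)` would then produce a file to reference from the config. The file name and channel name in that call are placeholders, not values from this thread.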

Tobias-Reike commented 4 months ago

@ahnitz Thanks for the answer. I've worked around this so far by just letting pycbc_inference generate similar events, but now I'm trying to implement your suggestion. As you pointed out, it only seems to start sampling if I pad my array with a lot of zeros on both sides. Why do I need to do this, and will it affect my samples if I do not include the padded part in my analysis start/end time or my PSD start/end time? For example, I have 9 s of data. I've set both my analysis start/end time and my PSD start/end time to -8 s/1 s, but my time series actually runs from -18 s to 11 s, with zeros outside the analysis time. This way pycbc_inference is generating samples (it hasn't finished yet), but it feels wrong to do this.
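One possible explanation, stated as an assumption and not confirmed in this thread: PyCBC's strain-loading options include `pad-data`, which defaults to 8 seconds and requests extra data on each side of the requested segment so that high-pass filter corruption can be discarded. If that is the cause, the frame file has to cover more than the analysis window. The sketch below estimates the required span. The function `required_data_span` is hypothetical, and treating the analysis and PSD windows as a single union is a simplification.

```python
def required_data_span(analysis_start, analysis_end,
                       psd_start=None, psd_end=None, pad_data=8):
    """Return (earliest, latest) times, relative to the trigger time, that
    the frame file would need to cover.

    Assumption: pad_data seconds are read on both sides of the union of the
    analysis window and the PSD window.
    """
    starts = [analysis_start]
    ends = [analysis_end]
    if psd_start is not None:
        starts.append(psd_start)
    if psd_end is not None:
        ends.append(psd_end)
    return min(starts) - pad_data, max(ends) + pad_data


# Numbers from the comment above: analysis and PSD windows of -8 s to 1 s.
span = required_data_span(-8, 1, psd_start=-8, psd_end=1)
print(span)  # (-16, 9)
```

Under this assumption, a series running from -18 s to 11 s covers the required -16 s to 9 s, which would be consistent with sampling starting only after the padding was added. Setting a smaller `pad-data` value in the [data] section should reduce the requirement, at the cost of less margin for filter transients at the segment edges.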