Closed · EthanMarx closed this issue 2 years ago
@alecgunny So I think I figured out the ASD issue. It looks like it's a precision issue when converting between arrays on the GPU and CPU (i.e., here in the dataloader, and then here in the WhiteningTransform).
As an example, the following:

```python
import h5py
import torch
from gwpy.timeseries import TimeSeries

# hanford_background, sample_rate, and fftlength are defined elsewhere
with h5py.File(hanford_background) as f:
    data = f["hoft"][:]

ts_numpy = TimeSeries(data, dt=1 / sample_rate)
asd_numpy = ts_numpy.asd(fftlength)

# round-tripping through torch.Tensor silently casts float64 -> float32
data_torch = torch.Tensor(data)
data_torch = data_torch.cpu().numpy()
ts_torch = TimeSeries(data_torch, dt=1 / sample_rate)
asd_torch = ts_torch.asd(fftlength)
```
will produce a valid ASD for `asd_numpy`, but the ASD given by `asd_torch` is similar to the ones we were seeing yesterday.
Replacing `data_torch = torch.Tensor(data)` with `data_torch = torch.tensor(data, dtype=torch.float64)` completely resolves the discrepancy.
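The underlying behavior: `torch.Tensor` (capital T) is an alias for the default tensor type, so it always produces `torch.float32` tensors and silently downcasts a `float64` NumPy array, while the `torch.tensor` factory infers the dtype from its input. A minimal sketch of the difference:

```python
import numpy as np
import torch

data = np.ones(4, dtype=np.float64)

# torch.Tensor always uses the default dtype (float32):
# the float64 input is silently downcast
downcast = torch.Tensor(data)
print(downcast.dtype)  # torch.float32

# torch.tensor infers the dtype from the input array
preserved = torch.tensor(data)
print(preserved.dtype)  # torch.float64

# the float32 round trip discards anything below ~1e-7 relative precision
x = 1.0 + 1e-10
print(np.float32(x) == np.float32(1.0))  # True: the 1e-10 offset is lost
```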
Oh wow, I guess that makes sense given the scale of the data. So we need to make sure the background torch tensors don't get converted to fp32 before we fit the waveform generator?
Yes, exactly. I think in general we should enforce `torch.float64` explicitly. Do `torch.Tensor` objects only ever use `torch.float32`?
No, you can specify the dtype when you initialize, but only via the lowercase factory: `tensor = torch.tensor(..., dtype=torch.float64)`. (The `torch.Tensor` constructor doesn't accept a `dtype` argument; it always uses the default dtype.)
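For completeness, a sketch of a few standard ways to end up with a `torch.float64` tensor. Note the caveat on the second one: casting back with `.double()` restores the dtype but not the precision already lost in the `torch.Tensor` downcast, so it wouldn't fix the ASD problem above.

```python
import numpy as np
import torch

data = np.random.randn(8)  # NumPy defaults to float64

# 1. explicit dtype at construction
a = torch.tensor(data, dtype=torch.float64)

# 2. cast an existing tensor back to float64
#    (dtype is restored, but precision was already lost in the float32 cast!)
b = torch.Tensor(data).double()

# 3. zero-copy view of the NumPy array; dtype is preserved
c = torch.from_numpy(data)

print(a.dtype, b.dtype, c.dtype)  # all torch.float64
```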
The data immediately before entering the model looks funky. The current hypothesis is that this is a result of the whitening step. Evaluate what is going wrong between data generation and passing the data through the model.
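One way to test that hypothesis is to isolate the dtype round trip from the model entirely. Below is a self-contained sketch with synthetic data and `scipy.signal.welch` standing in for the real background and the `gwpy` ASD call (the offset and noise scales are hypothetical, chosen only to illustrate the mechanism): fluctuations smaller than float32's ~1e-7 relative resolution around a large offset get quantized away, so the spectrum computed after the round trip no longer reflects the true noise.

```python
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(0)
fs = 2048  # illustrative sample rate, not from the issue

# synthetic stand-in for the background: tiny noise riding on a large offset
# (not real strain data; scales chosen so the noise sits below
# float32's resolution of ~1.2e-7 around 1.0)
data = 1.0 + 1e-8 * rng.standard_normal(16 * fs)

# PSD computed in full float64 precision
f, psd64 = welch(data, fs=fs, nperseg=fs)

# PSD after a float32 round trip: nearly every sample rounds to exactly 1.0,
# so the noise floor collapses
f, psd32 = welch(data.astype(np.float32).astype(np.float64), fs=fs, nperseg=fs)

print(np.median(psd64), np.median(psd32))
```

If the two spectra disagree like this, the culprit is the cast, not the whitening math itself.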