atong01 / conditional-flow-matching

TorchCFM: a Conditional Flow Matching library
https://arxiv.org/abs/2302.00482
MIT License

Generating time series #115

Open shyam1998 opened 1 month ago

shyam1998 commented 1 month ago

Hello, thanks for the wonderful package. I am trying to do time series anomaly detection using normalizing flows/flow matching. My idea is to use a sliding window to obtain sub-sequences of the time series, reconstruct each window, and then use the reconstruction error to detect anomalies (so it's unsupervised). But I am running into an issue trying to do this.
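For concreteness, this is roughly the pipeline I have in mind; the helpers below (`make_windows`, `anomaly_scores`) are just an illustrative sketch, not my actual code:

    import torch

    def make_windows(series, window=64, stride=1):
        # series: 1-D tensor of length T -> (num_windows, 1, window)
        windows = series.unfold(0, window, stride)  # (num_windows, window)
        return windows.unsqueeze(1)                 # add a channel dimension

    def anomaly_scores(x, x_rec):
        # Per-window MSE between the input and its reconstruction;
        # windows with a high score would be flagged as anomalous.
        return ((x - x_rec) ** 2).mean(dim=(1, 2))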

This is what I tried so far:

  1. I divided my time series into N windows of size 64

  2. Built a convolutional LSTM autoencoder that takes the input data, a timestep embedding (the sinusoidal embedding used in the example UNet model; a sketch of it appears after this list), and the positional information of each window (simply an integer representing its position), and reconstructs the input sample

  3. Initialized a ConditionalFlowMatcher with sigma = 0.0 (a quick sanity check of the resulting path appears after this list)

  4. Trained the model by minimizing the MSE between vt and ut:

    for epoch in range(20):
        for i, batch_data in enumerate(train_loader):
            optimizer.zero_grad()
            x1 = batch_data[0].unsqueeze(1).to(device)  # data windows, (B, 1, 64)
            y = batch_data[1].to(device)                # window position labels
            x0 = torch.randn_like(x1)                   # noise source
            t, xt, ut = FM.sample_location_and_conditional_flow(x0, x1)
            vt = model(t, xt, y)                        # predicted vector field
            # print(vt.shape, ut.shape)                 # shape debugging
            loss = torch.mean((vt - ut) ** 2)
            loss.backward()
            optimizer.step()
            print(f"epoch: {epoch}, steps: {i}, loss: {loss.item():.4}", end="\r")
  5. Used the ODE solver to generate trajectories for the first 100 windows:

    
    USE_TORCH_DIFFEQ = True
    batch_size = 100         # Adjust as needed
    time_series_length = 64  # Adjust to the length of your time series

    initial_state = torch.randn(batch_size, 1, time_series_length, device=device)
    generated_class_list = torch.arange(batch_size, device=device)

    with torch.no_grad():
        if USE_TORCH_DIFFEQ:
            traj = torchdiffeq.odeint(
                lambda t, x: model.forward(t, x, generated_class_list),
                initial_state,
                torch.linspace(0, 1, 2, device=device),
                atol=1e-4,
                rtol=1e-4,
                method="dopri5",
            )
        else:
            traj = node.trajectory(
                initial_state,
                t_span=torch.linspace(0, 1, 2, device=device),
            )
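For reference, the timestep embedding in step 2 is the standard transformer-style sinusoidal embedding; a minimal sketch of the variant I mean (details may differ slightly from the example UNet):

    import math
    import torch

    def timestep_embedding(t, dim):
        # Pairs of sin/cos features at geometrically spaced frequencies,
        # as in the transformer / diffusion-model literature.
        half = dim // 2
        freqs = torch.exp(
            -math.log(10000.0) * torch.arange(half, device=t.device).float() / half
        )
        args = t[:, None].float() * freqs[None, :]
        return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)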
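And to confirm what steps 3-4 are fitting: with sigma = 0, my understanding is that the basic ConditionalFlowMatcher samples the straight-line path between noise and data, so the sampled triple should satisfy xt = (1 - t) * x0 + t * x1 and ut = x1 - x0. A quick sanity check I could run (assuming that parametrization holds in the installed version):

    import torch
    from torchcfm.conditional_flow_matching import ConditionalFlowMatcher

    FM = ConditionalFlowMatcher(sigma=0.0)
    x0 = torch.randn(4, 1, 64)  # noise windows
    x1 = torch.randn(4, 1, 64)  # data windows
    t, xt, ut = FM.sample_location_and_conditional_flow(x0, x1)

    t_ = t.view(-1, 1, 1)  # broadcast t over (channel, time)
    assert torch.allclose(xt, (1 - t_) * x0 + t_ * x1, atol=1e-5)
    assert torch.allclose(ut, x1 - x0, atol=1e-5)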



The problem is that when I visualize the generated samples, they don't resemble the inputs at all. I'm not sure what I'm doing wrong; I know the problem isn't the autoencoder model itself, because I tested it separately. Any help is much appreciated.

`plt.plot(traj[-1, :batch_size, 0, :].cpu()[10])`
Reconstructed sample:
![Untitled](https://github.com/atong01/conditional-flow-matching/assets/26413539/5c99830a-355d-41ca-9b26-9696e628cc85)

Input sample:

![Untitled-1](https://github.com/atong01/conditional-flow-matching/assets/26413539/75ae1de3-a2f2-484c-9a2e-77a923833de1)