Why did you change:
t = torch.sigmoid(torch.randn((bs,), device=accelerator.device))
to:
t = torch.tensor([timesteps[random.randint(0, 999)]]).to(accelerator.device)
I found that in the first version, the sampled training timestep t is concentrated around 0.5, while in the second version t spans a much wider range.
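A minimal sketch for comparing the two sampling schemes follows. It is not the repository's training code: the real `timesteps` schedule is not shown in this issue, so a uniform grid on [0, 1] with 1000 entries is assumed here as a stand-in, and `summarize` is a hypothetical helper. Note also that the first version draws one t per sample in the batch, while the second draws a single t per step.

```python
import random

import torch

torch.manual_seed(0)
random.seed(0)

n = 10_000  # number of draws used to estimate each distribution

# Version 1: sigmoid of a standard normal draw (a logit-normal distribution).
t_sigmoid = torch.sigmoid(torch.randn((n,)))

# Version 2: pick a random index into a 1000-entry schedule.
# ASSUMPTION: a uniform grid stands in for the real `timesteps`,
# which may be shifted or otherwise non-uniform in the repository.
timesteps = torch.linspace(1.0, 0.0, 1000).tolist()
t_indexed = torch.tensor([timesteps[random.randint(0, 999)] for _ in range(n)])


def summarize(name, t):
    """Print mean/std and return the fraction of draws inside (0.25, 0.75)."""
    frac_mid = ((t > 0.25) & (t < 0.75)).float().mean().item()
    print(
        f"{name}: mean={t.mean().item():.3f} "
        f"std={t.std().item():.3f} frac in (0.25, 0.75)={frac_mid:.3f}"
    )
    return frac_mid


frac_sigmoid = summarize("sigmoid(randn)", t_sigmoid)
frac_indexed = summarize("timesteps[randint]", t_indexed)
```

Under these assumptions, the first scheme should put a larger share of draws in the middle of the interval and show a smaller standard deviation than the second, which matches the observation above.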
The change is in this commit: https://github.com/XLabs-AI/x-flux/commit/3139864620fb268eabfb3fe8f50141a963982840#diff-dcf224c4e9bbbf1401dce3c8338c64162c858fb21a9e8625e828637896adfcddR255