What happened?
Somewhere between commits d7a4e73 and 13dbd21, masked training for Flux stopped working: the run now fails as soon as it reaches the first training step (after the initial sampling). Reverting to d7a4e73, the last commit I was on before pulling from master, works fine.
I am using masked training with the probability and weight both set to zero.
What did you expect would happen?
Training works as per normal.
Relevant log output
Traceback (most recent call last):
  File "C:\OneTrainer\modules\ui\TrainUI.py", line 561, in __training_thread_function
    trainer.train()
  File "C:\OneTrainer\modules\trainer\GenericTrainer.py", line 676, in train
    loss = self.model_setup.calculate_loss(self.model, batch, model_output_data, self.config)
  File "C:\OneTrainer\modules\modelSetup\BaseFluxSetup.py", line 593, in calculate_loss
    return self._flow_matching_losses(
  File "C:\OneTrainer\modules\modelSetup\mixin\ModelSetupDiffusionLossMixin.py", line 311, in _flow_matching_losses
    losses = self.__masked_losses(batch, data, config)
  File "C:\OneTrainer\modules\modelSetup\mixin\ModelSetupDiffusionLossMixin.py", line 79, in __masked_losses
    losses += masked_losses(
  File "C:\OneTrainer\modules\util\loss\masked_loss.py", line 13, in masked_losses
    losses *= clamped_mask
RuntimeError: The size of tensor a (16) must match the size of tensor b (64) at non-singleton dimension 1
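For reference, the error itself is a plain tensor broadcast failure in the in-place multiply in masked_loss.py. The snippet below is a standalone sketch, not OneTrainer code; the shapes are assumptions chosen only to reproduce the same message. My guess is that the loss tensor and the mask end up with mismatched channel counts (16 vs 64, possibly unpacked vs 2x2-packed Flux latents), but I have not confirmed which side changed between the two commits.

# Standalone sketch reproducing the broadcast error above (assumed shapes).
import torch

losses = torch.randn(1, 16, 64, 64)       # e.g. per-element losses with 16 latent channels
clamped_mask = torch.ones(1, 64, 64, 64)  # e.g. a mask whose channel dim ended up as 64

losses *= clamped_mask  # RuntimeError: The size of tensor a (16) must match
                        # the size of tensor b (64) at non-singleton dimension 1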
Output of pip freeze
No response