TVAESynthesizer Model Details and Parameters

My first question is about the paper here. In Section 4.5 (TVAE model), you mention that the model outputs a joint distribution of $2N_c + N_d$ variables. The equation (attached below) also uses two variables, $\bar{\alpha}_{i,j}$ and $\hat{\alpha}_{i,j}$. Could you please explain these, as well as the combined distribution in the last line of the equation?
I have also noticed that I cannot change the activation function in TVAESynthesizer. Below are the model parameters I can change, as returned by synthesizer.get_parameters() (described in the SDV docs).
- Is there a reason for not allowing the activation function to be changed, and for using the ones mentioned in the paper?
- l2scale: the regularization term has a default value of 1e-5. Can you please explain the exact role of l2scale and how it affects the model?
- I see that loss_factor, applied to the reconstruction error, has a default value of 2. The total loss = reconstruction_loss + kl_loss. Does kl_loss also have a scaling factor, and how would it affect training and the total loss?
- synthesizer.get_loss_values() returns only the total loss. Is there a way to track reconstruction_loss and kl_loss separately?
- Why should batch_size always be a multiple of 10 rather than a value like 512 or 256, which are commonly used for training?
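
To make the loss_factor question concrete, here is a minimal sketch of how I understand the total loss to be composed. It is plain Python with made-up numbers, not SDV code. The function name total_loss and the parameter kl_weight are hypothetical, introduced only to illustrate what a KL scaling factor would do, and the assumption that loss_factor multiplies only the reconstruction term is my reading rather than something I have confirmed.

```python
def total_loss(reconstruction_loss, kl_loss, loss_factor=2.0, kl_weight=1.0):
    """Combine the two VAE loss terms into a single scalar.

    Assumption: loss_factor multiplies only the reconstruction term.
    kl_weight is hypothetical (not an SDV parameter); 1.0 means no KL scaling.
    Returns (total, weighted reconstruction part, weighted KL part).
    """
    weighted_reconstruction = loss_factor * reconstruction_loss
    weighted_kl = kl_weight * kl_loss
    return weighted_reconstruction + weighted_kl, weighted_reconstruction, weighted_kl


# Made-up per-batch values, for illustration only.
total, recon_part, kl_part = total_loss(reconstruction_loss=1.5, kl_loss=0.25)
print(total, recon_part, kl_part)
```

If the two parts were exposed like this, I could log them per epoch and see which term dominates the total.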

I hope my questions are clear. Thanks in advance!
