worldbank / REaLTabFormer

A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.
https://worldbank.github.io/REaLTabFormer/
MIT License

No train with sensitivity for Relational model? #10

Closed by echatzikyriakidis 1 year ago

echatzikyriakidis commented 1 year ago

Hi @avsolatorio,

I see that training with sensitivity happens only for the tabular model type. Isn't sensitivity used when training a relational model? If not, why is that?

https://github.com/avsolatorio/REaLTabFormer/blob/bf1a38ef8f202372956ac57a363289c505967982/src/realtabformer/realtabformer.py#L456

avsolatorio commented 1 year ago

Hello @echatzikyriakidis, correct; the intrinsic sensitivity training still needs to be implemented in the relational model. In the non-relational case, a precise data-copying metric, i.e., DCR (distance to closest record), is well defined; that is not the case for relational data. However, if an analogous metric becomes available, it can easily be implemented in the system. In the meantime, you have to set train_size < 1 so that evaluation data is held out to detect overfitting and trigger early stopping.
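To make the DCR idea concrete, here is a minimal sketch of a distance-to-closest-record computation for a flat table. It is not the library's implementation: the function name `distance_to_closest_record`, the toy rows, and the choice of plain Euclidean distance on numeric columns are all assumptions made for illustration. The sensitivity statistic that REaLTabFormer builds on top of such distances is more involved and is not reproduced here.

```python
import math


def distance_to_closest_record(synthetic, training):
    """For each synthetic row, return the Euclidean distance to its nearest training row.

    Hypothetical helper for illustration only; rows are equal-length numeric tuples.
    """
    return [min(math.dist(s, t) for t in training) for s in synthetic]


# Toy data (made up): two training rows and two synthetic rows.
training = [(0.0, 0.0), (3.0, 4.0)]
synthetic = [(0.0, 0.0), (3.0, 0.0)]

dcr = distance_to_closest_record(synthetic, training)
print(dcr)  # [0.0, 3.0]: the first synthetic row is an exact copy of a training row
```

A DCR of zero means a synthetic row duplicates a training row. For relational data, a single child row has no meaning outside its parent and sibling rows, which is one way to see why a row-level distance like this does not carry over directly.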

echatzikyriakidis commented 1 year ago

Hi @avsolatorio!

Great, I assumed that too, which is probably why _train_with_sensitivity() doesn't exist for the relational case. Yes, I use train_size=0.8 for relational models and train_size=1 for tabular ones.

Thanks!
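For reference, the settings discussed in this thread could be written down roughly as follows. This is a sketch under assumptions: the constant names and `build_models` are hypothetical, and the constructor arguments `model_type`, `train_size`, and `parent_realtabformer_path` are used as I recall them from the project's README, so check them against the installed version. The import is kept inside the function so that the snippet can be read and loaded without the package installed.

```python
# Keyword arguments mirroring the settings mentioned in this thread.
# train_size=1 uses all rows for training (tabular, with sensitivity-based stopping);
# train_size=0.8 holds out 20% of rows as evaluation data (relational, early stopping).
TABULAR_KWARGS = {"model_type": "tabular", "train_size": 1}
RELATIONAL_KWARGS = {"model_type": "relational", "train_size": 0.8}


def build_models(parent_model_path):
    """Hypothetical helper: build a parent (tabular) and child (relational) model.

    parent_model_path is assumed to point at a previously saved parent model.
    """
    from realtabformer import REaLTabFormer  # lazy import; requires the package

    parent = REaLTabFormer(**TABULAR_KWARGS)
    child = REaLTabFormer(
        parent_realtabformer_path=parent_model_path,  # assumed argument name
        **RELATIONAL_KWARGS,
    )
    return parent, child
```

Keeping the two sets of arguments side by side makes the difference explicit: only the relational model needs a hold-out split, because it has no sensitivity-based stopping criterion.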