worldbank / REaLTabFormer

A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.
https://worldbank.github.io/REaLTabFormer/
MIT License

Track the column size and the number of digits in numerical fields for the transformation of `seed_input` #2

Open avsolatorio opened 1 year ago

avsolatorio commented 1 year ago

Note the leading zero, which indicates that the total number of columns is more than 9. Because we are not tracking this, once the `seed_input` argument is used, the transformation infers it only from the given data and not from the data used during training.

image image

The same is true in this case. The transformation should note that the `hhsize` variable has values that may exceed 10, so it should truncate the leading 0 as shown in the second image.

image image
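
A rough sketch of the kind of tracking this issue asks for, in plain Python. It is not the library's actual code: `fit_widths`, `encode_seed`, the 0-based column index, and the `index___column___value` token layout are all hypothetical names and assumptions made for illustration. The idea is to record the column-index width and the per-column digit count once from the training data, then reuse them when transforming a seed instead of re-inferring them from the seed itself.

```python
def fit_widths(rows):
    """Record padding widths from training rows (a list of dicts of integers).

    Hypothetical helper: stores the column order, the number of digits needed
    for the largest 0-based column index, and the maximum digit count seen in
    each column.
    """
    columns = list(rows[0].keys())
    col_idx_width = len(str(len(columns) - 1))
    digit_widths = {
        col: max(len(str(abs(int(row[col])))) for row in rows)
        for col in columns
    }
    return {
        "columns": columns,
        "col_idx_width": col_idx_width,
        "digit_widths": digit_widths,
    }


def encode_seed(seed, widths):
    """Encode a seed dict using widths tracked at training time.

    The token layout below is an assumption for illustration only.
    """
    tokens = []
    for col, value in seed.items():
        idx = widths["columns"].index(col)
        idx_str = str(idx).zfill(widths["col_idx_width"])
        val_str = str(int(value)).zfill(widths["digit_widths"][col])
        tokens.append(f"{idx_str}___{col}___{val_str}")
    return tokens


# Toy training data: 11 columns, with hhsize reaching two digits.
train_columns = ["hhsize"] + [f"c{i}" for i in range(10)]
train_rows = [
    {col: 1 for col in train_columns} | {"hhsize": 12},
    {col: 2 for col in train_columns} | {"hhsize": 4},
]
seed = {"hhsize": 3}

# Widths tracked from training keep the leading zeros.
tracked = encode_seed(seed, fit_widths(train_rows))

# Widths inferred from the seed alone drop them, which is the mismatch
# described in this issue.
inferred = encode_seed(seed, fit_widths([seed]))
```

With the tracked widths, the seed is padded the same way as the training data; with widths inferred from the seed alone, both the column index and the `hhsize` value lose their leading zero.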