kathrinse / be_great

A novel approach for synthesizing tabular data using pretrained large language models
MIT License
276 stars 46 forks

next GReaT version (fix for the datasets bug, "train it longer" message) #15

unnir closed 1 year ago

unnir commented 1 year ago

Hi @kathrinse,

I fixed the bug caused by the new datasets package version by adding a new method to the GReaTDataset class. I tested it and it worked: https://github.com/kathrinse/be_great/blob/10a6a2aa73819e5e67281c32fe84da36665a17f5/be_great/great_dataset.py#L42

We should then release the updated package on pip, but before that we should probably also add a "train it longer" message for cases where GReaT does not generate all the features.
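A rough sketch of what such a message could look like, only to illustrate the idea. It assumes the generated samples have already been parsed into one dict per row. The helper names `missing_feature_ratio` and `warn_if_undertrained` and the `threshold` parameter are hypothetical and not part of the current be_great code:

```python
import warnings
from typing import Any, Mapping, Sequence


def missing_feature_ratio(
    rows: Sequence[Mapping[str, Any]], expected_columns: Sequence[str]
) -> float:
    """Return the fraction of rows that lack at least one expected column.

    A column counts as missing if its key is absent or its value is None.
    """
    if not rows:
        return 0.0
    incomplete = sum(
        1
        for row in rows
        if any(row.get(col) is None for col in expected_columns)
    )
    return incomplete / len(rows)


def warn_if_undertrained(
    rows: Sequence[Mapping[str, Any]],
    expected_columns: Sequence[str],
    threshold: float = 0.5,  # hypothetical default, would need tuning
) -> float:
    """Emit a "train it longer" warning if too many rows are incomplete."""
    ratio = missing_feature_ratio(rows, expected_columns)
    if ratio > threshold:
        warnings.warn(
            f"{ratio:.0%} of generated rows are missing at least one feature; "
            "the model may need to be trained longer.",
            UserWarning,
        )
    return ratio
```

Using `warnings.warn` instead of raising keeps sampling usable while still telling the user why some rows come back incomplete.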

Depending on your availability, please do it. I just created the issue so we won't forget :)