kathrinse / be_great

A novel approach for synthesizing tabular data using pretrained large language models
MIT License

Breaking the generation loop #31

Closed Travisma2233 closed 1 year ago

Travisma2233 commented 1 year ago

Is there a good way to deal with this situation besides increasing the number of epochs? My computer's performance is limited, and more epochs would greatly prolong the computation time. [Screenshot: 火柴截图20230714160907778]

Madnex commented 1 year ago

Did you try increasing the max_length parameter? The default is 100. Depending on the number of features, you should increase it quite a bit (e.g. to 400). I had the same issue, and increasing the parameter fixed it. If max_length is too low, the model can never generate all the features of a row, hence the loop: each iteration tries to generate a new complete sample and fails.
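
To get a feeling for why the default can be too small, here is a minimal sketch that renders one row as text and roughly counts its pieces. It assumes rows are serialized as "column is value" pairs separated by commas. `encode_row` and `rough_token_count` are hypothetical helpers for illustration, not part of be_great, and the count is only a crude estimate. The real number of tokens depends on the tokenizer of the underlying language model and is often higher.

```python
import re


def encode_row(row):
    """Render one table row as 'column is value' pairs joined by commas (assumed format)."""
    return ", ".join(f"{col} is {val}" for col, val in row.items())


def rough_token_count(text):
    """Count words and single punctuation marks as a crude stand-in for a real tokenizer."""
    return len(re.findall(r"\w+|[^\w\s]", text))


# Hypothetical row with 14 numeric features, as in the question above.
row = {f"feature_{i}": round(1234.5678 + i, 4) for i in range(14)}
text = encode_row(row)

print(text[:60])
print("rough token estimate:", rough_token_count(text))
```

If the estimate is already close to max_length, a complete row is unlikely to fit, so a larger value is worth trying.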

Travisma2233 commented 1 year ago

OK, thank you for your reply. I currently have 14 features. How many features do you have, and what max_length do you think is appropriate for 14 features? Also, since I don't know this code well enough, can I modify it in great.py? Thanks again for your help.

Madnex commented 1 year ago

You can just set the parameter in the sample method: samples = great.sample(n_samples, k=50, max_length=400)

See the documentation.

Easiest is to just try out a few values and see if it can generate samples (and how fast).
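
A small sketch of that trial-and-error approach: it calls a sampling function once per candidate value and records how many rows came back and how long it took. `time_max_lengths` and `sample_fn` are hypothetical names, not part of be_great. Candidates are tried from largest to smallest, since a value that is too low may never finish and would have to be interrupted by hand.

```python
import time


def time_max_lengths(sample_fn, candidates):
    """Call sample_fn(max_length) for each candidate, largest first.

    Returns a list of (max_length, number_of_rows, seconds) tuples.
    """
    results = []
    for max_length in sorted(candidates, reverse=True):
        start = time.perf_counter()
        samples = sample_fn(max_length)  # may not return if max_length is too low
        elapsed = time.perf_counter() - start
        results.append((max_length, len(samples), elapsed))
    return results


# Assumed usage with an already fitted model called `great`:
# results = time_max_lengths(
#     lambda m: great.sample(n_samples, k=50, max_length=m),
#     [200, 400, 800],
# )
```

The smallest value that still returns the requested number of rows in reasonable time is a sensible choice for later runs.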