iamamiramine opened this issue 3 months ago
Hi @iamamiramine, sorry that I only just saw your message. Did you solve it? The reason may be that your max_length is too small, so the generation cannot successfully produce one complete row of data.
Hello, I tried changing the max_length parameter and it did not work. Another thing to note is that the Fake Hotel Guests dataset consists of 9 columns, so one row from this dataset is relatively short.
I am also having problems with this. I am using max_length=1024, the maximum; if I use more I get a CUDA error. With this dataset I cannot get a single sample:
```python
# Fetch the 'sick' dataset from imbalanced-learn's benchmark collection
from imblearn.datasets import fetch_datasets

sick = fetch_datasets()['sick']
sick.data.shape
```
Hi @omaralvarez @iamamiramine
You do not need to set max_length as big as 1024. You can uncomment this part of the code to see the length of your encoded row:
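(The referenced snippet is not shown in this thread. As a rough sketch only, assuming Tabula builds on a Hugging Face GPT-2-style tokenizer, something like the following reports how many tokens one serialized row occupies; the row text and tokenizer name here are illustrative assumptions.)

```python
# Rough sketch (not Tabula's exact internals): measure how many tokens
# one serialized row occupies, assuming a Hugging Face GPT-2-style tokenizer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # assumption: GPT-2 backbone

# Hypothetical serialized row in "column value" form; replace with one of your own rows.
row_text = "age 34, sex M, on_thyroxine f, TSH 1.2, T3 2.0, TT4 109, class negative"
n_tokens = len(tokenizer(row_text)["input_ids"])
print(f"Encoded row length: {n_tokens} tokens")  # max_length only needs to exceed this
```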
Let me know if that helps.
Yes, I don't think it has to do with max_length, the issue in this case is that some numbers always are outside of the requested ranges in the predicted dataframe, so they are always filtered out. I have tried to switch temperature, k, and training epochs to no avail.
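As an illustration of that diagnosis (a hedged sketch, not Tabula's actual code): if generated rows are dropped whenever a numeric value falls outside the range seen in training, a whole batch can easily be filtered down to zero rows.

```python
# Hedged sketch: why the sampler can end up with zero rows per batch.
# Assumption (not taken from Tabula's source): rows whose numeric values
# fall outside the training data's min/max are discarded.
import pandas as pd

train = pd.DataFrame({"age": [20, 35, 60], "TSH": [0.5, 1.2, 4.0]})  # hypothetical training data
generated = pd.DataFrame({"age": [150, -3], "TSH": [99.0, -1.0]})    # hypothetical model output

# Keep only rows where every column lies inside the training range.
mask = pd.Series(True, index=generated.index)
for col in train.columns:
    mask &= generated[col].between(train[col].min(), train[col].max())

gen_data = generated[mask]
print(gen_data.shape[0])  # 0 -> num_samples stays greater than gen_data.shape[0]
```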
I am facing an issue when generating data using Tabula.
I trained Tabula on the following datasets:
However, when generating, the generation loop is stuck because the generated data shape is always 0 (`num_samples` is always greater than `gen_data.shape[0]`).

I tried re-training, and tried changing the `max_length` parameter in the sampling function, but it was of no help.

Can you please help me figure out how to fix this issue?
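For anyone hitting the same symptom, here is a minimal sketch of the kind of rejection-sampling loop described above (an assumption about the general pattern, not Tabula's actual implementation); it shows why the loop never terminates when every generated batch is filtered down to zero valid rows. The helpers `generate_batch` and `is_valid_row` are hypothetical placeholders.

```python
# Minimal sketch of a rejection-sampling loop (not Tabula's actual code).
# If every batch is filtered to zero valid rows, gen_data.shape[0] never
# reaches num_samples and the while loop spins forever.
import pandas as pd

def sample_until(num_samples, generate_batch, is_valid_row):
    gen_data = pd.DataFrame()
    while gen_data.shape[0] < num_samples:               # "stuck" happens here
        batch = generate_batch()                         # decode a batch of rows from the model
        kept = batch[batch.apply(is_valid_row, axis=1)]  # drop malformed / out-of-range rows
        gen_data = pd.concat([gen_data, kept], ignore_index=True)
    return gen_data.head(num_samples)
```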