Open gloglo17 opened 1 year ago
Hi,
Not sure if you're still encountering this issue. I tried checking out your dataset, but couldn't access it. Are there enough samples in the data? If you are working with a small sample size, another option is to pass a much smaller batch_size to the fit method.
Hope this helps.
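One way to act on the batch_size suggestion is to cap it by the number of training rows. The sketch below is only an illustration: `pick_batch_size` is a hypothetical helper, not part of AutoKeras, and it assumes the split needs at least two full batches, which is what the error message suggests.

```python
def pick_batch_size(n_train_rows, default=32, min_batches=2):
    """Return a batch size that leaves at least `min_batches` full batches.

    Hypothetical helper, not an AutoKeras function.
    """
    if n_train_rows < min_batches:
        raise ValueError("Not enough rows to form the requested number of batches.")
    # n_train_rows // min_batches is the largest size that still yields
    # `min_batches` full batches; never exceed the usual default.
    return min(default, n_train_rows // min_batches)


print(pick_batch_size(32))   # small training set: the batch size is reduced
print(pick_batch_size(500))  # large training set: the default is kept
```

The result could then be passed as `batch_size=pick_batch_size(len(X_train))` in the call to `fit`.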
- Problem: I get this error if the dataset has 41 or fewer rows. There is no error when the dataset has 42 or more rows.
- Fix: Lowering batch_size from its default of 32 fixes the problem: search.fit(x=X_train, y=y_train, verbose=0, epochs=10, batch_size=12)
- Details: Autokeras 1.1, Code, and passing/failing data sets are attached.
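The 41/42 threshold lines up with simple batch arithmetic. The sketch below is a rough check, not AutoKeras code: it assumes the training data is batched before the split counts its elements (as the error message suggests) and that train_test_split rounds the test share up. The function names are made up for illustration.

```python
import math


def train_rows(n_rows, test_size=0.2):
    # Assumption: the test share is rounded up, the rest goes to training.
    return n_rows - math.ceil(n_rows * test_size)


def num_batches(n_train, batch_size=32):
    # A final partial batch still counts as a batch.
    return math.ceil(n_train / batch_size)


for n_rows, batch_size in [(41, 32), (42, 32), (41, 12)]:
    n_train = train_rows(n_rows)
    print(n_rows, "rows, batch_size", batch_size, "->",
          n_train, "training rows,", num_batches(n_train, batch_size), "batch(es)")
```

Under these assumptions, 41 rows leave 32 training rows, which is exactly one batch of 32 and so fewer than the two batches the split requires. 42 rows leave 33 training rows (two batches), and 41 rows with batch_size=12 give three batches.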
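from pandas import read_csv
from sklearn.model_selection import train_test_split
from autokeras import StructuredDataRegressor

# Load the 41-row dataset that triggers the error.
url = 'auto-insurance_41a.csv'
dataframe = read_csv(url, header=None)
print(dataframe.shape)

# Split into input features and the target column.
data = dataframe.values
data = data.astype('float32')
X, y = data[:, :-1], data[:, -1]
print(X.shape, y.shape)

# Hold out 20% for testing, then search with the default batch_size.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
search = StructuredDataRegressor(max_trials=15, loss='mean_absolute_error')
search.fit(x=X_train, y=y_train, verbose=0, epochs=10)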
Reloading Tuner from ./structured_data_regressor/tuner0.json
ValueError                                Traceback (most recent call last)

2 frames

/usr/local/lib/python3.10/dist-packages/autokeras/tasks/structured_data.py in fit(self, x, y, epochs, callbacks, validation_split, validation_data, **kwargs)
    137         self.check_in_fit(x)
    138
--> 139         history = super().fit(
    140             x=x,
    141             y=y,

/usr/local/lib/python3.10/dist-packages/autokeras/auto_model.py in fit(self, x, y, batch_size, epochs, callbacks, validation_split, validation_data, verbose, **kwargs)
    286         # Split the data with validation_split.
    287         if validation_data is None and validation_split:
--> 288             dataset, validation_data = data_utils.split_dataset(
    289                 dataset, validation_split
    290             )

/usr/local/lib/python3.10/dist-packages/autokeras/utils/data_utils.py in split_dataset(dataset, validation_split)
     44     num_instances = dataset.reduce(np.int64(0), lambda x, _: x + 1).numpy()
     45     if num_instances < 2:
---> 46         raise ValueError(
     47             "The dataset should at least contain 2 batches to be split."
     48         )

ValueError: The dataset should at least contain 2 batches to be split.
Hi, I'm just starting with Autokeras, trying out the tutorial example with my dataset but Autokeras doesn't even start. I get this error:
ValueError: The dataset should at least contain 2 batches to be split.
Python 3.10.7, Autokeras 1.1.0, Keras 2.12.0, Tensorflow 2.12.0, Pandas 1.5.3, Numpy 1.23.5
Here's the whole code, including a link to download the dataset.