iskandr / fancyimpute

Multivariate imputation and matrix completion algorithms implemented in Python
Apache License 2.0
1.25k stars, 178 forks

AttributeError: 'KNN' object has no attribute 'complete' #81

Closed: hamsterLee closed this issue 5 years ago

hamsterLee commented 6 years ago

THIS IS MY ERROR

dataset.dtypes
dataset.isnull().sum()
hour = 364324
test_cl = dataset[0:hour]
train_cl = dataset[hour:]
train.isnull().sum()
test.isnull().sum()

Xcol = dataset.columns
Xcol = Xcol.drop('aqhi')
Ycol = 'aqhi'
X = train_cl.loc[:, Xcol]
Y = train_cl.loc[:, Ycol]

def standardize(s):
    return s.sub(s.min()).div((s.max() - s.min()))

Xnorm = X.apply(standardize, axis=0)
kvals = np.linspace(1, 100, 20, dtype='int64')

knn_errs = []
for k in kvals:
    knn_err = []
    Xknn = KNN(k=k, verbose=False).complete(Xnorm)
    knn_err = cross_val_score(rf, Xknn, Y, cv=24, n_jobs=-1).mean()

    knn_errs.append(knn_err)
    print("[KNN] Estimated RF Test Error (n = {}, k = {}, 10-fold CV): {}".format(len(Xknn), k, np.mean(knn_err)))

sns.set_style("darkgrid")
plt.plot(kvals, knn_errs)
plt.xlabel('K')
plt.ylabel('10-fold CV Error Rate')

knn_err = max(knn_errs)
k_opt = kvals[knn_errs.index(knn_err)]

Xknn = KNN(k=k_opt, verbose=False).complete(Xnorm)
Yknn = Y

print("[BEST KNN] Estimated RF Test Error (n = {}, k = {}, 10-fold CV): {}".format(len(Xknn), k_opt, np.mean(knn_err)))

Traceback (most recent call last):

  File "", line 39, in
    Xknn = KNN(k=k, verbose=False).complete(Xnorm)

AttributeError: 'KNN' object has no attribute 'complete'

sergeyf commented 6 years ago

Yup, the most recent version is using .fit_transform like sklearn. Try that?
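For code that has to run against both older and newer fancyimpute releases, the rename can be hidden behind a small helper. The following is only a minimal sketch and not part of fancyimpute: `impute` is a hypothetical name, and the only assumption is that the solver object exposes either `.fit_transform` (newer releases, as described above) or `.complete` (older releases).

```python
def impute(solver, X):
    """Fill missing values in X with whichever method the solver exposes.

    `impute` is a hypothetical helper, not a fancyimpute function.
    """
    # Newer releases follow the sklearn-style API.
    method = getattr(solver, "fit_transform", None)
    if method is None:
        # Older releases used .complete for the same job.
        method = getattr(solver, "complete", None)
    if method is None:
        raise AttributeError(
            "%s has neither fit_transform nor complete" % type(solver).__name__
        )
    return method(X)
```

With this helper, the failing line from the report would read `Xknn = impute(KNN(k=k, verbose=False), Xnorm)`. If only a recent release has to be supported, calling `.fit_transform` directly is simpler.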


hamsterLee commented 6 years ago

X_train = train_cl.loc[:, Xcol]
Y_train = train_cl.loc[:, Ycol]
X_test = test_cl.loc[:, Xcol]
Y_test = test_cl.loc[:, Ycol]

X_filled_ii = IterativeImputer().fit_transform(X_train)

X_filled_knn = KNN(k=16).fit_transform(X_train)

X_filled_nnm = NuclearNormMinimization().fit_transform(X_train)

X_incomplete_normalized = BiScaler().fit_transform(X_train)
X_filled_softimpute = SoftImpute().fit_transform(X_incomplete_normalized)

ii_mse = ((X_filled_ii[missing_mask] - X[missing_mask]) ** 2).mean()
print("Iterative Imputer norm minimization MSE: %f" % ii_mse)

nnm_mse = ((X_filled_nnm[missing_mask] - X[missing_mask]) ** 2).mean()
print("Nuclear norm minimization MSE: %f" % nnm_mse)

softImpute_mse = ((X_filled_softimpute[missing_mask] - X[missing_mask]) ** 2).mean()
print("SoftImpute MSE: %f" % softImpute_mse)

knn_mse = ((X_filled_knn[missing_mask] - X[missing_mask]) ** 2).mean()
print("knnImpute MSE: %f" % knn_mse)

fancyimpute\solver.py:58: UserWarning: Input matrix is not missing any values
  warnings.warn("Input matrix is not missing any values")
Traceback (most recent call last):

  File "", line 34, in
    X_filled_knn = KNN(k=16).fit_transform(X_train)

  File "fancyimpute\solver.py", line 189, in fit_transform
    X_result = self.solve(X_filled, missing_mask)

  File "knn.py", line 103, in solve
    print_interval=self.print_interval)

  File "few_observed_entries.py", line 51, in knn_impute_few_observed
    knn_initialize(X, missing_mask, verbose=verbose)

  File "knnimpute\common.py", line 39, in knn_initialize
    D = all_pairs_normalized_distances(X_row_major)

  File "normalized_distance.py", line 38, in all_pairs_normalized_distances
    D = np.ones((n_rows, n_rows), dtype="float32", order="C") * np.inf

  File "core\numeric.py", line 203, in ones
    a = empty(shape, dtype, order)

MemoryError

sergeyf commented 6 years ago

How big is your data? Looks like it is too big for knn imputation.
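The traceback shows where the memory goes: `all_pairs_normalized_distances` allocates a dense `(n_rows, n_rows)` array of `float32`, so the requirement grows with the square of the row count. A rough back-of-the-envelope sketch follows; the function name is made up for illustration, and the row count is a placeholder rather than the reporter's actual training size.

```python
def knn_distance_matrix_bytes(n_rows, bytes_per_value=4):
    """Bytes needed for one dense n_rows x n_rows matrix (float32 = 4 bytes)."""
    return n_rows * n_rows * bytes_per_value


# 250,000 is an illustrative row count, not the size of the data in this thread.
n_rows = 250_000
print("%.0f GB" % (knn_distance_matrix_bytes(n_rows) / 1e9))  # prints "250 GB"
```

This counts a single copy of the matrix. The failing line multiplies the result of `np.ones(...)` by `np.inf`, which would need a second array of the same size if the first allocation succeeded.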

If you have a train/test type of setup, I would just use IterativeImputer. It's the only one that can work transductively.
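For a train/test setup, the usual pattern is to fit the imputer on the training rows and reuse the fitted models on the test rows. The sketch below uses scikit-learn's `IterativeImputer` from `sklearn.impute` (still behind the `enable_iterative_imputer` experimental switch) rather than the fancyimpute class discussed here, and it runs on synthetic data. Treat it as an illustration under those assumptions, not as the setup from this issue.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the next import)
from sklearn.impute import IterativeImputer

# Synthetic data standing in for the real feature matrix.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 4))
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=200)  # make one column predictable
X[rng.uniform(size=X.shape) < 0.1] = np.nan     # knock out roughly 10% of entries

X_train, X_test = X[:150], X[150:]

imputer = IterativeImputer(random_state=0)
X_train_filled = imputer.fit_transform(X_train)  # learn the per-feature models
X_test_filled = imputer.transform(X_test)        # reuse them; no refit on test rows
```

Fitting only on the training rows keeps information from the test rows out of the imputation models.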

hamsterLee commented 6 years ago

In my csv file, I have 618385 lines.

df = pd.read_csv('data.csv')
X_incomplete = np.array(df.to_records().view(type=np.matrix))

# Model each feature with missing values as a function of other features, and
# use that estimate for imputation.
X_filled_ii = IterativeImputer().fit_transform(X_incomplete)

sergeyf commented 6 years ago

I don't know what your error means. Which line is it happening in?

hamsterLee commented 6 years ago

I converted my CSV file to an array: X_incomplete = np.array(df.to_records().view(type=np.matrix))
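The thread never shows the actual error, so the following is not a diagnosis. It is only a sketch of a more direct way to get a plain float array out of a DataFrame, using `DataFrame.to_numpy`. Note that `to_records()` also carries the index along as an extra field by default. The CSV content and column names below are made up so the example runs on its own.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for pd.read_csv('data.csv'); the columns here are invented.
csv_text = "a,b,c\n1.0,,3.0\n4.0,5.0,\n"
df = pd.read_csv(io.StringIO(csv_text))

# One 2-D float array; empty CSV cells come through as NaN.
X_incomplete = df.to_numpy(dtype="float64")
print(X_incomplete.shape)  # (2, 3)
```

This only works as written when every column is numeric; non-numeric columns would have to be dropped or encoded first.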

sergeyf commented 6 years ago

Sorry, not sure.


hamsterLee commented 6 years ago

I have another problem: when I run from fancyimpute import MICE, it says ImportError: cannot import name 'MICE'. How can I solve this?

sergeyf commented 6 years ago

It's called IterativeImputer now.


nkolhe123 commented 5 years ago

In one of my Python imputation problems I faced the same error: AttributeError: 'KNN' object has no attribute 'complete'. Replacing ".complete" with ".fit_transform" resolved the issue.

sergeyf commented 5 years ago

Yes, we have stopped using .complete and now use .fit_transform everywhere to match the sklearn API.