TrickyPells / cLASpy_T

Program to classify remote sensing data (LAS, CSV) based on Scikit-Learn
https://claspy-t-project.readthedocs.io

Estimator failed / Nan created #57

Closed laurent124 closed 2 years ago

laurent124 commented 2 years ago

[laurent@hulk cLASpy_T-Formation-CNRS]$ source .venv/claspy_venv/bin/activate
(claspy_venv) [laurent@hulk cLASpy_T-Formation-CNRS]$ python cLASpy_T.py train --train_r=0.1 -a=gb -i=/home/laurent/machine/xavier_orne/1305251Mpts.las -f=['Anisotropy(5)','Anisotropy(10)','Anisotropy(25)','Anisotropy(50)','Eigenentropy(5)','Eigenentropy(10)','Eigenentropy(25)','Eigenentropy_(50)','Eigenvaluessum(5)','Eigenvaluessum(10)','Eigenvaluessum(25)','Eigenvaluessum(50)','linearity(5)','linearity(10)','linearity(25)','linearity(50)','Omnivariance(5)','Omnivariance(10)','Omnivariance(25)','Omnivariance(50)','PCA1(5)','PCA1(10)','PCA1(25)','PCA1(50)','PCA2(5)','PCA2(10)','PCA2(25)','PCA2(50)','Planarity(5)','Planarity(10)','Planarity(25)','Planarity(50)','Roughness(5)','Roughness(10)','Roughness(25)','Roughness(50)','Sphericity(5)','Sphericity(10)','Sphericity(25)','Sphericity(50)','Surfacevariation(5)','Surfacevariation(10)','Surfacevariation(25)','Surfacevariation(50)','Verticality(5)','Verticality(10)','Verticality(25)','Verticality(50)','Verticality(5)','Verticality(10)','Verticality(25)','Verticality(50)'] -p="{'n_estimators':100,'max_depth':20,'min_samples_leaf':100}"

####### POINT CLOUD CLASSIFICATION #######
Algorithm used: GradientBoostingClassifier
Path to LAS file: /home/laurent/machine/xavier_orne/130525_1Mpts.las

Create a new folder to store the result files... Folder already exists.

Step 1/7: Formatting data as pandas.Dataframe...
LAS Version: 1.2
LAS point format: 1
Number of points: 1,000,000
/home/laurent/machine/cLASpy_T-Formation-CNRS/common.py:276: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling frame.insert many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use newframe = frame.copy()
  frame[dim] = las.get_reader().get_dimension(dim)

Get selected features:

Number of wanted features: 52
Number of final selected features: 52

--> All required features are present!

/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/pandas/core/generic.py:6392: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return self._update_inplace(result)
/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/pandas/core/frame.py:5177: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  downcast=downcast,

Step 2/7: Splitting data in train and test sets...
Random_state to split data: 0
Number of used points: 1 000 000 pts
Size of train|test datasets: 100 000 pts | 900 000 pts

Step 3/7: Scaling data...

Step 4/7: Training model with cross validation...

Random_state for the StratifiedShuffleSplit: 0
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 24 concurrent workers.
/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/model_selection/_validation.py:619: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/model_selection/_validation.py", line 598, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/ensemble/_gb.py", line 413, in fit
    dtype=DTYPE, multi_output=True)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/base.py", line 433, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 878, in check_X_y
    estimator=estimator)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 721, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 106, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

  FitFailedWarning)
[CV] END .................................................... total time= 0.2s
/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/model_selection/_validation.py:619: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/model_selection/_validation.py", line 598, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/ensemble/_gb.py", line 413, in fit
    dtype=DTYPE, multi_output=True)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/base.py", line 433, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 878, in check_X_y
    estimator=estimator)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 721, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 106, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

  FitFailedWarning)
[CV] END .................................................... total time= 0.2s
/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/model_selection/_validation.py:619: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/model_selection/_validation.py", line 598, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/ensemble/_gb.py", line 413, in fit
    dtype=DTYPE, multi_output=True)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/base.py", line 433, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 878, in check_X_y
    estimator=estimator)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 721, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 106, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

  FitFailedWarning)
[CV] END .................................................... total time= 0.2s
[Parallel(n_jobs=-1)]: Done 3 out of 5 | elapsed: 2.1s remaining: 1.4s
/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/model_selection/_validation.py:619: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/model_selection/_validation.py", line 598, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/ensemble/_gb.py", line 413, in fit
    dtype=DTYPE, multi_output=True)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/base.py", line 433, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 878, in check_X_y
    estimator=estimator)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 721, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 106, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

  FitFailedWarning)
[CV] END .................................................... total time= 0.2s
/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/model_selection/_validation.py:619: FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan. Details:
Traceback (most recent call last):
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/model_selection/_validation.py", line 598, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/pipeline.py", line 346, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/ensemble/_gb.py", line 413, in fit
    dtype=DTYPE, multi_output=True)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/base.py", line 433, in _validate_data
    X, y = check_X_y(X, y, **check_params)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 878, in check_X_y
    estimator=estimator)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 721, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 106, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').

  FitFailedWarning)
[CV] END .................................................... total time= 0.2s
[Parallel(n_jobs=-1)]: Done 5 out of 5 | elapsed: 2.2s finished

    Training model scores with cross-validation:
    [nan nan nan nan nan]

Model trained!

Step 5/7: Creating confusion matrix...
Traceback (most recent call last):
  File "cLASpy_T.py", line 300, in <module>
    args.func(args)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/training.py", line 520, in train
    y_test_pred = model.predict(x_test)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/metaestimators.py", line 120, in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/pipeline.py", line 419, in predict
    return self.steps[-1][-1].predict(Xt, **predict_params)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/ensemble/_gb.py", line 1188, in predict
    raw_predictions = self.decision_function(X)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/ensemble/_gb.py", line 1143, in decision_function
    X = check_array(X, dtype=DTYPE, order="C", accept_sparse='csr')
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 721, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/home/laurent/machine/cLASpy_T-Formation-CNRS/.venv/claspy_venv/lib64/python3.7/site-packages/sklearn/utils/validation.py", line 106, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float32').
(claspy_venv) [laurent@hulk cLASpy_T-Formation-CNRS]$

TrickyPells commented 2 years ago

Thank you @laurent124 for this well-documented bug report. The error seems to come from the repetition of 'Verticality(5)','Verticality(10)','Verticality(25)','Verticality(50)' in the --features option, not from the code itself. Still, adding a fail-safe to catch this kind of error would be a good improvement. Have fun!
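For anyone hitting the same problem, here is a minimal sketch of what such a fail-safe could look like. It is not taken from cLASpy_T: the function name `check_feature_names` is hypothetical, and it only assumes that the requested features arrive as a plain list of strings. It warns about repeated names and returns the list with duplicates dropped, keeping the original order.

```python
import warnings
from collections import Counter


def check_feature_names(features):
    """Hypothetical helper (not part of cLASpy_T): drop repeated feature names.

    Emits a UserWarning listing the duplicated names, then returns the
    names in first-seen order with each name kept only once.
    """
    counts = Counter(features)
    duplicates = [name for name, n in counts.items() if n > 1]
    if duplicates:
        warnings.warn(
            "Duplicated feature names ignored: " + ", ".join(duplicates),
            UserWarning,
        )
    # dict.fromkeys preserves insertion order (Python 3.7+)
    return list(dict.fromkeys(features))


if __name__ == "__main__":
    requested = ["Verticality(5)", "Verticality(10)", "PCA1(5)", "Verticality(5)"]
    print(check_feature_names(requested))
```

Running the list through a check like this before the feature columns are selected would turn a repeated name into a visible warning rather than a failure later in the pipeline. Until then, removing the repeated names from the -f list by hand should have the same effect.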