volkamerlab / teachopencadd

TeachOpenCADD: a teaching platform for computer-aided drug design (CADD) using open source packages and data
https://projects.volkamerlab.org/teachopencadd
Creative Commons Attribution 4.0 International

dev branch: T007 does not execute #360

mbackenkoehler closed this issue 1 year ago

mbackenkoehler commented 1 year ago

The last code cell results in this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
TypeError: float() argument must be a string or a number, not 'UIntSparseIntVect'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In[26], line 10
      8 print("\n=======")
      9 print(model["label"])
---> 10 crossvalidation(model["model"], compound_df, n_folds=N_FOLDS)

Cell In[21], line 42, in crossvalidation(ml_model, df, n_folds, verbose)
     39 train_y = df.iloc[train_index].active.tolist()
     41 # Fit the model
---> 42 fold_model.fit(train_x, train_y)
     44 # Testing
     45 
     46 # Convert the fingerprint and the label to a list
     47 test_x = df.iloc[test_index].fp.tolist()

File ~/.miniconda3/envs/teachopencadd/lib/python3.9/site-packages/sklearn/ensemble/_forest.py:345, in BaseForest.fit(self, X, y, sample_weight)
    343 if issparse(y):
    344     raise ValueError("sparse multilabel-indicator for y is not supported.")
--> 345 X, y = self._validate_data(
    346     X, y, multi_output=True, accept_sparse="csc", dtype=DTYPE
    347 )
    348 if sample_weight is not None:
    349     sample_weight = _check_sample_weight(sample_weight, X)

File ~/.miniconda3/envs/teachopencadd/lib/python3.9/site-packages/sklearn/base.py:584, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, **check_params)
    582         y = check_array(y, input_name="y", **check_y_params)
    583     else:
--> 584         X, y = check_X_y(X, y, **check_params)
    585     out = X, y
    587 if not no_val_X and check_params.get("ensure_2d", True):

File ~/.miniconda3/envs/teachopencadd/lib/python3.9/site-packages/sklearn/utils/validation.py:1106, in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, estimator)
   1101         estimator_name = _check_estimator_name(estimator)
   1102     raise ValueError(
   1103         f"{estimator_name} requires y to be passed, but the target y is None"
   1104     )
-> 1106 X = check_array(
   1107     X,
   1108     accept_sparse=accept_sparse,
   1109     accept_large_sparse=accept_large_sparse,
   1110     dtype=dtype,
   1111     order=order,
   1112     copy=copy,
   1113     force_all_finite=force_all_finite,
   1114     ensure_2d=ensure_2d,
   1115     allow_nd=allow_nd,
   1116     ensure_min_samples=ensure_min_samples,
   1117     ensure_min_features=ensure_min_features,
   1118     estimator=estimator,
   1119     input_name="X",
   1120 )
   1122 y = _check_y(y, multi_output=multi_output, y_numeric=y_numeric, estimator=estimator)
   1124 check_consistent_length(X, y)

File ~/.miniconda3/envs/teachopencadd/lib/python3.9/site-packages/sklearn/utils/validation.py:879, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
    877         array = xp.astype(array, dtype, copy=False)
    878     else:
--> 879         array = _asarray_with_order(array, order=order, dtype=dtype, xp=xp)
    880 except ComplexWarning as complex_warning:
    881     raise ValueError(
    882         "Complex data not supported\n{}\n".format(array)
    883     ) from complex_warning

File ~/.miniconda3/envs/teachopencadd/lib/python3.9/site-packages/sklearn/utils/_array_api.py:185, in _asarray_with_order(array, dtype, order, copy, xp)
    182     xp, _ = get_namespace(array)
    183 if xp.__name__ in {"numpy", "numpy.array_api"}:
    184     # Use NumPy API to support order
--> 185     array = numpy.asarray(array, order=order, dtype=dtype)
    186     return xp.asarray(array, copy=copy)
    187 else:

ValueError: setting an array element with a sequence.

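This is likely caused by the change in the fingerprint (FP) featurization: judging by the `TypeError`, the `fp` column now holds `UIntSparseIntVect` objects, which scikit-learn cannot convert to a numeric array.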
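To illustrate the failure mode, here is a minimal sketch that needs only NumPy. `FakeSparseCountVect` is a hypothetical stand-in for RDKit's `UIntSparseIntVect`, and `to_dense` is an illustrative helper; neither is taken from the notebook, and this is not necessarily the fix that was applied. The stand-in assumes the sparse vector exposes `GetNonzeroElements()` and `GetLength()`.

```python
import numpy as np


class FakeSparseCountVect:
    """Hypothetical stand-in for a sparse count fingerprint (index -> count)."""

    def __init__(self, counts, length):
        self._counts = dict(counts)
        self._length = length

    def GetNonzeroElements(self):
        # Return a copy of the index -> count mapping
        return dict(self._counts)

    def GetLength(self):
        return self._length


def to_dense(fp):
    """Expand a sparse count fingerprint into a dense float vector."""
    dense = np.zeros(fp.GetLength(), dtype=np.float64)
    for idx, count in fp.GetNonzeroElements().items():
        dense[idx] = count
    return dense


fps = [
    FakeSparseCountVect({1: 2, 5: 1}, length=8),
    FakeSparseCountVect({0: 1}, length=8),
]

# A list of fingerprint objects cannot be cast to a float array directly,
# which is roughly what scikit-learn attempts during input validation.
try:
    np.asarray(fps, dtype=np.float64)
    conversion_failed = False
except (TypeError, ValueError):
    conversion_failed = True

# Converting each fingerprint first yields a regular 2D feature matrix.
train_x = np.stack([to_dense(fp) for fp in fps])
print(conversion_failed, train_x.shape)
```

With real RDKit fingerprints, the conversion would more likely happen inside the fingerprint generator function, for example by returning a NumPy array instead of the RDKit vector object.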

@hamzaibrahim21 I think you ran the notebook in an outdated environment, since the dataframe outputs no longer matched. You may need to re-install the environment (remove the old one, then install the latest one from the dev branch).

hamzaibrahim21 commented 1 year ago

I made a change in the fingerprint generator function. It should work now :+1:

mbackenkoehler commented 1 year ago

Can you please make a PR?

hamzaibrahim21 commented 1 year ago

Sure, you can find it here: #361