melloddy / SparseChem

Fast and accurate machine learning models for biochemical applications.
MIT License

predict.py does not perform inverse normalization when no Y is supplied #5

Closed das22 closed 1 year ago

das22 commented 2 years ago

When running predict.py on regression/hybrid models with new compounds (e.g. supplying only a new X matrix), predict.py does not respect "inverse_normalization 1". This seems to happen because the Ys are None, so the first branch of this code runs and the inverse normalization in the second branch is never reached:

else:
        if args.y_class is None and args.y_regr is None:
            class_out, regr_out = sc.predict_dense(net, loader_te, dev=dev, dropout=args.dropout, progress=True, y_cat_columns=select_cat_ids)
        else:
            class_out, regr_out = sc.predict_sparse(net, loader_te, dev=dev, dropout=args.dropout, progress=True, y_cat_columns=select_cat_ids)
            if args.inverse_normalization == 1:
                regr_out = sc.inverse_normalization(regr_out, mean=np.array(stats["mean"]), variance=np.array(stats["var"]), array=True)

molden commented 2 years ago

@das22 I am implementing the inverse normalisation for dense predictions in the branch https://github.com/melloddy/SparseChem/tree/5-predict-inverse-normalization . Would it be possible to test whether it gives sensible values?
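
For readers following along, the change under discussion amounts to applying the inverse normalization whichever prediction path ran. The sketch below is not the code in that branch: `predict_dense_stub`, `predict_sparse_stub` and `run_prediction` are hypothetical stand-ins made up for illustration, and plain NumPy arithmetic is used in place of the real `sc.inverse_normalization` call. It assumes the stats hold a per-task mean and variance.

```python
import numpy as np

def predict_dense_stub(x):
    # Hypothetical stand-in for the dense prediction path: returns normalized outputs.
    return None, np.asarray(x, dtype=float)

def predict_sparse_stub(x):
    # Hypothetical stand-in for the sparse path; kept dense here to keep the sketch short.
    return None, np.asarray(x, dtype=float)

def run_prediction(x, y_given, inverse_normalization, stats):
    if not y_given:
        class_out, regr_out = predict_dense_stub(x)
    else:
        class_out, regr_out = predict_sparse_stub(x)
    # Placed after the if/else so it also runs when no Y is supplied.
    if inverse_normalization == 1:
        std = np.sqrt(np.array(stats["var"]))
        regr_out = regr_out * std + np.array(stats["mean"])
    return class_out, regr_out

# Toy statistics and normalized predictions (2 compounds, 2 tasks).
stats = {"mean": [5.0, 6.0], "var": [4.0, 1.0]}
x_norm = np.array([[-1.0, 0.5], [1.0, -0.5]])

_, regr_no_y = run_prediction(x_norm, y_given=False, inverse_normalization=1, stats=stats)
_, regr_with_y = run_prediction(x_norm, y_given=True, inverse_normalization=1, stats=stats)
```

With this structure both calls return values on the original scale, so supplying or omitting Y no longer changes whether the flag is honoured.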

das22 commented 2 years ago

@molden The values do indeed look reasonable! I'm no longer seeing negative numbers, the predictions reach larger values and the range is similar to my input training values.

AnsgarSchuffenhauer commented 2 years ago

@molden Are you going to integrate this into master now?

AnsgarSchuffenhauer commented 2 years ago

Just out of curiosity: for the dense case, wouldn't the inverse normalization be a lot easier by simply multiplying the numpy array by the 1-D numpy array of standard deviations and then adding the 1-D numpy array of means, relying on numpy's broadcasting abilities?
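
The broadcasting idea can be sketched with plain NumPy as follows. This is a minimal illustration, not SparseChem's implementation: the helper name `inverse_normalize_dense` and the toy numbers are made up here, and it assumes the statistics are stored as per-task mean and variance (as the `stats["mean"]` and `stats["var"]` keys in the snippet in this issue suggest), so the standard deviation is taken as the square root of the variance.

```python
import numpy as np

def inverse_normalize_dense(y_norm, mean, var):
    """Undo per-task standardisation on a dense (n_compounds, n_tasks) array.

    Hypothetical helper for illustration; not part of SparseChem.
    """
    y_norm = np.asarray(y_norm, dtype=float)
    mean = np.asarray(mean, dtype=float)
    std = np.sqrt(np.asarray(var, dtype=float))
    # The 1-D arrays of shape (n_tasks,) broadcast across the rows of y_norm.
    return y_norm * std + mean

# Toy data: 3 compounds, 2 regression tasks.
mean = np.array([5.0, 6.5])
var = np.array([4.0, 0.25])
y_true = np.array([[7.0, 6.0], [3.0, 7.0], [5.0, 6.5]])

# Normalize, then check that the helper recovers the original values.
y_norm = (y_true - mean) / np.sqrt(var)
y_back = inverse_normalize_dense(y_norm, mean, var)
```

Broadcasting works here because the trailing dimension of the prediction array matches the length of the 1-D statistics arrays, so no explicit loop over tasks is needed.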

molden commented 2 years ago

@AnsgarSchuffenhauer Yes, you are right, this is easier. I will look into optimising this before merging into master.