SciML / Surrogates.jl

Surrogate modeling and optimization for scientific machine learning (SciML)
https://docs.sciml.ai/Surrogates/stable/

Support for categorical factors and discrete numeric factors #393

Open ArnoStrouwen opened 2 years ago

ArnoStrouwen commented 2 years ago

I use Surrogates.jl quite often for chemical reaction optimization. Everything goes well as long as only continuous process factors, such as temperatures, pressures, and concentrations, are involved.

However, almost invariably the question comes up whether we can also try out a different solvent or ligand. Currently, I think Surrogates.jl does not support this?

Similarly, even for factors that are continuous by nature, it often occurs that only 5 different pressures can be practically achieved.

vikram-s-narayan commented 2 years ago

> Currently, I think Surrogates.jl does not support this?

Yes, that is correct. To the best of my knowledge, categorical and discrete numeric factors are not supported at present.

> Similarly, even for factors that are continuous by nature, it often occurs that only 5 different pressures can be practically achieved.

Could you share a minimal working example or dataset so that we can test with various surrogates? Also, if you have gradient information available, you can give GEKPLS a try. GEKPLS is intended to work with high-dimensional datasets.

ArnoStrouwen commented 2 years ago

There is no gradient information; these are noisy physical experiments.

This is a nice paper that involves discrete numeric factors (and categorical if you do not use descriptors for the ligands), with data and code available: https://par.nsf.gov/servlets/purl/10231959 .

The difficulty in porting this to Surrogates.jl is not fitting a surrogate, but running the optimization loop that uses the information captured in the surrogate.
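
To make the point about the optimization loop concrete, here is a minimal sketch of a surrogate-guided loop over a finite set of factor combinations. It is written in plain Python and does not use or mirror the Surrogates.jl API. Everything in it is an assumption made for illustration: the factor levels (`SOLVENTS`, `PRESSURES`), the one-hot encoding of the categorical factor, the inverse-distance-weighted surrogate, and the `explore` bonus used to pick the next experiment. The key idea is that with categorical and discrete numeric factors, the "optimizer" only has to rank the untried candidates instead of searching a continuous box.

```python
import itertools
import math

# Hypothetical factor levels, for illustration only.
SOLVENTS = ["water", "ethanol", "toluene"]   # categorical factor
PRESSURES = [1.0, 2.0, 5.0, 10.0, 20.0]      # discrete numeric factor


def encode(point):
    """One-hot encode the solvent and rescale the pressure to [0, 1]."""
    solvent, pressure = point
    one_hot = [1.0 if solvent == s else 0.0 for s in SOLVENTS]
    lo, hi = min(PRESSURES), max(PRESSURES)
    return one_hot + [(pressure - lo) / (hi - lo)]


def distance(a, b):
    """Euclidean distance between two points in the encoded space."""
    return math.dist(encode(a), encode(b))


def predict(point, data):
    """Inverse-distance-weighted prediction from observed (point, y) pairs."""
    num = den = 0.0
    for x, y in data:
        d = distance(point, x)
        if d == 0.0:
            return y  # exact match: return the observed value
        w = 1.0 / d**2
        num += w * y
        den += w
    return num / den


def optimize(objective, initial, n_iter, explore=0.5):
    """Maximize `objective` over the finite grid of factor combinations."""
    candidates = list(itertools.product(SOLVENTS, PRESSURES))
    data = [(x, objective(x)) for x in initial]
    for _ in range(n_iter):
        seen = {x for x, _ in data}
        remaining = [c for c in candidates if c not in seen]
        if not remaining:
            break  # every combination has been tried

        def score(c):
            # Surrogate prediction plus a bonus for being far from known points.
            return predict(c, data) + explore * min(distance(c, x) for x, _ in data)

        nxt = max(remaining, key=score)
        data.append((nxt, objective(nxt)))
    return max(data, key=lambda pair: pair[1])
```

A call such as `optimize(run_experiment, initial=[("water", 1.0), ("toluene", 20.0)], n_iter=5)` would return the best `(point, value)` pair found, where `run_experiment` is a hypothetical function wrapping the physical measurement. Note that this surrogate interpolates the observations exactly, so for noisy experiments a smoothing model would likely be a better choice; the loop structure would stay the same.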