Evaluating properties based on an array of input parameters

CalebBell / thermo

Thermodynamics and Phase Equilibrium component of Chemical Engineering Design Library (ChEDL)

MIT License


Evaluating properties based on an array of input parameters #34

Closed ma-sadeghi closed 2 years ago

ma-sadeghi commented 4 years ago

For example, if I have a large numpy array of temperature values (say, a million elements long), is there a vectorized way to extract properties from thermo, other than calling thermo in a for loop?

Many thanks, Amin

CalebBell commented 4 years ago

Hi Amin,

There is nothing quite like that in thermo today. In general, properties can depend on each other in weird and confusing ways; they tend to be serial to evaluate, with lots of if statements.

However, if you are looking for more speed, it is quite possible to call the underlying methods directly. You can see how calculations are made by stepping through them with a debugger, or by reading the MixtureProperty, TDependentProperty, and TPDependentProperty classes. By calling directly into their mixture_property, calculate, or calculate_P methods respectively, you can avoid calculating properties which you do not need.

I might be able to advise better if there is a specific scenario here.

Sincerely, Caleb

ma-sadeghi commented 4 years ago

Hi @CalebBell,

Thank you for your reply. We're looking for a reliable and fast package to generate physical properties as input for our library, OpenPNM, which performs pore-scale simulations of transport phenomena in porous materials. In a typical OpenPNM simulation, we need to evaluate properties such as density, diffusivity, etc. at each computational node, and the number of nodes can range from 10^1 to 10^8.

So, for instance, suppose we have N computational nodes. If density is a function of temperature and pressure, we'd have two N-element-long numpy arrays corresponding to the temperature and pressure of individual nodes. We're interested in generating an N-element-long numpy array for density, such that each element corresponds to a temperature-pressure pair.
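The interface described above can be sketched as follows. This is only an illustration: `density_scalar` is a hypothetical stand-in (ideal-gas density of air) for whatever scalar property call is actually used, and `density_over_nodes` is a made-up helper name, not part of thermo. The explicit loop is the baseline that a vectorized interface would replace.

```python
import numpy as np

R = 8.314462618  # molar gas constant, J/(mol*K)


def density_scalar(T, P, M=0.02897):
    """Hypothetical stand-in for a scalar property call.

    Returns the ideal-gas density in kg/m^3 for temperature T (K),
    pressure P (Pa), and molar mass M (kg/mol).
    """
    return P * M / (R * T)


def density_over_nodes(T, P):
    """Evaluate the scalar function once per (T, P) pair in a plain loop."""
    T = np.asarray(T, dtype=float)
    P = np.asarray(P, dtype=float)
    if T.shape != P.shape:
        raise ValueError("T and P must have the same shape")
    values = (density_scalar(t, p) for t, p in zip(T.ravel(), P.ravel()))
    return np.fromiter(values, dtype=float, count=T.size).reshape(T.shape)


# One temperature and one pressure per computational node.
T_nodes = np.linspace(280.0, 360.0, 5)
P_nodes = np.full(5, 101325.0)
rho_nodes = density_over_nodes(T_nodes, P_nodes)
```

The result has the same shape as the inputs, with each element corresponding to one temperature-pressure pair.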

Best, Amin

CalebBell commented 4 years ago

Hi Amin,

My experience with CFD suggests tabular interpolation is the way to go. Calculating these properties will definitely be one of the slowest steps, and applying simplifications is normal - calling thermo directly would be too slow no matter what happened.

So you might make a lookup table over, say, T, P, and composition, and fill it out with values from thermo. Then during the simulation, interpolate between the nearest tabulated values. Some of numpy's interpolation routines might be suitable, but more likely they would also be too slow.
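A minimal sketch of the tabulate-then-interpolate idea, assuming a fixed composition so the table is two-dimensional in T and P. The function `density_scalar` is a hypothetical ideal-gas stand-in for the expensive property call, and `build_table` and `interpolate` are illustrative names. The interpolation is a hand-written bilinear scheme on a regular grid, not a routine from thermo.

```python
import numpy as np

R = 8.314462618  # molar gas constant, J/(mol*K)


def density_scalar(T, P, M=0.02897):
    """Hypothetical stand-in for an expensive scalar property call."""
    return P * M / (R * T)


def build_table(T_grid, P_grid):
    """Fill the lookup table once, before the simulation starts."""
    table = np.empty((T_grid.size, P_grid.size))
    for i, T in enumerate(T_grid):
        for j, P in enumerate(P_grid):
            table[i, j] = density_scalar(T, P)
    return table


def interpolate(T_grid, P_grid, table, T, P):
    """Bilinear interpolation for arrays of (T, P) query points.

    Points outside the grid are extrapolated linearly from the edge cell.
    """
    T = np.asarray(T, dtype=float)
    P = np.asarray(P, dtype=float)
    # Index of the lower corner of the grid cell containing each point.
    i = np.clip(np.searchsorted(T_grid, T) - 1, 0, T_grid.size - 2)
    j = np.clip(np.searchsorted(P_grid, P) - 1, 0, P_grid.size - 2)
    # Fractional position of each point inside its cell.
    tT = (T - T_grid[i]) / (T_grid[i + 1] - T_grid[i])
    tP = (P - P_grid[j]) / (P_grid[j + 1] - P_grid[j])
    return ((1 - tT) * (1 - tP) * table[i, j]
            + tT * (1 - tP) * table[i + 1, j]
            + (1 - tT) * tP * table[i, j + 1]
            + tT * tP * table[i + 1, j + 1])


T_grid = np.linspace(250.0, 400.0, 151)  # 1 K spacing
P_grid = np.linspace(5.0e4, 2.0e5, 16)   # 10 kPa spacing
table = build_table(T_grid, P_grid)

T_query = np.array([300.5, 351.2])
P_query = np.array([101325.0, 150000.0])
rho_interp = interpolate(T_grid, P_grid, table, T_query, P_query)
```

The grid spacing sets the accuracy, so the table resolution would need to be checked against the real property's curvature, particularly near phase boundaries where properties change abruptly.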

Sincerely, Caleb

alexchandel commented 4 years ago

@ma-sadeghi If the portion of thermo you want to use is small enough, you could try locally modifying it to support numpy arrays, which would give you vectorization directly. You could also use scipy's solver routines. Numpy and scipy can provide significant performance boosts.
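To illustrate what such a local modification might involve, here is a sketch that uses a made-up piecewise correlation (the coefficients are invented for the example and do not come from thermo). The scalar version branches with an `if` statement, which fails on arrays; the array version evaluates both branches and selects between them with `np.where`.

```python
import numpy as np


def prop_scalar(T):
    """Made-up piecewise correlation, written for a single temperature."""
    if T < 300.0:
        return 1.0 + 0.002 * T
    return 1.6 + 0.001 * (T - 300.0)


def prop_vectorized(T):
    """Same correlation, rewritten to accept numpy arrays."""
    T = np.asarray(T, dtype=float)
    low = 1.0 + 0.002 * T             # branch valid for T < 300 K
    high = 1.6 + 0.001 * (T - 300.0)  # branch valid for T >= 300 K
    return np.where(T < 300.0, low, high)


T = np.array([250.0, 299.9, 300.0, 350.0])
values = prop_vectorized(T)
```

Note that `np.where` computes both branches for every element, so a branch that is only valid on part of the range (for example, a logarithm of a quantity that can go negative) would need masking instead.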

CalebBell commented 2 years ago

Hi,

I added an example of using multiprocessing to speed up multiple phase equilibria calculations: https://thermo.readthedocs.io/Examples/Performing%20Large%20Numbers%20of%20Calculations%20with%20Thermo%20in%20Parallel.html

I hope this is helpful. I don't have any plans to vectorize calculations further or to add a convenience wrapper which accepts numpy arrays as inputs. I don't know what broadcasting rules would be best for that sort of interface, and I don't think using a loop is too onerous for users. The numba backend will hopefully continue to be developed for those who want to use numba's parallelization features. I can try to answer specific questions about parallelism here if you have any.
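The general pattern of spreading independent calculations over processes can be sketched with the standard library alone. This is not the code from the linked example: `density_scalar` is a hypothetical ideal-gas stand-in for a real property or flash calculation, and `density_chunk`, `split`, and `density_parallel` are illustrative names.

```python
import math
from concurrent.futures import ProcessPoolExecutor

R = 8.314462618  # molar gas constant, J/(mol*K)


def density_scalar(T, P, M=0.02897):
    """Hypothetical stand-in for an expensive scalar calculation."""
    return P * M / (R * T)


def density_chunk(chunk):
    """Worker: evaluate every (T, P) pair in one chunk serially."""
    return [density_scalar(T, P) for T, P in chunk]


def split(items, n_chunks):
    """Split a list into at most n_chunks contiguous pieces."""
    size = math.ceil(len(items) / n_chunks)
    return [items[k:k + size] for k in range(0, len(items), size)]


def density_parallel(T_values, P_values, workers=4):
    """Evaluate all pairs using a pool of worker processes."""
    pairs = list(zip(T_values, P_values))
    if not pairs:
        return []
    chunks = split(pairs, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(density_chunk, chunks))
    # Flatten the per-chunk results back into input order.
    return [rho for chunk_result in results for rho in chunk_result]


if __name__ == "__main__":
    T_values = [280.0 + 0.01 * k for k in range(10_000)]
    P_values = [101325.0] * len(T_values)
    rho = density_parallel(T_values, P_values)
    print(len(rho), rho[0])
```

Sending a few large chunks rather than one task per point keeps the inter-process overhead small. For a function as cheap as this stand-in, the serial loop would likely be faster; the pattern only pays off when each evaluation is expensive.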

Sincerely, Caleb