Open AdeL22222 opened 2 years ago
Can you please make a standalone example that demonstrates this issue? Otherwise it is difficult to know where to look.
I provide 2 test packages (created on Linux), each containing the sources + a test program and + a shell script that builds the test program using intel ifort compiler. At the end of the build, the test program runs 900 cycles of hsflsh, and the program is timed using the Linux system command time. On my machine, test_for_NIST_Refprop9 is timed real 0m5.639s user 0m5.578s sys 0m0.008s test_for_NIST_Refprop10 is timed real 7m2.423s user 7m1.366s sys 0m0.019s I provide 2 test packages (created on Linux), each containing the sources + a test program and + a shell script that builds the test program using intel ifort compiler. At the end of the build, the test program runs 900 cycles of hsflsh, and the program is timed using the Linux system command time. On my machine, test_for_NIST_Refprop9 is timed real 0m5.639s user 0m5.578s sys 0m0.008s test_for_NIST_Refprop10 is timed real 7m2.423s user 7m1.366s sys 0m0.019s test_refprop9.zip test_refprop10.zip
Have you tried to enable the saturation spline?
Are the numerical results different?
Also, it seems you are not doing a release build?
Hello Ian, a) I did not try saturation spline; I used a (fortran) design program that worked well when being liked against the objects of the the refprop 9 sources and exchanged the refprop fortran codes for the version 10 fortran files I was provided with. Since all funcitons required were available, I did not change the calling modes or sequence of functions being called. b) I did not explicitly compare the results. But the results must be almost the same: if a complext turbomachinery iteration with internal convergence limits of 1E-8 requires with both sources precisely the same number of calls, 37thousands and something, the differences must be of the order of 1E-08. c) What are you referring to with "not doing a release build"? I recieved the sources from my collegues responsible for the commercial side (license fees). To generate Linux object files, and for consitency with our internal software development environments. I converted dos-to-unix (line endings) and may have switched the case of the filenames. Otherwise, the content of the files is as provided to me. I understand that the sources I am using are not release sources (hence likely some intermediate developments status?), so I'll double back with my colleagues that they should contact you / NIST to be provided with the latest official fortran sources. Regards Armin
Calling satspln ahead of hsflsh does not significantly change performance. New timing result from 7m2sec down to 6m34sec, while Refprop 9 has 5sec. real 6m34.564s user 6m33.569s sys 0m0.008s
You are not compiling the source files with optimization enabled. Without that, timing comparisons are meaningless.
Agreed - though I cannot imagine that full optimization makes a factor of 100 in the one case and nothing in the other; but I'll let you know. Anyhow, my fortran code is Ver 10.0000 and this seems not to be the latest - windows executable refers to 10.0097. Since I am only the "application developer": what is the correct way to obtain the 10.0097 fortran sources from NIST?
First let's try 10.0 with proper optimization, although I agree that is not likely to be the culprit. If that causes trouble, then we can look into the development versions.
I ran optimization with options -O3 -ip ("-ip" for ifort = unlimited interprocedural optimization) Refprop 9 w/o prior satspln call (within the limitations of the "time" command, the satspln call did not change the execution time) real 0m5.171s user 0m5.108s sys 0m0.010s Refprop 10 with prior satspln call real 6m34.626s user 6m33.490s sys 0m0.129s
And do you get the same results, and number of failures with refprop 9 and 10?
No calling errors within the 900 test samples, maximum relative deviations, Abs(refprop10- refprop9)/refprop10, of the 900 samples by property
t p d q e cv cp w Max D_rel 6.0412E-05 2.2272E-04 1.5875E-04 0.0000E+00 5.9300E-05 9.8168E-05 8.6504E-05 4.8152E-05
Which is more correct in terms of the h(T,d) values after the HS iteration? I noticed that there were some issues with iteration tolerances in REFPROP 10.
Procedure: HSFLSH(h,s) => T,d, with these TDFLSH(T,d) => h1 Refprop 9: max relative enthalpy error (h1-h)/h within test samples 0.00000000993398 Refprop 10: max relative enthalpy error (h1-h)/h within test samples 0.00000000000982
Please note that email responses to the thread have the message history. Either remove the history or response in the web interface.
Using Refprop 10.0098 sources did not change the calculation behavior. Same precision, same calculation time.
I am experiencing a similar issue with very long computational time on HSFLSH function calls. I am calling REFPRP64.dll from within a C# application to get the properties of various hydrocarbon gas blends. The HSFLSH call is embedded within an iterative loop, so the runtime penalty is very annoying.
Observations For the same gas mixture and physical conditions, I measure the following runtimes: TPFLSH = 0.11 seconds PHFLSH = 0.08 seconds HSFLSH = 47.831 seconds
Steps to Reproduce Gas components (mol frac) = 0.001 H2O, 0.002 N2, 0.02 CO2, 0.925 C1, 0.035 C2, 0.006 C3, 0.003 C4I, 0.001 C4, 0.001 C5I, 0.001 C5, 0.001 C6, 0.002 C7, 0.001 C8, 0.001 C9 Pressure = 700 PSIA Temperature = 112 degF Enthalpy = 360.18 BTU/lbm Entropy = 1.045 BTU/lbm-R
System Info Version: Refprop 10 Operating System: Windows 11 Access Method: dll import into C# .NET application
Is there any way you can rewrite your code to not use HS calls? They will always be the most computationally expensive because the shape of curves in H-S coordinates do not lend themselves to easy iteration, and neither is an independent variable of the iteration, and as a result, nested iterations are required.
Unfortunately, avoiding HS calls is sometimes not possible: E.g. in turbomachinery analysis, there are performance modeling concepts that estimate performance loss in terms of entropy production at a given flow condition; that is, the innermost thermal iteration loop is based upon evaluating properties as a function of enthalpy and (correlation based iteratively changing) entropy. Also, in fluid mechanics, conversion from static and stagnation properties requires using the HS functionality - and particularly in turbomachinery applications, these conversions occur whenever changing the frame of reference (stationary blade row to rotating blade row), that is, at least in the 2nd to innermost iteration loop.
However, one could potentially provide an estimate for where the search of the HS-Flash function should start iterating (either stemming from the previous design iteration or some other considerations). Could this be an option to speed up the search? From my background of turbomachinery analysis, providing an optional argument / optional arguments for "initial" p and/or T for the HS search could be possible.
If you know your inputs are single-phase, you could do the iteration yourself with Newton-Raphson. You would just need the analytical Jacobian (trivial to generate with REFPROP via CoolProp because CoolProp implements all single-phase derivatives). And then you would start close to the right answer too, so the iterations should be reliable.
Ian, thanks for that suggestion, I will give it a try. My use case is turbomachinery analysis like AdeL mentioned. The slow HS calls are occurring only at specific pressure, temperature, gas mixture combinations, which led me to believe there was likely something happening in an unstable region of the iterations. Other times, the HS calls are hardly any slower than TP or PH calls.
I looked a bit into HS calls, in a slightly different context: https://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=3228&context=iracc . Ultimately we plan to add something similar to REFPROP, but don't hold your breath.
Can you explain the context in which you have H,S inputs? Would help me motivate work along those lines.
Hello Ian,
a. static => total:
a. calculating the entropy increase due to losses (e.g. during expansion across a turbine blade row or compression in a compressor blade row). b. calculating the resulting thermodynamics change of state across the blade row as a function of conserved stagnation enthalpy/rothalpy and "loss-increased" entropy. Hence every calculation of the exit conditions of the row starts with determination of exhaust properties (typically pressure) as function of conserved stagnation enthalpy/rothalpy and inlet entropy + loss-driven entropy increase. c. Static / stagnation property conversions (point 1) are required on inlet and exhaust side as well.
Kind regards
Armin
Thanks for your response (it is on the web interface too, since you responded to the email). It helps to motivate why we really have to get the H,S inputs working properly and efficiently
Description
Performence decay when switching to Refprop 10 Fortran Sources: Increase of computation time by factor of >100 Performance decay due to performanc issue in routine hsflash
Steps to Reproduce
Composition (mole fractions) oxygen 0.206480E+00 nitrogen 0.770311E+00 co2 0.324970E-03 water 0.138509E-01 argon 0.903313E-02
Default mixing, default reference conditions. Iterative evaluation with parameters within the Temperature range T=300K-450K, and pressure range p=100-400kPa (expansion in starting at ~450K, 3.5bar to ambient). List of calling arguments of the 31717 calls to hsflash can be provided, if required.
Expected behavior:
Performance comparable to Refprop 9: Reference profiling results, extract from callgraph
Actual behavior: [What actually happens]
Profiling results of Refprop 10 Code (gprof), extract from callgraph.
As can be seen, the time spent in hsflsh increases (with the same number of calls) from 29.33s to 13348.5s. That is, computation time increases by a factor of ~500. Computation time primarily due to computation time increase of children, thereof primarily the calls to sath, though the number of calls of sath is constant. Amongst the children of sath, the drasitically increasesed number of calls to satt from 54 to 1649286 is striking. But also the number of calls to ispure within hsflsh increased from 1.4 Million to 25Million by a factor >10.
Gprof explanation of table Each entry in this table consists of several lines. The line with the index number at the left hand margin lists the current function. The lines above it list the functions that called this function, and the lines below it list the functions this one called. This line lists: index A unique number given to each element of the table. Index numbers are sorted numerically. The index number is printed next to every function name so it is easier to look up where the function is in the table.
For the function's parents, the fields have the following meanings:
If the parents of the function cannot be determined, the word
<spontaneous>' is printed in the
name' field, and all the other fields are blank.For the function's children, the fields have the following meanings:
If there are any cycles (circles) in the call graph, there is an entry for the cycle-as-a-whole. This entry shows who called the cycle (as parents) and the members of the cycle (as children.) The `+' recursive calls entry shows the number of function calls that were internal to the cycle, and the calls entry for each member shows, for that member, how many times it was called from other members of the cycle.
Versions
REFPROP Version: Refprop 10 Fortran Sources vs Refprop 9 Fortran Sources Operating System and Version: Linux 3.10.0-1160.53.1.el7.x86_64 #1 SMP Fri Jan 14 13:59:45 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux Access Method:
Compilation of Fortran source code and linking against proprietary code using intel ifort2021.
Additional Information
If possible, please post examples and/or screenshots of the issue.