Hi, @Sigrsteinn,
Usually it's on par for function calls and the actual solution is a bit faster. But I guess it depends a lot on your code, since some users reported it to be faster. Someday we should have a benchmark suite here. For example, DSS_Python is typically 5x, up to 100x, faster than win32com. In .NET, DSS_Sharp is also faster since we don't need dynamic variables (which variant usually does).
For DSS_MATLAB, we do have a lot more error checking. If you're using too many API calls and they're all checked on the MATLAB side, I imagine it might have a negative impact, especially since MATLAB is not exactly fast for non-numeric code. If that is indeed the case, I could add an alternative that does no checks. (Since MATLAB users include lots of OpenDSS beginners, it's an important feature.)
More recent versions of MATLAB have faster COM implementations too. Years ago, CallLib calls were 10x faster, but nowadays it seems MATLAB has more optimized COM calls; it's probably already using early binding behind the scenes. Most information regarding this on the forums and docs is probably out of date -- for example, the early-binding doc is from 2015, while some of the MATLAB changes landed maybe in 2018 (not sure myself).
It didn't work because I don't know how to handle the "variant" data type in Matlab.
Yep, using variant structures in a non-COM DLL was a bad idea IMHO. That's one of the reasons I started DSS C-API in the first place (more on https://sourceforge.net/p/electricdss/discussion/861976/thread/525c13df/).
I assumed that this package is similar in concept, and therefore faster than the ActiveX COM interface method.
At first glance, yes, it would be similar in concept to the official OpenDSSDirect library. Yet, OpenDSSDirect.py and OpenDSSDirect.jl were both migrated from the official library to DSS C-API due to several issues (many related to variant issues, general bugs, and platform-specific bugs), some of which would still be present today. DSS C-API provides a full header with plain C interface, no variants: https://github.com/dss-extensions/dss_capi/blob/master/include/dss_capi.h
A lot of the functions my team uses (both API and internal OpenDSS code) have been optimized and we'll keep doing that for a couple more years, at least.
The faster alternative to CallLib is using MEX code, which is cumbersome, especially since this is a tiny open-source project. Besides writing a lot of code, we'd need to build the binaries for all platforms (Windows, Linux, macOS) and multiple MATLAB versions. Maybe it's a good idea to test it for benchmarking, to get an idea of how much faster it could get.
Since my code uses "DSSStartup.m", I replaced the object instantiation as suggested. I didn't change anything else in my code.
I'm pretty happy at least that worked without issues 😃
When I run my code, it is considerably slower than when I used the original actxserver method. What am I doing wrong here?
In the end, it's hard to say without checking the code. If you can share a small code sample (either here or privately through email), I could try to investigate it. Depending on what we find, I could update the code in DSS_MATLAB, add additional helper functions in DSS C-API, or just propose you change some aspects of your code.
Thank you for your reply. As a noob, I've never heard of MEX. If even you as the developer of this package say that MEX is cumbersome, I'll just avoid it for now. I don't really follow Matlab's development, but I think you're right about its COM implementations getting faster. I just checked that the document about early and late binding is indeed from 2015.
This is the part of my code that uses the package.
function [Pstat,Vstat]=runDSS(obj) % this function is called 100 times
    Volt=inf(obj.allc2(2),obj.countBus*3); % preallocation
    Loss=0;
    for i=1:obj.allc1(1) % for i=1:22
        obj.DSSText.Command=char(obj.strCmd1(i)); % 2200 times
    end
    for hr=1:obj.allc2(2) % for hr=1:24
        for k=1:obj.allc2(1) % for k=1:214
            obj.DSSText.Command=char(obj.strCmd2(k,hr)); % 513,600 times
        end
        VmagPU=obj.DSSCircuit.AllBusVmagPu;
        Volt(hr,1:length(VmagPU))=VmagPU;
        Loss=Loss+(obj.DSSCircuit.Losses(1)/1000);
    end
    VV=reshape(Volt,1,[]);
    VV(VV==inf)=[];
    Vmin=min(VV(VV~=0));
    Vmax=max(VV);
    Vdev=sum(sum((VV-1).^2));
    Vstat=[Vmin Vmax Vdev];
    Pstat=[Loss obj.totalPV obj.totalLD];
end
obj.strCmd1 and obj.strCmd2 are double-quoted string arrays containing the commands (in DSS Scripting Language).
I used these double quote strings because I need to do some vectorized string manipulation. It causes errors with single quote character vectors. The following two images show the result from the profiler. The first image shows the summary, the second image shows the details about IText.set.Commands.
As a comparison, the following images show the profiler using the actxserver method
I am using Matlab R2020a for the test. I tested it with R2020b as well with similar result.
@Sigrsteinn Interesting, thanks.
It looks like the error checking is not the main culprit. Yet, since it still accounts for a few percent and it's a trivial function call, it's worth making it better. Since my other comment, I noticed there is a faster alternative that I forgot to implement here; I'll try that later today or tomorrow.
On the main issue, you generate a long DSS script and pass it all line-by-line, right? I think that's a bit unconventional. In those DSS commands, do you create new DSS elements or just update them?
Independent of COM or DSS_MATLAB, have you tried writing those lines to a file and then running a ...Text.Command = 'compile file.dss' instead? I imagine it should be faster (remember to whitelist the folder where the file is created in your antivirus).
I'll try to run some tests that mimic your code -- load a large circuit in memory, feed it line by line to OpenDSS, loop that. It's still possible that your DSS script itself is hitting some slow code in DSS C-API, i.e., it might not be specific to the API calls or MATLAB. Testing should help make this clearer.
On the MEX vs. CallLib topic, this comment has some interesting data: https://github.com/CoolProp/CoolProp/issues/1095#issuecomment-224543225 (it's from 2016 though, take it with a grain of salt).
As a sidenote, someday we'll probably add a DSS.Text.Commands (plural) to consume a huge string. We probably have a ticket/comment about it in one of the repositories under dss-extensions.
The code is about evaluating the effects of PV installations in different buses with different sizes. The common information about the circuit is already in a single script file, while the commands that I send line-by-line are just the part that changes every iteration. I also tried your suggestion of writing the lines into a .dss file and running that file.
I changed my code so there are two separate functions for the multiline DSS scripts. The first one here is my for-loop method
function script1(obj,allc,fileName,trgt)
    % Send the commands one by one (fileName is unused in this variant).
    for i=1:allc
        obj.DSSText.Command=char(trgt(i));
    end
end
The second one shown below is where the script is written into a file first before being executed
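function script2(obj,allc,fileName,trgt)
    % Write the commands to a file, then run the whole file in one call.
    pathScr=fullfile(pwd,fileName); % fullfile adds the file separator if it is missing
    fID=fopen(pathScr,'w');
    if fID==-1
        error('Could not open %s for writing.',pathScr);
    end
    fprintf(fID,'%s\n',trgt); % one command per line
    fclose(fID);
    obj.DSSText.Command=sprintf('compile (%s)',pathScr);
end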
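The file-based approach can also be tried in isolation, without the class wrapper. The following is only a minimal sketch: the two DSS commands, the file name `batch_demo.dss`, and the use of `tempdir` are placeholders chosen for illustration, and the final call into OpenDSS is left commented out because it assumes an already instantiated `obj.DSSText`, as in the functions above.

```matlab
% Hypothetical commands standing in for one iteration's worth of changes.
cmds = ["New Load.demo1 bus1=b1 kW=10"; "New Load.demo2 bus1=b2 kW=20"];

% Build the path with fullfile so the separator is always correct.
scriptPath = fullfile(tempdir, 'batch_demo.dss');

fid = fopen(scriptPath, 'w');
assert(fid ~= -1, 'Could not open the script file for writing.');
fprintf(fid, '%s\n', cmds);   % the format is reused for each string, one per line
fclose(fid);

% A single Text command then replaces one call per line.
compileCmd = sprintf('compile (%s)', scriptPath);
% obj.DSSText.Command = compileCmd;   % assumes obj.DSSText already exists
```

Writing to a local temporary folder avoids the synced-folder and antivirus overhead mentioned in this thread, although whether `tempdir` itself is scanned depends on the machine.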
To test the new method, I ran the profiler for 4 code variants:
At first, the results seemed to show that writing the commands into a file made it slower, as shown below:
After Google Drive sync is paused and the folder is whitelisted, writing the commands into a .dss script is indeed the faster method for this package:
For the actxserver method though, sometimes the for loop method is faster. I don't know why that happens.
@Sigrsteinn It could be the profiler overhead or some background process.
I wrote a simple benchmark that loads (no solve or calcvoltagebases) the IEEE 8500 node circuit; here's what I got for 10 runs (it uses tic and toc to measure the total time).
[...] Text_Set_Command and checking for errors.
times_com =
    1.3613    4.0668
times_capi =
    1.1041    5.1506    2.0569    1.0068
ratio_best =
    0.7396
My conclusion is that MATLAB has a lot of overhead for function calls, be they COM calls or plain C calls via CallLib. I'm now considering investing some time later to create a full implementation of DSS_MATLAB using MEX to reduce the overhead. Probably won't happen for a while though (at least some weeks).
My computer at the university still has MATLAB 2018a but I requested the latest installation available and will retry this simple benchmark when I'm able. I'm not sure the prebuilt MEX files would work on other versions of MATLAB, but I uploaded my test MEX files here if you want to give them a try: dss_capi_mex_test.zip. To test them, drop them inside the +DSS_MATLAB folder (which contains the dss_capi_v7.dll). Instead of DSSText.Command = cmd; use DSS_MATLAB.dss_text_set_command(cmd). For the C loop version, use DSS_MATLAB.dss_text_set_commands(cmds). With a bit of luck, it could work, especially since very few MATLAB functions are used in the code.
I decided to check most options to get a clearer overall picture of the speed of each.
On the table (running on MATLAB 2018a):
- [...] actxserver. No extra error checking.
- [...] calllib. No extra error checking.
- [...] calllib.
- [...] calllib calls directly, no extra error checking.
- [...] calllib. Error checking is included here.

| Test | COM | DirectDLL calllib | DSS_MATLAB | DSS_MATLAB vs COM | DSS C-API calllib | DSS C-API calllib vs COM | DSS C-API MEX | DSS C-API MEX vs COM |
|---|---|---|---|---|---|---|---|---|
| Run script in a single .DSS file | 1.3636 | 1.2941 | 1.1537 | 85% | 1.0884 | 80% | - | - |
| Run script line-by-line (from MATLAB) | 4.0192 | 2.0579 | 5.1038 | 127% | 1.8869 | 47% | - | - |
| Read dssCircuit.AllBusVolts | 0.8856 | - | 0.1365 | 15% | - | - | - | - |
| Read dssLoads.Name (all loads, one by one) | 0.2259 | 0.1255 | 0.1483 | 66% | 0.1136 | 50% | - | - |
| Write dssLoads.Name | 0.3632 | 0.3026 | 0.6242 | 172% | 0.1063 | 29% | - | - |
| Iterate through loads | 0.0761 | 0.0379 | 0.0508 | 67% | 0.0373 | 49% | - | - |
| Read dssLoads.kW (all loads, one by one) | 0.1732 | 0.0824 | 0.1714 | 99% | 0.0791 | 46% | - | - |
| Set all dssLoads.kW | 0.2119 | 0.0934 | 0.5830 | 275% | 0.0864 | 41% | - | - |
| Set single dssLoads.kW | 0.0078 | 0.0045 | 0.0414 | 531% | 0.0041 | 53% | - | - |
| Set single dssLoads.kW (using MEX) | - | - | - | - | - | - | 0.0028 | 36% |
| Run script line-by-line (using MEX in a MATLAB loop) | - | - | - | - | - | - | 2.0627 | 51% |
| Run script line-by-line (using a C loop in MEX) | - | - | - | - | - | - | 0.9910 | 25% |
Some observations:
- [...] dssCircuit.AllBusVolts, you can see that DSS_MATLAB can help a lot in some scenarios.

For the time being, my recommendation would be to use DSS_MATLAB, profile (like you already did, @Sigrsteinn), and try to use some calllib on the hot loops. E.g., provided you already instantiated DSS_MATLAB, you can replace
DSS.Text.Command = char(cmd);
with
calllib('dss_capi_v7', 'Text_Set_Command', char(cmd));
This will not have the extra error checking but should be the fastest solution besides MEX. For simple calls (single char array, integer, or double arguments) this is feasible. For functions that return arrays/pointers, it becomes more cumbersome. Most function names are straightforward and you can check them in the matching header file for the DSS C-API version, e.g. https://github.com/dss-extensions/dss_capi/blob/0.10.6/include/v7/dss_capi.h -- I'll make sure to include this in the future releases.
As I mentioned in my previous message, I'll redo this on the latest MATLAB whenever I get it, and in the future investigate MEX further -- the MEX C++ alternative in MATLAB 2020a looks much better than the C alternative. My plan would be to keep the current calllib implementation as it doesn't require the users to compile anything, but complement it with MEX alternatives if the users need faster code and have a compiler installed.
I'm not sure the prebuilt MEX files would work on other versions of MATLAB, but I uploaded my test MEX files here if you want to give them a try: dss_capi_mex_test.zip.
I just tried the MEX files. The DSS_MATLAB.dss_text_set_command(cmd) method works, but as you said, it is rather slow. In my case, it's somehow slower than Matlab's actxserver. The DSS_MATLAB.dss_text_set_commands(cmds) method does not work for me. It causes an access violation that forces Matlab to close.
For the time being, my recommendation would be to use DSS_MATLAB, profile (like you already did, @Sigrsteinn), and try to use some calllib on the hot loops.
I also did this. It is faster than the line-by-line MEX some of the time, but the calllib method is still slower than actxserver on my machine.
I don't know what's wrong. Maybe there's something wrong with my PC, maybe I need to run more iterations to get more accurate results for each method. But for now, I need to focus on something else. I will probably try this again some time in the future.
Thank you for your help.
@Sigrsteinn If your timings always come from the profiler, try running with the profiler disabled. With profile on:
Thus, when using the profiler, the timings are skewed a lot towards COM. A simple tic + toc should not affect the timings though -- use that instead and DSS_MATLAB should be faster.
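As a rough illustration of the tic/toc approach, the sketch below times a workload several times and keeps the best run, which reduces the influence of background processes. The `workload` handle here is a hypothetical stand-in (pure numeric work); in practice it would wrap the DSS calls being compared, and `nRuns` is an arbitrary choice.

```matlab
% Hypothetical workload; replace with a handle that wraps the DSS calls to time.
workload = @() sum(sqrt(1:1e5));

nRuns = 10;                 % arbitrary number of repetitions
times = zeros(1, nRuns);
for r = 1:nRuns
    t0 = tic;               % start a dedicated timer for this run
    workload();
    times(r) = toc(t0);     % elapsed seconds for this run
end

bestTime = min(times);      % the best run is least affected by background noise
fprintf('best of %d runs: %.6f s\n', nRuns, bestTime);
```

Running the same loop once for the actxserver object and once for DSS_MATLAB, with the profiler off, gives numbers that can be compared directly.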
Performance across multiple MATLAB versions varies widely, see this for an overview: https://stackoverflow.com/a/1745686
Many DSS_MATLAB properties are faster, some are slower.
- Reading Lines.Name takes 30% of the time in DSS_MATLAB, but writing takes 50% more time than the COM implementation.
- [...] DSSObj.ActiveCircuit.AllBusVolts, are also faster, taking around 50% of the time. This is true even when our (API extension) DSSObj.AdvancedTypes = 1, that is, when the plain array is converted to a complex number array.
- Running a compile or redirect is around 40% faster for the test circuit ckt5. Of course it depends on the system, but usually you can expect better performance with DSS_MATLAB.
- A solve mode=daily is typically faster too.

All of this holds even though we add more error checking.
So, in the end, you'll probably end up with faster scripts in certain conditions, or slower if you hit the slower properties. It all depends on how you wrote the script in the first place. Looking for MATLAB code that uses OpenDSS, there are some very bad examples that instead of using the dedicated API functions, just use the Text interface, only to then post-process the Text.Result string into numeric values. For example, there is no need to use the Text interface to get the buses of all lines since you can use DSSObj.ActiveCircuit.Lines.Bus1 (and Bus2), or more generally DSSObj.ActiveCircuit.ActiveCktElement.BusNames.
Regardless of the interface being used, the general advice for any language is still valid: try to avoid strings when possible and you'll get results faster. For example, consider that strings are generally arrays of characters of varying sizes, so if you activate lines by name in a loop, you're copying all those strings around, including potential reencoding, etc., while if you activate lines by index, a single integer is being copied.
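To make the last two points concrete, here is a minimal sketch of collecting the buses of all lines through the dedicated Lines interface, iterating with First/Next so that only integers cross the API boundary to select each element. It assumes `DSSObj` is an already created DSS_MATLAB (or COM) object with a circuit compiled, so it will not run on its own; the variable names are illustrative only.

```matlab
% Assumption: DSSObj already exists and a circuit has been compiled.
Lines = DSSObj.ActiveCircuit.Lines;

nLines = Lines.Count;
bus1 = cell(nLines, 1);      % preallocate the result containers
bus2 = cell(nLines, 1);

k = 0;
idx = Lines.First;           % activates the first line; 0 means there are none
while idx > 0
    k = k + 1;
    bus1{k} = Lines.Bus1;    % read directly, no Text.Result parsing
    bus2{k} = Lines.Bus2;
    idx = Lines.Next;        % activates the next line; 0 at the end
end
```

Compared with sending a Text command per line and parsing Text.Result, this avoids building, passing, and decoding a command string for every element.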
And again, regardless of the interface, it doesn't seem like the users quite understand how the classic OpenDSS API organization works. I guess that's something we can try to help with though, by adding an overview document.
Per #13, we still want to add a MEX option, but it certainly is not because this package "runs slower than actxserver".
Hello, I am sorry if this is a stupid question, but: I need to run OpenDSS with Matlab. I profiled my code with Matlab, most of the time is spent on sending the commands through OpenDSS COM interface. I read up about early binding and late binding in OpenDSS Documentation. I tried the early binding method with DCSL as seen in the documentation to speed up my code. It didn't work because I don't know how to handle the "variant" data type in Matlab.
That's when I found this Matlab package on GitHub. I see that the code in this package is somewhat similar to the DCSL method, particularly the "calllib(libname,command,arg)". I assumed that this package is similar in concept, and therefore faster than the ActiveX COM interface method. I read the usage guide. Since my code uses "DSSStartup.m", I replaced the object instantiation as suggested. I didn't change anything else in my code.
When I run my code, it is considerably slower than when I used the original actxserver method. What am I doing wrong here?