Open pgagarinov opened 9 years ago
1) ContSingleTubeControl should use precision properties from elltool.conf.Properties when solving ODE in ContSingleTubeControl.getTrajectory 2) What is switchSysTimeVec in ContSingleTubeControl? It is not clear why we need to specify it at all, after all all this information is available in ReachObject so we can pass this information to ContSingleTubeControl construcotr, not in getTrajectory object. 3) All properties of ContSingleTubeControl class are public, why? I think they should be private. Same in ControlVectorFunct - also a lot of public properties 4) Help headers in ControlVectorFunct and ContSingleTubeControl needs to be cleaned a bit just like I did it in ContControlBuilder. We need to pay attention to tabs. Outputs cannot be regular or not regular (they are always optional so we do not write "regular" for outputs). Also all inputs and outputs should have dimensionality and type specified in formal form. 5) getTrajectory returns a structure with is a bad practice, let us return two separate outputs if we really need to. Also names of some variables like "trajectory_time" need to be fixed. Finally, is there any way to tell what is a number of elements in trajectory_time will be in advance? A user needs to know what number of elemens to expect 6) Length of all lines in ContSingleTubeControl and ControlVectorFunct should not exceed 76 symbols. 7) Please do not use ' for transposition (like in x0Vec'), use either "transpose" or .' 8) Control synthesis classes do not know anything about tests (encapsulation!!!). This is why error messages like this
throwerror('TestFails', ['the result of test does not ',...
'correspond with theory, current scalar production is ',...
num2str(currentScalProd), ' while isX0inSet is ', num2str(isX0inSet), ';', num2str(iSwitchBack) ]);
just do not make any sense. We need to have user-friendly messages that help a user, not a test. 9) No need to use cat(1,trajectory,odeResMat) since we have []/horzcat and vertcat. 10) Why do we need to specify isX0inSet in getTrajectory? 11) getControlFunction and getProperEllTube methods in ContSingleTubeControl need to be documented 12) ContSingleTubeControl contains a description of indTube input argument that is not passed into the constructor.
all 12 requests satisfied.
about 5: the length is defined by ode45 function. but it is possible, for example, add an optional input with desired timeVec, in this case the length would be predictable
never mind
High priority:
13) Control synthesis for continuous systems for 3 configurations takes too long, an extensive optimization is required. Right now the tests for continuous systems are disabled in elltool.control.test.run_tests
14) We need to incorporate a timeout property into getTrajectory method using the approach described in http://www.google.com/url?q=http%3A%2F%2Fwww.mathworks.com%2Fmatlabcentral%2Fnewsreader%2Fview_thread%2F119565&sa=D&sntz=1&usg=AFQjCNE8SWFt1cCi-LyI37TZSJYgt9rQLw . This way we will at least know which configurations are too time consuming to test.
15) constants with values returned by methods are not constants: REL_TOL = elltool.conf.Properties.getRelTol(); ABS_TOL = elltool.conf.Properties.getAbsTol();
thus REL_TOL->relTol, ABS_TOL->absTol
Low priority:
16) currentScalProduct should not be calculated if isX0inSet==false (unnecessary calculation). We need nested IFs to solve that.
currentScalProd = dot(odeResMat(end,:).'-q1Vec,...
q1Mat\(odeResMat(end,:).'-q1Vec));
%
if (isX0inSet)&&(currentScalProd > 1 + ERR_TOL)
throwerror('TestFails', ['the result of test does '...
'not correspond with theory, current scalar ' ...
'production at the end of system switch ' ...
'interval number ', num2str(iSwitchBack), ' is '...
num2str(currentScalProd), ', while the original'...
' solvability domain actually contains x(t0)']);
end
17) There is a definite similarity between classes for continuous and discrete systems, we need to move a bunch of code into parent abstract classes
18) Discrete systems are not tested with outer points.
19) Code for discrete systems needs a great amount of cleaning, help headers are not implemented.
20) We need to get rid of hard-coded constant CRel = 10000; in C:\Users\Administrator\Documents_Git\ellipsoids\products+elltool+control\DiscSingleTubeControl.m
21) Bad names in ControlVectorFunc:
t1 = max(probTimeVec);
t0 = min(probTimeVec);
22) The following block of code needs to be refactored so that curProbDynObj is always defined (or exception is thrown). Right now curProbDynObj can be potentially undefined.
for iSwitch = length(self.probDynamicsList):-1:1
probTimeVec = ...
self.probDynamicsList{iSwitch}.getTimeVec();
if ( ( curControlTime < probTimeVec(end) ) && ...
( curControlTime >= probTimeVec(1) ) || ...
( curControlTime == tEnd) )
curProbDynObj = self.probDynamicsList{iSwitch};
curGoodDirSetObj = self.goodDirSetList{iSwitch};
t1 = max(probTimeVec);
t0 = min(probTimeVec);
break;
end
end
23) We need to add logging in control synthesis classes so that we see what is going on. Logging needs to be done via modgen.logging
logger=elltool.logging.Log4jConfigurator.getLogger; logger.info
or
logger.fatal
See Wiki for more details about logging.
24) Many tests for discrete systems fail, they have to be fixed
High priority tasks 14), 15) and low priority tasks 16) and 23) done. Also rebased to current master. Currently tests for continious systems work ~0.5 h and finish mostly with errors (and with huge amount of timeouts). I'll continue work with this issue.
Please commit your changes when your ready - I'll take a look. And please do not forget to specify issue number in your commit message like this issue #123
, otherwise GitHub won't be able to tie your commits to the task and show them here. Thanks.
I've started profiling ode calculations. First conclusions:
And also I'm not quite sure if 15 secons is enough for calculations there. Should I set a larger timeout?
Here is a way to fix it:
1) Extract Q(t) and q(t) for each tube for each span in the constructor of ControlVectorFunct and construct QArray and aMat for the whole time span [0,T] just by using built-in cat
function in Matlab.
2) Use our matrix interpolant library gras.mat.interp and call
bigQInterpObj=gras.mat.interp.MatrixInterpolantFactory.createInstance('posdef_chol',...)
to construct a matrix interpolant for QArray and
aInterpObj=gras.mat.interp.MatrixInterpolantFactory.createInstance('column',...)
to construct a vector interpolant for aMat
3) Just call bigQInterpObj.evaluate(t) and aInterpObj.evaluate in ControlVectorFunct.evaluate.
This will be much much faster because a) MatrixInterpolantFactory pre-calculates interpolant coefficients b) interpolation will be vectorized, not element wise c) there will be no this ugly code in ControlVectorFunct.evaluate for selecting a time span because both bigQInterpObj and aInterpObj will already correspond to [0,T] time span.
you can increase this timeout for now but I believe that once you fix these problems 15 seconds is going to be more than enough.
Thank you for the idea, that really speeded up the calculations - but not by orders. Still most of the tests are getting timeouts and therefore wrong answers (mostly because of Q(t) interpolation, and also because of time-consuming function "lib.modgen.common.parseparext" used by "gras.gen.MatVector" used by "gras.mat.MatrixPosReg" used by ControlVectorFunct.evaluate for getting BPBTransDynamics to calculate control bounds). However, I extracted Q(t) only from first tube (in fact, even before my refactoring only first tube was used everywhere through the ControlVectorFunct, and it's still so - doubt that it's right), but if I'd taken it from all tubes, interpolation would take even more time... Looks like I got wrong why there are several tubes. Tubes are taken in every direction in order to get approximations from their intersection and union, aren't they?
P.S. Only now I understood what was wrong with my commit descriptions (I wrote "#issue 7" instead of "issue #7")... Enclosing links to my wrongly named commits: 250c1c97f97f64254c7fae4328d98db62d9e9d23 272bab7c6bc82d67da7c9b3cfd985a1f7535f800 0a257a5a530859e16b81a8679597277a13226c30 ec31502120919e978a93e94b0f53d526c94202ec e980af630b48f1bc1067fd38d3ed0b25409723ea 58bd3eee5e42bf57ca2e06d6f4952921b3c64300
When explaining you about multiple tubes I looked at the wrong place, there is only one tube and you should always expect exactly one tube in properEllTube
. This is because a single tube from many tubes is selected prior to calling ControlVectorFunct.evaluate
and actually even prior to calling ContSingleTubeControl
constructor (this is why we have "SingleTube" as part of the name). Tube selection happens in ContControlBuilder
4) Please add a check that propEllTube.getNTuples()==1 in ControlVectorFunct constructor.
5)
In
%
function [value, isTerminal, direction] = StopByTimeout(T, Y)
TIMEOUT = 15;
value = 1;
if toc - TIMEOUT >= 0
value = 0;
self.logger.warn(sprintf(['timeout=%d reached! '...
'stopping calculations...'],TIMEOUT));
end
isTerminal = 1;
direction = 0;
end
please assing a logical value ot isTerminal i.e. true instaed of 1
also, function names should always start with a small letter - see our coding convention
finally tic;elapsedTime=toc is not safe, what is safe is:
tMarker=tic; elapsedTime=toc(tMarker);
This is because tic can be called inside of controlVectorFunct or even inside of ode45.
6) In ControlVectorFunct
indTime=nnz(ellTubeTimeVec <= curControlTime); -> indTime=sum(ellTubeTimeVec <= curControlTime);
7) Loops are extremely slow when called multiple times, I think we need to get rid of the loops inside of ControlVectorFunct.evaluate somehow. The only reason there is a loop there is to find a system to which each element of timeVec belongs to. This can be done via interp1 with methods "previous" and "next" i.e we can interpolate discrete indices. Here is an example:
interp1([1,2],[3 4],1.2,'prev')
ans =
3
interp1([1,2],[3 4],1.2,'next')
ans =
4
I hope you get the idea. Also, do we really need timeVec to be a vector. As far as I can tell `evaluate" is always called with scalar timeVec. If so - let us replace timeVec with just tVal and simplify our code. Once get rid of the loop we can take it from there. When you profile please take into account a time spent on a context switching (I believe it is marked as "self time"). Sometimes, when an amount of nested calls is high most of the time goes to a context switching between different functions.
8) If you believe that lib.modgen.common.parseparext" calls consumes a significant portion of time - please replace those calls with in-line code that does the same. The primary reasons parseparext works slow are a) a function call is always slower than in-place instruction b) parseparext sometimes uses string-base checkers like "isnumeric(x)" or "isrow(x)". These lines are not compiled by JIT compiler and are interpreted during execution which is much slower. This is not critical when you do not call parseparext multiple times but in critical places parseparext is a bad idea. For throwing exceptions please always use modgen.common.throwerror
instead of error
.
9) Timeout should be a property of ContControlBuilder
parsed via parseparext
and passed into SingleTubeContControl
constructor from where it should be passed into ControlVectorFunct
constructor. If not specified ContControlBuilder should use a default value of the timeout (default values can also be specified in parseparext
function call). All this is necessary because timeout in the tests should be significantly lower than it is needed for real-life applications. The default value should significantly large (maybe a few minutes) while the timeout specified by the tests can be around the same value you use right now.
10) I can see that the tests do not run till the end so it is impossible to see which pass and which fail:
Error: Attempt to execute SCRIPT run_disc_tests as a function: C:_JenkinsHome\workspace\EllTbx_Win64_2015b\issue_73_alim_komarov\products+elltool+control+test\run_disc_tests.m, Identifier: MATLAB:scriptNotAFunction INFO modgen.logging.EmailLogger.sendMessage - [EllTestRunner:HEAD:58bd3eee5e42bf57ca2e06d6f4952921b3c64300]:ERROR,pid:7036,conf:default,[marker:win_EllTbx_Win64_2015b/issue_73_alim_komarov_origin/issue_73_alim_komarov],gitURL:https://github.com/SystemAnalysisDpt-CMC-MSU/ellipsoids, running on host:coldfire(user:Jenkins),matlab:2015b(arch:PCWIN64)
I suspect this is because of the misprint in run_disc_tests
. Please have this fixed.
Current situation:
interp1
in ControlVectorFunct.evaluate/getIndexOfSwitchTime
(implementation of new task 7)) by some self-written analogue. Also writing and using of analogue of standart function fnval
that works faster (if that's possible) can slightly accelerate control synthesis - but still, if that could help, the test would have passed with Timeout=30, while they doesn't (except for 2 or 3 of them). Here are profile results: profile_results.zipode45
, but only in certain points. I mean passing to ode45
in ContSingleTubeControl.getTrajectory
whole vector of points instead of just [tStart tFin]
- but that requires some idea for choise of points count (if I understand correctly, everywhere through the toolbox only extreme points are got from time vector given by user). That can fasten calculations considerably.t0
and t1
are bad names (and even if they are, how can they be renamed? tStart
and tEnd
or tMin
and tMax
aren't better), and these names are well used in names of such related variables as xt1tMat
.ControlVectorFunct
constructor, only to ContSingleTubeControl
constructor, because I have time noting in ContSingleTubeControl
and don't understand why it should be moved to ControlVectorFunct
(m.b. it was a misprint?)Now I'm starting to work at tasks related to discrete systems.
However, I'm still thinking about the idea I'd mentioned several comments above: calculating control not >in all points required choosed by ode45, but only in certain points.
Not sure, how this will this help? When you pass a vector of time points ode45 still uses an adaptive step. I know that because I refactored internals of ode45 to make it ode45reg. Do you have any empirical evidence to back up your idea?
Reasons why I consider urgency of old task 21) disputable: I've not found anything in coding policy explaining why t0 and t1 are bad names
I agree, this is not urgent. But for me tStart and tEnd would be more appropriate. Let us not waste our time on that right now.
the test would have passed with Timeout=30 Is this in seconds? If so - let us increase this up to 30 seconds for the tests.
I'll take a look at your changes tomorrow.
Now I'm starting to work at tasks related to discrete systems. This seems a reasonable thing to do.
Before I can evaluate your changes I need to see the test results in any form. Right now the problem I described in 10) is not fixed and that is why the tests do not finish.
can slightly accelerate control synthesis - but still, if that could help, the test would have passed with Timeout=30, while they doesn't (except for 2 or 3 of them).
I meant that increasing Timeout in 2 times make only 2 more test pass - so any slight acceleration (less than 2 times faster) wouldn't help a bit. Moreover, I've just tested with Timeout=60 - and still 46 of 54 tests have failed (just as on Timeout=30; on Timeout=15 48 test had been failing). All failing tests except two (both related to degenerate test case ellDemo3test
, and for those error is the correct answer -- test have to be rewritten to something like assert.throws
) are failing with messages like following one:
Error: the result of test does not correspond with theory, current scalar production at the end of system switch interval number 1 is 1.8585, while the original solvability domain actually contains x(t0)
I can't say for sure that all such tests are failing because of timeout stopping calculations in their middle (m.b. this check is somehow wrong or some errors in code logic), but the number of timeout messages is tremendous (1342! I'm trying understand why only 54 tests with no more than 3 switches in each produce so many timeout messages...), so probably all 44 such tests are getting timeouts.
please make the following cosmetic changes:
11) In MatrixPosReg UNIFORM_OUTPUT->IS_UNIFORM_OUTPUT KEEP_SIZE->IS_SIZE_KEPT
12) The first input in throwerror should be a message identifier:
modgen.common.throwerror(['properEllTube must have'...
'only 1 tuple']);
->
modgen.common.throwerror('wrongInput',['properEllTube must have'...
'only 1 tuple']);
13) Please replace "timeout" name with something more meaningful like evalTimeout and update help headers in all the classes where you added the timeout as an input.
14) trajectory=[]; - bad name. There are some other incorrect names like AtMat in ContSingleTubeControl
As to the main problems:
15) Why don't we run ode45 in a loop across all switching systems instead of looping through all switching systems inside of the function passed into ode45. This should be much much more sufficient as the only thing staying in the function will be the interpolation of Q(t) and a(t). We already know solving matrix odes for finding ellipsoidal approximations is fast so I do not see any reason why this should be slower. This will require changing inputs of ContSingleTubeControl from
probDynamicsList,goodDirSetList,switchSysTimeVec
to
probDynObj,goodDirSetObj with switchSysTimeVec completely gone.
16) Now about timeout functionality. Instead of @stopByTimeout you can pass @(isTimeoutHappened,varargin)stopByTimeout(varargin{:}) where isTimeoutHappened is just an ordinary logical variable from the context initialized with false prior to ode45 call. In Matlab you can do such things (context inheritance). After ode45 call you just check a value of isTimeoutHappened and if it is true - throw an exception different from the one that is thrown right now.
Please concentrate on 15) and 16) now. I believe we need to move to discrete systems only when 15 and 16 are done and we have a clear picture of what is happening in terms of both performance and test failures.
the constructor
New tasks 10)-14) done. 15) and 16) may also be called done, but not in the way given in task description (because I didn't understand this given way):
ode45
in a loop across all switching systems in ContSingleTubeControl
, and already in function passed into ode45
(ControlVectorFunct
) there is no loops through all switching systems (or any other loops), and the main time-wasting thing really is the interpolation of Q(t) and a(t). However, two really strange things related to delegation of powers from ContSingleTubeControl
to ControlVectorFunct
were done: index of interval between swithes was calculated in ControlVectorFunct
(now it's passed from ContSingleTubeControl
), and ellTubeTimeVec
was getting from properEllTube
every time (now it's cached by setting it in constructor of ControlVectorFunct
). These changes accelerated calculations for 20-30% and are probably last essential optimizations can be done in ControlVectorFunct
(now a closer look should be taken at ContSingleTubeControl
).isTimeoutHappened
not by modifying handle to stopByTimeout
, but by looking at third output of ode45
. Anyway, now timeout is treated as error, and 53 continious tests of 54 are getting errors: 2 because matrix Q is not positively defined (and that is correct answer for such configuration, this test is still waiting for being rewritten), 15 because of wrong answers without any timeout, and 36 because of timeout. And only one test really succeeds. However, that's interesting: that means that previously 5 tests were passing even after getting timeout...14) I still see bad name like AtMat
15)
I didn't understand this given way The idea is to remove any knowledge about system switching from ControlVectorFunct i.e. I want
t0=getBeginOfSwitchTimeSpan(indSwitch);
t1=getEndOfSwitchTimeSpan(indSwitch);
curProbDynObj=self.probDynamicsList{indSwitch};
curGoodDirSetObj=self.goodDirSetList{indSwitch};
performed in ContSingleTubeControl
and I want ControlVectorFunct constructor to accept only the following inputs:
ControlVectorFunct(properEllTube,... probDynamicsObj,goodDirSetObj,inDownScaleKoeff)
and ControlVectorFunct.evaluate
- not to accept indSwitchTime i.e. it should look like
ControlVectorFunct.evaluate(self,xVec,tVal)
are probably last essential optimizations can be done in ControlVectorFunct Based on my experience what you see in the profiler results sometimes is not correct because a precision of profiling results drops very significantly when a number of calls is very high.
Let us finish the optimization by implementing the approach I suggest and then we can be sure that there is nothing we can do in ControlVectorFunct. Right now I'm not sure.
However, that's interesting: that means that previously 5-7 tests were passing even after getting >timeout... This can happen if we reach our X_0 not at T but at T*<T which is just a coincidence.
Once you are done with optimization let us concentrate on test failures for continuous systems. For me it doesn't make sense to work on discrete systems until all the problems with test failures for continuous systems are fixed.
Btw - fixing control synthesis MATH problems for discrete classes is out of scope for this task, you only need to a) fix architectural problems in both continuous and discrete classes (since they share the same code). b) fix performance problems in continous classes c) fix math problems (control synthesis algorithm) for continuous systems.
As for discrete control synthesis algorithmic problems - this is out of the scope of this task and will be done as part of #43 by Ruslan Malsagov. He will be in touch with you and will create his own branch as a copy of your branch. The he will be rebasing periodically against your branch in case you perform some architectural changes common for both discrete and continuous classes.
OK, I've got it.
14) Sorry, looks like I've missed some. Now everything should be fine.
15) Now I understood (previous description had described changes in ContSingleTubeControl
inputs, probably just a misprint). But that requires construction of new ControlVectorFunct
object for each interval between switches (however, that shouldn't be important, because there are no more than 3 switches). I'll do it, but I'm not sure whether will it make everything works faster or slower.
UPD: however, if profile results are underestimating time for getting such objects, it is more than reasonable...
Minor but very important update to make the picture complete. Please have a look C:\Users\Administrator\Documents_Git\ellipsoids\products+gras+ellapx+lreachplain\ExtEllApxBuilder.m where I generate fHandle at line 32. Please note that in fHandle that is passed into ode45 (in a super class) all objects (dynamic obj and others are already in-place i.e. they are part of the function handle). This is extremely fast and I did it so for one reason - passing in self and extracting these objects from self fields is extremely time-consuming.
Thus, here is a second part for 15) in addition to the changes already described I suggest we do the following - remove "evaluate" method in ControlVectorFunct and replacing it with getOdeFuncHandle method that will be similar to getEllApxMatrixDerivFunc in ExtEllApxBuilder. This way we will have the lowest possible overhead and my experience with solving ODEs for ellipsoidal approximation shows that performance is more than sufficient. As a proof - please run `SProp=gras.ellapx.uncertcalc.run('default');'. This will solve 3 matrix odes for both external and internal approximations (6 in total for 8 dimensional system) via ode45 in 21 seconds on my cpu.
OK, now I understood 15) completely and really completed it. Yes, that idea was really fine -- run time decreased significally, and 32 tests are failing with timeout instead of 36
Ok, this is not all - we can lower our run time even further - please have a look at my comment above.
Below (under the line) is the old task description. It is a bit out of date and not very detailed. A new much more detailed description and references to the articles wll be added within 1-2 days.
Estimated time: whole semester, this will be your only task.
The task is analogous to Issue #3 (branch Issue_74_asemenova) but for discrete systems. Issue 3 introduces classes
ContControlBuilder,Control
In this task one needs to
1) Rename Control class into ContContrTrajectoryBuilder (continuous control). 2) Create DiscrControlBuilder and DiscrControl classes having the same interface (and some common internal code located in abstract AControlBuilder and AControl classes). 3) The logic of DiscrControlBuilder and DiscrControl classes should be covered with the regression tests in exactly the same manner as it was done in Issue #3 by Anastasia Semenova.