christianurich closed this issue 11 years ago
Ok, I guess those are components which need to be taken from a predecessor state. So in each getComponent call all attributes get copied - that shouldn't be the case anymore since https://github.com/iut-ibk/DynaMind/commit/90d6fd1f77503daa7297064d245cb4c96e2c8783. I will take a closer look.
This improves the performance for read-only views: https://github.com/iut-ibk/DynaMind/commit/c1376ea276045a0395c935d2f728a5ca903bea09
I removed some stuff from the module but it is still pretty slow. At least we now get 5 elements per second. It takes around 40 minutes to access 11.000 elements, read one value and write one.
Is this the model test_db_bigger.dyn? If not, please provide a test case.
That was for the big one. I tried our default simulation (the name has changed, see the unstable branch for the file): Data/Simulations/sandbox_drainage_system_with_infiltration.dyn --nodecache 500000 --attributecache 500000 --loglevel 0
Same result (compiled as release) as soon as the db kicks in:
INFO Wed Apr 24 18:32:55 2013| Start AttributeCalculator Impervious {8a536167-f8d0-40c8-b647-07393c341646} Counter 0
DEBUG Wed Apr 24 18:32:55 2013| 1636 / 1
DEBUG Wed Apr 24 18:32:55 2013| 1636 / 2
DEBUG Wed Apr 24 18:32:55 2013| 1636 / 3
DEBUG Wed Apr 24 18:32:55 2013| 1636 / 4
DEBUG Wed Apr 24 18:32:56 2013| 1636 / 5
DEBUG Wed Apr 24 18:32:56 2013| 1636 / 6
DEBUG Wed Apr 24 18:32:56 2013| 1636 / 7
DEBUG Wed Apr 24 18:32:56 2013| 1636 / 8
DEBUG Wed Apr 24 18:32:56 2013| 1636 / 9
I've improved the code of AttributeCalculator - readability and function timing. Please optimize modules if they are running slow. Reducing the function count can improve the overall performance greatly (there was a part in AttributeCalculator which could be optimized from 1+N*log(N) calls to 1 call!).
Well, the basic issue is not the "slow" database or caching, it's the concept of links, which leads to elements outside the defined view. You will have to access the elements without links, or via a read-only view, if you want higher performance. I've discussed this with Michael too; it's a design issue (of links), which will be addressed once simenv part 2 starts.
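To make the problem concrete, here is a toy sketch (mock names and data, not the DynaMind API) of how following links from the components of one view pulls in components that are outside that view, each of which then needs its own getComponent call:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Toy model: components are plain ids, and `links` records which other
// components each view component points to. Everything here is made up
// purely for illustration.
using Id = int;

std::map<Id, std::vector<Id>> links = {
    {1, {10, 11}},  // view component 1 links to components 10 and 11
    {2, {11, 12}},  // view component 2 links to components 11 and 12
};

// Gathers every component reachable via one link hop. The result set lies
// outside the view, so none of these ids are covered by getComponentsOfView.
std::set<Id> gatherLinked(const std::vector<Id>& viewComponents) {
    std::set<Id> outside;
    for (Id c : viewComponents)
        for (Id target : links[c])
            outside.insert(target);
    return outside;
}
```

Each id in the result has to be fetched individually, which is exactly where the per-component cost below shows up.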
For now, I'm closing this issue. It's not about cache performance nor about the DB.
That is not true - have you really tested the issue? I get the same behaviour in other modules as soon as the db kicks in, where I just use defined views and no links.
The read-only view doesn't help. As far as I understand, if I use getComponent a successor state is created.
And be careful: at the moment the access type of the view relates to the access type of the geometry (as the docu says). Which means that for an edge, only the start and end node are not changed - it's still possible to change attributes.
If no links are used, yesterday's core got an improvement for read-only views, as commented before: https://github.com/iut-ibk/DynaMind/commit/c1376ea276045a0395c935d2f728a5ca903bea09
If it's not read-only, it will copy the node. If it doesn't fit in the cache, it's copied from/to the hard drive. getComponent can't decide whether it's meant read-only or not; there is no information on that.
If read-only refers only to the geometry, I will undo the changes I've made - because a component can't change its owning system, and that shouldn't ever be possible. Therefore read-only access is not possible under these circumstances.
Memo: Yes, it's true. Yes, I'm testing - I've been testing for more than 2 weeks for non-existent deadlocks, improving and tweaking non-core module code and trying to develop things which should be addressed in simenv part 2, instead of working on simenv part 1, which should have been my current task for 3 weeks.
Updates on the actual increase or the actual performance would help in the discussion.
The problem is that when I write to the DB and the cache is full, the performance is 5 components per second (as posted twice now). Maybe my system setup is messed up, but I don't have any other number to compare it with. Please update me on the performance: is it just a handful of elements per second when the cache is full and the 'swapping' starts?
The access of the views can be checked with the class DataValidation (dmdatavalidation.h)
The performance increases from code improvements cannot be generalized; most of them are trivial, minor or very case-specific. The read-only option (when including read-only on attributes) could improve performance, case-dependent, by up to a factor of 100, as the analysis shows.
In the specific case of your provided simulation, the recursive gathering of (linked) components leads to a massive copy-wall, as not all components needed fit into the cache. In the copy process most of the time is spent copying attributes (rw on db) - that's why it takes 0.2 seconds per component. If you want a number to compare, just increase the cache to a level where swapping does not take place. 500k seems low anyway, AFAIK.
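The copy-wall effect can be reproduced with a toy LRU cache (purely illustrative, not the DynaMind cache): as soon as the working set is even slightly larger than the cache, a cyclic access pattern misses on every single access, so each one pays the full db round trip.

```cpp
#include <cassert>
#include <list>
#include <unordered_map>

// Toy LRU cache that only counts misses. With capacity 4 and a cyclic access
// pattern over 6 distinct keys, *every* access is a miss - the classic
// thrashing case. Names and numbers are illustrative only.
struct LruCache {
    std::size_t capacity;
    std::list<int> order;  // most recently used at the front
    std::unordered_map<int, std::list<int>::iterator> pos;
    int misses = 0;

    explicit LruCache(std::size_t cap) : capacity(cap) {}

    void access(int key) {
        auto it = pos.find(key);
        if (it != pos.end()) {
            order.erase(it->second);          // hit: just move to front
        } else {
            ++misses;                          // miss: simulate a db fetch
            if (order.size() == capacity) {    // evict least recently used
                pos.erase(order.back());
                order.pop_back();
            }
        }
        order.push_front(key);
        pos[key] = order.begin();
    }
};
```

Three passes over 6 keys with capacity 4 give 18 misses out of 18 accesses; with capacity 6 the same pattern gives only the 6 compulsory misses. That is the difference between "db kicks in" and "everything fits".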
Changing getAccessType to
view.getWriteAttributes().size() == 0 && view.getAccessType() == READ
should do it?
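As a sketch, with a minimal mock standing in for the real DM::View (the real class lives in the DynaMind core and may differ in its exact interface), the proposed check would read:

```cpp
#include <string>
#include <vector>

// Mock of DM::View, only to illustrate the proposed condition; not the
// actual DynaMind declaration.
enum AccessType { READ, MODIFY, WRITE };

struct View {
    AccessType accessType;
    std::vector<std::string> writeAttributes;
    AccessType getAccessType() const { return accessType; }
    const std::vector<std::string>& getWriteAttributes() const {
        return writeAttributes;
    }
};

// A view is effectively read-only when it declares no writable attributes
// and its geometry access type is READ.
bool isViewReadOnly(const View& view) {
    return view.getWriteAttributes().size() == 0
        && view.getAccessType() == READ;
}
```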
But it won't help in your model case, as components from links are not part of getComponentsOfView, and a single component, returned via getComponent, does not know whether it's read-only or not - so that won't help either. A possible improvement would be a getComponentReadOnly - but that would change the API, increase complexity, break the idea of a self-handling core (as you pointed out in the beginning) and may conflict with later ideas in simenv part 2 (enhanced views).
I know there is a lot of improvement to do in the code, and links are maybe a problem too. For me the really important number is the 0.2 seconds per component.
So for big simulations increasing the cache is not an option anymore, so I'll have to use the DB. States still fit in the RAM. (For the test file 500k should be more than enough to fit one state; otherwise we are not testing the db.)
So writing 10.000 elements just takes a lot of time when the cache is full - for my simulations 40 minutes.
Don't get me wrong, but all the other improvements don't really matter in this case. We just shift the problem.
I would like to know:
A) is this really the case (slow writing), or is it just my computer? Is it much faster on your computer, and if so, why?
B) if it is slow, what can we do about it?
I may have a fix for it - not really beautiful, but it will do, with some internal changes. I'll publish it when it's stable.
Big simulations will run on the DB - that's the whole point of my core. But one can avoid it through well-done modules, at the cost of "less granulated" modules.
All the other improvements will help us in the future, in particular for other simulations than yours. It's not only about improving the performance of your specific simulation. There is no shift.
a) Our machines are fairly different: while your hard disc may be fast due to SSD tech, my single-core performance is always ahead. But I can't really tell - I'm switching constantly between debug, release, release with minimal debug info and profiling (debug with no external debug thread). And I'm working on a Linux machine via remote access for further testing - as most of the profilings take a long time - with 8 threads.
b) Reduce function calls which lead to unnecessary cache access, and enlarge the cache so all module data fits into it (btw that's a crucial point: if things do not fit in RAM, one has to process chunks/blocks - later, for simenv part 2, 3, 4, ...).
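The chunk/block idea from b) can be sketched like this (generic code, independent of DynaMind; processChunk stands in for real module work, and the chunk size is an arbitrary illustrative value):

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Process n elements block-wise so that only one chunk is in memory at a
// time and the working set always fits into the cache.
const std::size_t kChunkSize = 1000;  // illustrative; tune to the cache size

double processChunk(const std::vector<double>& chunk) {
    // Stand-in for real module work on one block of components.
    return std::accumulate(chunk.begin(), chunk.end(), 0.0);
}

double processInChunks(std::size_t n) {
    double total = 0.0;
    for (std::size_t start = 0; start < n; start += kChunkSize) {
        std::size_t end = std::min(n, start + kChunkSize);
        std::vector<double> chunk;  // only this chunk lives in memory
        for (std::size_t i = start; i < end; ++i)
            chunk.push_back(1.0);   // stand-in for fetching element i
        total += processChunk(chunk);
    }
    return total;
}
```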
I know all that.
I just want to know
How many components can I write per second when the cache is full and I work with states? Is it 1.000 per second or just a few?
As written before, the component itself doesn't cost that much; again, the number of attributes is significant. And attributes are not all of the same type - a time series will take much longer than a double. The difference is huge, and it also depends on how big the current db file is, how big the cache is, and so on. So it's impossible to provide a number per component.
Again: if I go over 10.000 components and do nothing else than write one attribute per component, how long does it take - seconds, minutes or hours? If it depends on the number of attributes, make an example for me.
double attributes on one component:
1k: 3 ms
10k: 1488 ms
100k: 17260 ms
Is this with successor state?
No, bare writing into the db; a sketch:
// set cache cfg: infinite to all, except attributecache to 3
DM::System sys;
DM::Component c;
sys.AddComponent(&c);
for (int i = 0; i < 10003; ++i)
    c.addAttribute(std::to_string(i), i);
In the newest core, the cache unit test has a commented-out part at the end - that's basically the test.
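For reproducing such numbers, a generic std::chrono harness could look like the following (independent of DynaMind; a std::map insert stands in for addAttribute, so the absolute values will differ from the ones above):

```cpp
#include <chrono>
#include <map>
#include <string>

// Times an arbitrary callable and returns the elapsed wall-clock time in
// milliseconds.
template <typename F>
long long timeMs(F f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start)
        .count();
}

// Inserts n double attributes into a std::map, as a stand-in for calling
// addAttribute n times on one component.
long long benchmarkAttributes(int n) {
    std::map<std::string, double> attributes;
    return timeMs([&]() {
        for (int i = 0; i < n; ++i)
            attributes[std::to_string(i)] = i;
    });
}
```

Running benchmarkAttributes for 1k, 10k and 100k gives directly comparable per-size timings on any machine.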
Those are good results. So my loop with states in the simulation is 2.000 times slower (not sure if I have done the math right in my head). I'll put together a standard test set for performance testing tomorrow, so we can focus on this and maybe identify what the problem is.
The problem is the read-only pointer; I uploaded a new core where the successors are optimized - as said, it's a bit tricky. I'm testing it right now. Why don't you use a bigger cache for now? We can deal with that later on.
I'm out of RAM. But anyway, I'll put together a test set that just focuses on the db and states, so we can compare results, test the new solutions and talk about the same issues.
Yeah, but keep in mind that I'm off the whole next week.
I created a simple performance test, at the moment without states. If the results are right (please check the code), the problem is the read access. Detailed results are here: https://github.com/iut-ibk/DynaMind-ToolBox/wiki/Performance-Test-DB
Write access is constant and not too bad: on average 0.25 ms per element (similar to your results, with a slightly different test).
The read access gets worse with the size of the db and is in all cases much slower than the write access:
for 10.000: 6 ms per element
for 100.000: 119 ms per element
for 1.000.000: 1138 ms per element
If you could confirm that I didn't make a mistake in the code, I would make a similar test for the states.
Other minor things
I reviewed the code. It's not a bad test, but the settings cause very bad db behaviour. Please use well-defined numbers, and please read the cache settings docu - especially regarding
cfgNew.queryStackSize = 1234;
cfgNew.cacheBlockwritingSize = 1234;
The first option will greatly increase RAM usage; the second burns down the db performance and works only because of a safety net.
I will improve the code.
I just copied the settings from your unit test ;-) and I don't fully understand what the settings do (and don't really need to know).
Yeah, please choose better settings - maybe it helps us with the performance. Could you stick to the old API with the getUUIDs for this test? We could also make one with the new API to compare the speed-up, and so that I have an example of how to use it.
Please read the docu: https://github.com/iut-ibk/DynaMind-ToolBox/wiki/Cache-settings-and-flags But anyway, it won't help us, because read and write into the DB can't be influenced by us (for now). The question is how to minimize db access (and cache usage); that's why I'm talking about optimizing modules and functions and reducing function calls.
Once again: DB access can't be faster than it is now; the driver and the hardware give us the performance. What you are doing is a benchmark of your personal hardware system. That's why there is a profiling test on the cache in the unit tests, but no db-access test (because it's hardware-dependent and can't be influenced by code, except a few ppm for now). That's why I closed the issue: the Cache -> DB i/o performance (per element) can't be changed (I think I also mentioned that above).
I'm going to write a larger article about the whole issue, to make clear how core, cache and db work. Meanwhile, skip the test and read the current docu on the cache.
PS: I may have some improvements on successor states, stay tuned. PPS: if there are questions, no big deal, but we should discuss them per email - this thread is about Cache->DB performance :-)
Btw, components are huge; as I can't cache everything, the structure itself takes a lot of space, in particular the node-edge maps. That's why a simulation takes a lot of RAM, even with small cache sizes.
I was just surprised how much it really is. But that's probably an issue to address in version 0.7 or 0.8, if it becomes a problem.
Check it out with sizeof(x) - it returns the size in bytes:
sizeof(DM::Component) 68
sizeof(DM::Node) 76 (+ 24 in cache)
sizeof(DM::Edge) 80
sizeof(DM::Face) 104 + 4*nodecount
sizeof(DM::RasterData) 152 (+ a lot in cache)
sizeof(DM::System) 264
sizeof(DM::Attribute) 72 + (min 4 in cache, up to thousands)
I have done some testing of the DB in the last days.
RasterData works awesome: we write with 150 MB/sec into the DB, so transferring big chunks of data works really well.
When the db kicks in, the vector data performance together with the states is really, really slow.
With the simulation I used, it starts to slow down here:
extractnetwork.cpp, line 215
The problem is that the following code takes hours for 5714 components!
Success ExtractNetwork {31e01691-20ea-4bb5-bbca-68d0fc507d37} Counter 0 time 59668.1
So I looked a little bit closer into the getComponent thingy. The problem is here (I added some debug code so it looks slightly different), in the first getComponent(it->first) call.
Here is the logger output:
DEBUG Sun Apr 14 15:07:12 2013| Start with get components from derived system numberOfComponents 5714
DEBUG Sun Apr 14 15:07:21 2013| Done with 0
DEBUG Sun Apr 14 15:07:30 2013| Done with 1
DEBUG Sun Apr 14 15:07:39 2013| Done with 2
DEBUG Sun Apr 14 15:07:47 2013| Done with 3
DEBUG Sun Apr 14 15:07:56 2013| Done with 4
DEBUG Sun Apr 14 15:08:05 2013| Done with 5
DEBUG Sun Apr 14 15:08:14 2013| Done with 6
DEBUG Sun Apr 14 15:08:23 2013| Done with 7
DEBUG Sun Apr 14 15:08:32 2013| Done with 8
DEBUG Sun Apr 14 15:08:40 2013| Done with 9
DEBUG Sun Apr 14 15:08:49 2013| Done with 10
DEBUG Sun Apr 14 15:08:58 2013| Done with 11
It takes 9 seconds per component!
The cache is huge: --nodecache 5000000 --attributecache 5000000 (uses up more than 4 GB of my RAM).
Please look into this - I think this is causing all my troubles.