Open zhoub opened 8 years ago
Cool... are revisions handled well in Milliways?
We are testing, but the speed is amazing. Now I will log a couple of questions for @panta.
So, the first question is about size: I think we could get a smaller file if we tweak the block size. What do you think, @panta?
My second question is about the file. With Milliways we have everything in one file (say archive.mwdb). Now it would be great if this file could be archive.abc so that the repository is not a directory anymore. I understand this may not be possible since this is handled by libgit/git but anyway I want to ask if you have any idea.
Well, for starters, speed is not yet amazing imho. I think I can do better, maybe better than Ogawa, the code is not yet optimized for speed :)
Then, regarding block size: we can experiment but probably 4k is the best size for disk I/O performance. We can improve space utilization by using a more advanced logic in space allocation, or by using some really fast compression algorithm, like LZ4. To keep in mind: when changing block size it's necessary to change also the B+-Tree B factor (currently 68, see BTreeFileStorage_Compute_Max_B() function).
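To make the link between block size and B factor concrete, here is a minimal sketch. It is not the logic of `BTreeFileStorage_Compute_Max_B()`; the node layout, the 16-byte header, the 20-byte keys and the 8-byte child pointers are all assumptions picked only so the arithmetic can be followed.

```python
def max_b_factor(block_size, header_size, key_size, pointer_size):
    """Largest B such that one node fits in a single block.

    Assumed (hypothetical) node layout, not Milliways' real on-disk format:
        header | up to (2B - 1) keys | up to 2B child pointers
    """
    usable = block_size - header_size
    # (2B - 1) * key_size + 2B * pointer_size <= usable
    # => B <= (usable + key_size) / (2 * (key_size + pointer_size))
    b = (usable + key_size) // (2 * (key_size + pointer_size))
    if b < 2:
        raise ValueError("block too small for a useful B-tree node")
    return b


def node_bytes(b, header_size, key_size, pointer_size):
    """Bytes used by a completely full node under the same assumed layout."""
    return header_size + (2 * b - 1) * key_size + 2 * b * pointer_size


if __name__ == "__main__":
    for block_size in (4096, 8192, 16384):
        b = max_b_factor(block_size, header_size=16, key_size=20, pointer_size=8)
        used = node_bytes(b, header_size=16, key_size=20, pointer_size=8)
        print(f"block {block_size}: B = {b}, full node uses {used} bytes")
```

Whatever the real layout is, the point is the same: B is derived from how many keys and pointers fit in one block, so the two have to be changed together.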
About the file: maybe it's possible, but we'd need to store the refs db inside milliways too (quite easy) and probably modify libgit2 to dispense completely with the .git directory (no idea how complex that is), which also means keeping .git/config and other amenities inside milliways. We need to investigate this further.
Speed: it would be awesome if we could read faster than Ogawa. Really, it would be insane. Do your best.
Block size: is 4k (& the B+ tree factor) the best for I/O on all operating systems? Remember that Windows is the one that suffers the most. On the other hand, with the new SSD storage this gets minimized a lot.
Advanced space allocation / LZ4: whatever can be done fast I am up for it.
File: let's put this on the back burner for now, but maybe you could ask on the git/libgit mailing list to get the info.
Ok, the latest commit in panta/pre-alpha/optimizations-1 is about 7x faster than the initial version. In multiverse there will be some other overhead, but it should be significantly faster than before. There are other smaller optimizations possible. Let me know.
I am literally drooling.
Hello Marco,
OK, we now have a cross-platform build.
Read/write performance seems good (let's do some profiling later and figure out if things can be improved). Right now we have some caches where the size is actually larger. Can you try it on the optimusBotOgawa.abc? We have 1GB (milliways) vs 768MB (ogawa). Note that this is just one commit, we should gain from the second commit (except if topology changes completely).
24 frames, 2 motion samples. Done in Maya 2015, MacBook Pro with SSD.
Backend | Write Time | Size 1st commit (Size on Disk) | Size after 2nd commit | Read Time
---|---|---|---|---
HDF5 | 7.4 s | 136.5 MB | 273 MB | 1.4 s
Ogawa | 6.4 s | 134.4 MB | 269 MB | 1.1 s
Git | 18.2 s | 69 (95.6) MB | | 9.2 s
Git Milliways | 9.7 s | 157.3 MB | 157.6 MB | 3.1 s
96 frames, 2 motion samples. Done in Maya 2015, MacBook Pro with SSD.
Backend | Write Time | Size 1st commit (Size on Disk) | Size after 2nd commit | Read Time
---|---|---|---|---
HDF5 | 22.0 s | 305.6 MB | 611 MB | 5.9 s
Ogawa | 15.9 s | 287.7 MB | 576 MB | 3.8 s
Git | 56.6 s | 169.5 (253) MB | 338 (499.6) MB | 15.9 s
Git Milliways | 25.8 s | 405.3 MB | 810.5 MB | 8.6 s
1 frame, 1 motion sample. Done in Maya 2015, MacBook Pro with SSD.
Backend | Write Time | Size 1st commit (Size on Disk) | Read Time
---|---|---|---
HDF5 | 139.0 s | 231.5 MB | 960 s
Ogawa | 99.0 s | 41.2 MB | 904 s
Git | 820.1 s | 57.7 (1350) MB | not done
Git Milliways | 173.7 s | 96.4 MB | 994 s
Added more results.
It's definitely better than with the "classic" git backend, except for the worse space utilisation, but I think there is still a considerable margin for improvement. Here are some optimization opportunities I've identified:
The problem I see is that the space at the first commit is always larger than Ogawa. That is a bit of a show stopper because 80% of all Alembic assets (and even more) will only have one commit. Am I missing something here?
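To put a number on that gap, here is a small sketch that computes how much larger the first commit is with Git Milliways than with Ogawa. The sizes are copied from the three tables above; the helper name is only illustrative.

```python
# First-commit sizes in MB, copied from the three benchmark tables above.
first_commit_mb = {
    "24 frames, 2 motion samples": {"Ogawa": 134.4, "Git Milliways": 157.3},
    "96 frames, 2 motion samples": {"Ogawa": 287.7, "Git Milliways": 405.3},
    "1 frame, 1 motion sample": {"Ogawa": 41.2, "Git Milliways": 96.4},
}


def overhead_percent(milliways_mb, ogawa_mb):
    """How much larger the Milliways store is than the Ogawa file, in percent."""
    return (milliways_mb / ogawa_mb - 1.0) * 100.0


if __name__ == "__main__":
    for scene, sizes in first_commit_mb.items():
        pct = overhead_percent(sizes["Git Milliways"], sizes["Ogawa"])
        print(f"{scene}: Milliways first commit is {pct:.0f}% larger than Ogawa")
```

With the numbers posted so far, the first commit comes out roughly 17%, 41% and 134% larger than Ogawa for the three tests.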
I have no doubt there is plenty of room for optimization, let's move on to it! :+1:
Some comments:
Further edits on my comments.
Regarding the space utilisation, we can improve there too. I am focused on speed now, but after that I'll tackle this as well.
Yes, seriously is the name, seriously :) And yes, when converting to "classic" git, we have to use the old format, JSON and all (it's too useful to be able to edit a text format from time to time).
@panta, according to https://github.com/libgit2/libgit2/issues/3566 we are not using any compression. My opinion is that right now zlib is not even used; we need to experiment with putting in LZ4, or whatever compression makes sense, as this will affect performance.
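A minimal sketch of what such an experiment could look like: compress a payload block by block and compare size and time at a few levels. Assumptions: zlib stands in for LZ4 here only because it ships with Python, the 4k block size is taken from the discussion above, and the payload is synthetic, not real Alembic data, so the numbers it prints say nothing about Milliways itself.

```python
import time
import zlib

BLOCK_SIZE = 4096  # block size discussed in this thread


def compress_blocks(data, level):
    """Compress data one block at a time; return (stored bytes, seconds)."""
    start = time.perf_counter()
    stored = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        packed = zlib.compress(block, level)
        # Keep the block raw when compression does not shrink it.
        stored += min(len(packed), len(block))
    return stored, time.perf_counter() - start


# Synthetic payload (an assumption): a regular sequence of 32-bit integers.
payload = b"".join(i.to_bytes(4, "little") for i in range(0, 200_000, 3))

if __name__ == "__main__":
    for level in (1, 6, 9):
        stored, seconds = compress_blocks(payload, level)
        ratio = stored / len(payload)
        print(f"zlib level {level}: {ratio:.2f} of original size in {seconds * 1000:.1f} ms")
```

Compressing per block keeps random access to individual blocks possible, at the cost of a worse ratio than compressing whole objects.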
Platform: Windows
Test file: Zombie Static
https://drive.google.com/open?id=0B-wMG_5skSbASk5DamxyV2FpMEE 3M
Local Git
12.206579s wall, 0.499203s user + 3.151220s system = 3.650423s CPU (29.9%)
Milliways
0.323581s wall, 0.312002s user + 0.015600s system = 0.327602s CPU (101.2%)
_store.mwdb 3.2M_
Ogawa
0.282190s wall, 0.109201s user + 0.171601s system = 0.280802s CPU (99.5%)
Test file: Zombie HDF5 2sec
https://drive.google.com/open?id=0B-wMG_5skSbASmE4MHpTbUZRVm8 196M
Local Git
238.399374s wall, 37.689842s user + 52.572337s system = 90.262179s CPU (37.9%)
Milliways
38.366685s wall, 35.147025s user + 3.198021s system = 38.345046s CPU (99.9%)
_store.mwdb 264M_
Ogawa
22.015381s wall, 11.013671s user + 10.951270s system = 21.964941s CPU (99.8%)
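For comparing these runs, here is a small sketch that parses the timer lines and reports wall-clock ratios. The timing strings are copied from the results above; the regular expression and the helper names are assumptions for illustration.

```python
import re

# Matches timer lines such as:
# "0.323581s wall, 0.312002s user + 0.015600s system = 0.327602s CPU (101.2%)"
TIMER_RE = re.compile(
    r"(?P<wall>\d+\.\d+)s wall, "
    r"(?P<user>\d+\.\d+)s user \+ "
    r"(?P<system>\d+\.\d+)s system"
)


def parse_timer(line):
    """Return wall, user and system seconds from one timer line."""
    match = TIMER_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized timer line: {line!r}")
    return {name: float(value) for name, value in match.groupdict().items()}


# Timer lines copied from the two Windows tests above.
timings = {
    "Zombie Static": {
        "Local Git": "12.206579s wall, 0.499203s user + 3.151220s system = 3.650423s CPU (29.9%)",
        "Milliways": "0.323581s wall, 0.312002s user + 0.015600s system = 0.327602s CPU (101.2%)",
        "Ogawa": "0.282190s wall, 0.109201s user + 0.171601s system = 0.280802s CPU (99.5%)",
    },
    "Zombie HDF5 2sec": {
        "Local Git": "238.399374s wall, 37.689842s user + 52.572337s system = 90.262179s CPU (37.9%)",
        "Milliways": "38.366685s wall, 35.147025s user + 3.198021s system = 38.345046s CPU (99.9%)",
        "Ogawa": "22.015381s wall, 11.013671s user + 10.951270s system = 21.964941s CPU (99.8%)",
    },
}


def wall_ratio(test, slower, faster):
    """Wall time of `slower` divided by wall time of `faster` for one test."""
    runs = timings[test]
    return parse_timer(runs[slower])["wall"] / parse_timer(runs[faster])["wall"]


if __name__ == "__main__":
    for test in timings:
        print(f"{test}: Local Git / Milliways = {wall_ratio(test, 'Local Git', 'Milliways'):.1f}x, "
              f"Milliways / Ogawa = {wall_ratio(test, 'Milliways', 'Ogawa'):.2f}x")
```

On these two files, Milliways is well ahead of local Git on wall time but still behind Ogawa, and the gap to Ogawa is wider on the larger file.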