zhoub commented 8 years ago

Platform Windows

Test file: Zombie Static

https://drive.google.com/open?id=0B-wMG_5skSbASk5DamxyV2FpMEE 3M

Local Git

12.206579s wall, 0.499203s user + 3.151220s system = 3.650423s CPU (29.9%)

Milliways

0.323581s wall, 0.312002s user + 0.015600s system = 0.327602s CPU (101.2%)

_store.mwdb 3.2M_

Ogawa

0.282190s wall, 0.109201s user + 0.171601s system = 0.280802s CPU (99.5%)

Test file: Zombie HDF5 2sec

https://drive.google.com/open?id=0B-wMG_5skSbASmE4MHpTbUZRVm8 196M

Local Git

238.399374s wall, 37.689842s user + 52.572337s system = 90.262179s CPU (37.9%)

Milliways

38.366685s wall, 35.147025s user + 3.198021s system = 38.345046s CPU (99.9%)

_store.mwdb 264M_

Ogawa

22.015381s wall, 11.013671s user + 10.951270s system = 21.964941s CPU (99.8%)
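For a quick comparison, the timing lines above can be reduced to two ratios per run: CPU time over wall time, and wall time relative to Ogawa. The following is a minimal sketch; the numbers are copied from the lines above, while the helper names (`parse`, `summarize`) are made up for illustration.

```python
import re

# Timing lines copied verbatim from the Windows runs above.
TIMINGS = {
    ("Zombie Static", "Local Git"): "12.206579s wall, 0.499203s user + 3.151220s system = 3.650423s CPU (29.9%)",
    ("Zombie Static", "Milliways"): "0.323581s wall, 0.312002s user + 0.015600s system = 0.327602s CPU (101.2%)",
    ("Zombie Static", "Ogawa"): "0.282190s wall, 0.109201s user + 0.171601s system = 0.280802s CPU (99.5%)",
    ("Zombie HDF5 2sec", "Local Git"): "238.399374s wall, 37.689842s user + 52.572337s system = 90.262179s CPU (37.9%)",
    ("Zombie HDF5 2sec", "Milliways"): "38.366685s wall, 35.147025s user + 3.198021s system = 38.345046s CPU (99.9%)",
    ("Zombie HDF5 2sec", "Ogawa"): "22.015381s wall, 11.013671s user + 10.951270s system = 21.964941s CPU (99.8%)",
}

PATTERN = re.compile(
    r"([\d.]+)s wall, ([\d.]+)s user \+ ([\d.]+)s system = ([\d.]+)s CPU"
)


def parse(line):
    """Split one timing line into its four float fields (seconds)."""
    wall, user, system, cpu = map(float, PATTERN.match(line).groups())
    return {"wall": wall, "user": user, "system": system, "cpu": cpu}


def summarize(timings):
    """Return CPU/wall share and wall time relative to Ogawa for every run."""
    rows = {}
    for (test, backend), line in timings.items():
        run = parse(line)
        ogawa = parse(timings[(test, "Ogawa")])
        rows[(test, backend)] = {
            "cpu_share": run["cpu"] / run["wall"],
            "wall_vs_ogawa": run["wall"] / ogawa["wall"],
        }
    return rows


if __name__ == "__main__":
    for (test, backend), row in summarize(TIMINGS).items():
        print(f"{test:17s} {backend:10s} "
              f"CPU/wall {row['cpu_share']:.1%}  "
              f"wall vs Ogawa {row['wall_vs_ogawa']:.2f}x")
```

A low CPU/wall share, as in the Local Git runs, means most of the wall time was spent waiting rather than computing.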

aghiles commented 8 years ago

Cool... are revisions handled well in Milliways?

pberto commented 8 years ago

We are still testing, but the speed is amazing. Now I will log a couple of questions for @panta.

pberto commented 8 years ago

So, the first question is about size: I think we could get a better size if we tweak the block size. What do you think, @panta?

My second question is about the file. With Milliways we have everything in one file (say archive.mwdb). Now it would be great if this file could be archive.abc so that the repository is not a directory anymore. I understand this may not be possible since this is handled by libgit/git but anyway I want to ask if you have any idea.

panta commented 8 years ago

Well, for starters, speed is not yet amazing imho. I think I can do better, maybe better than Ogawa, the code is not yet optimized for speed :)

Then, regarding block size: we can experiment, but 4k is probably the best size for disk I/O performance. We can improve space utilization with more advanced space-allocation logic, or with a really fast compression algorithm such as LZ4. Keep in mind: when changing the block size it is also necessary to change the B+-Tree B factor (currently 68, see the BTreeFileStorage_Compute_Max_B() function).
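To illustrate why the B factor has to follow the block size, here is a minimal sketch of how many keys fit in one fixed-size node. The node layout (a fixed header, n keys, n+1 child references) and all sizes except the 4k block are assumptions for illustration; this is not the formula used by BTreeFileStorage_Compute_Max_B() and is not expected to reproduce the value 68.

```python
def max_keys_per_node(block_size, key_size, child_ref_size, header_size):
    """Largest n such that header + n keys + (n + 1) child refs fit in one block.

    Hypothetical layout, chosen only to show the dependency on block size.
    """
    # Reserve room for the header and for the extra (n + 1)-th child reference.
    usable = block_size - header_size - child_ref_size
    if usable < key_size + child_ref_size:
        raise ValueError("block too small for even one key")
    return usable // (key_size + child_ref_size)


if __name__ == "__main__":
    # Assumed sizes: 20-byte keys (the size of a raw SHA-1 digest),
    # 4-byte child references, 16-byte node header.
    for block_size in (4096, 8192):
        n = max_keys_per_node(block_size, key_size=20,
                              child_ref_size=4, header_size=16)
        print(f"{block_size}-byte block: up to {n} keys per node")
```

Whatever the real layout is, the capacity of a node scales with the block size, so a B factor tuned for 4k blocks either wastes space or overflows the node when the block size changes.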

About the file: maybe it's possible, but we'd also need to store the refs db inside milliways (quite easy) and probably modify libgit2 to dispense completely with the .git directory (no idea how complex), which also means keeping the .git/config and other amenities inside milliways. We need to investigate this further.

pberto commented 8 years ago

Speed: it would be awesome if we could read faster than Ogawa. Really, it would be insane. Do your best.

Block size: is 4k (and the matching B+ tree factor) the best for I/O on all operating systems? Remember that Windows is the one that suffers the most. On the other hand, with the new SSD storage this gets minimized a lot.

Advanced space allocation / LZ4: whatever can be done fast I am up for it.

File: let's put this on the back burner for now, but maybe you could ask on the git/libgit mailing list to get the info.

panta commented 8 years ago

Ok, the latest commit in panta/pre-alpha/optimizations-1 is about 7x faster than the initial version. In multiverse there will be some other overhead, but it should be significantly faster than before. There are other smaller optimizations possible. Let me know.

pberto commented 8 years ago

I am literally drooling.

pberto commented 8 years ago

Hello Marco,

OK, we now have a cross-platform build. Read/write performance seems good (let's do some profiling later and figure out if things can be improved). Right now we have some caches where the size is actually larger. Can you try it on the optimusBotOgawa.abc? We have 1GB (milliways) vs 768MB (ogawa). Note that this is just one commit; we should gain from the second commit (unless the topology changes completely).

pberto commented 8 years ago

Katana Robot Testing (animated keyframes, but no deform in robot)

24 frames, 2 motion samples. Done in Maya 2015, MacBook Pro with SSD.

| Backend | Write Time | Size 1st commit (Size on Disk) | Size after 2nd commit | Read Time |
| --- | --- | --- | --- | --- |
| HDF5 | 7.4 s | 136.5 MB | 273 MB | 1.4 s |
| Ogawa | 6.4 s | 134.4 MB | 269 MB | 1.1 s |
| Git | 18.2 s | 69 (95.6) MB | | 9.2 s |
| Git Milliways | 9.7 s | 157.3 MB | 157.6 MB | 3.1 s |

Zombie (Full Deform)

96 frames, 2 motion samples. Done in Maya 2015, MacBook Pro with SSD.

| Backend | Write Time | Size 1st commit (Size on Disk) | Size after 2nd commit | Read Time |
| --- | --- | --- | --- | --- |
| HDF5 | 22.0 s | 305.6 MB | 611 MB | 5.9 s |
| Ogawa | 15.9 s | 287.7 MB | 576 MB | 3.8 s |
| Git | 56.6 s | 169.5 (253) MB | 338 (499.6) MB | 15.9 s |
| Git Milliways | 25.8 s | 405.3 MB | 810.5 MB | 8.6 s |

40K cubes (copies, not instances)

1 frame, 1 motion sample. Done in Maya 2015, MacBook Pro with SSD.

| Backend | Write Time | Size 1st commit (Size on Disk) | Read Time |
| --- | --- | --- | --- |
| HDF5 | 139.0 s | 231.5 MB | 960 s |
| Ogawa | 99.0 s | 41.2 MB | 904 s |
| Git | 820.1 s | 57.7 (1350) MB | undone |
| Git Milliways | 173.7 s | 96.4 MB | 994 s |

pberto commented 8 years ago

Added more results.

panta commented 8 years ago

It's definitely better than with the "classic" git backend, except for the worse space utilisation, but I think there is still considerable margin for improvement. Here are some optimization opportunities I've identified:

- hint block and node cache operations with the operation kind (e.g. to avoid reading a block from disk if it is about to be overwritten with new contents)
- explicitly cache the last search result in key-value store operations (since git almost always performs a sequence of has() - get() - put())
- pin ("wire down") in the caches the blocks and nodes that are used every time
- evaluate different cache architectures
- check whether eliminating shared pointers and directly using the actual blocks and nodes could provide benefits
- when using milliways as a backend, skip JSON (and maybe even msgpack) and use our fast binary serialization
- misc minor optimizations
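The second idea in the list, caching the last search result across a has() - get() - put() sequence, could look roughly like this. It is a minimal sketch and not milliways code: `DictStore` and `LastLookupCache` are hypothetical names, the backend is a plain dictionary that counts lookups, and `None` is assumed to mean "key absent".

```python
class DictStore:
    """Stand-in backend that counts how many lookups actually reach it."""

    def __init__(self):
        self._data = {}
        self.lookups = 0

    def find(self, key):
        self.lookups += 1
        return self._data.get(key)  # None means "not present"

    def store(self, key, value):
        self._data[key] = value


_NO_KEY = object()  # sentinel: nothing has been looked up yet


class LastLookupCache:
    """Remember the most recent lookup so has() followed by get() on the
    same key costs only one backend search."""

    def __init__(self, backend):
        self._backend = backend
        self._last_key = _NO_KEY
        self._last_value = None

    def _find(self, key):
        if key != self._last_key:
            self._last_value = self._backend.find(key)
            self._last_key = key
        return self._last_value

    def has(self, key):
        return self._find(key) is not None

    def get(self, key):
        return self._find(key)

    def put(self, key, value):
        self._backend.store(key, value)
        # Keep the cache coherent with what was just written.
        self._last_key, self._last_value = key, value


if __name__ == "__main__":
    backend = DictStore()
    store = LastLookupCache(backend)
    if not store.has("abc123"):
        store.get("abc123")
        store.put("abc123", b"payload")
    print("backend lookups:", backend.lookups)
```

In a real B+ tree the cached item would more likely be the leaf position found by the search than the value itself, so that put() can also reuse it.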

aghiles commented 8 years ago

The problem I see is that the space at the first commit is always larger than Ogawa. That is a bit of a show stopper because 80% of all Alembic assets (and even more) will only have one commit. Am I missing something here?
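To put numbers on this, the sizes from the Mac tables above can be turned into Milliways/Ogawa ratios. This is a small sketch using only figures already posted in this thread; the function name is made up for illustration.

```python
# Sizes in MB, copied from the tables above (Ogawa vs Git Milliways).
FIRST_COMMIT_MB = {
    "Katana Robot": {"Ogawa": 134.4, "Git Milliways": 157.3},
    "Zombie (Full Deform)": {"Ogawa": 287.7, "Git Milliways": 405.3},
    "40K cubes": {"Ogawa": 41.2, "Git Milliways": 96.4},
}

# The 40K cubes test has no second-commit column.
SECOND_COMMIT_MB = {
    "Katana Robot": {"Ogawa": 269.0, "Git Milliways": 157.6},
    "Zombie (Full Deform)": {"Ogawa": 576.0, "Git Milliways": 810.5},
}


def milliways_vs_ogawa(sizes):
    """Return the Milliways/Ogawa size ratio per scene (1.0 means equal)."""
    return {
        scene: mb["Git Milliways"] / mb["Ogawa"]
        for scene, mb in sizes.items()
    }


if __name__ == "__main__":
    for label, sizes in (("1st commit", FIRST_COMMIT_MB),
                         ("2nd commit", SECOND_COMMIT_MB)):
        for scene, ratio in milliways_vs_ogawa(sizes).items():
            print(f"{label}  {scene:22s} {ratio:.2f}x Ogawa")
```

On the first commit Milliways is about 1.17x, 1.41x and 2.34x the Ogawa size for the three scenes; only the robot test ends up smaller than Ogawa after the second commit.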

pberto commented 8 years ago

I have no doubt there is plenty of room for optimization, let's move on to it! :+1:

Some comments:

- even the classic git backend had trouble with production scenes: you can see how it gets close to Ogawa in size in the robot test, and it completely dies in the 40K cubes nightmare.
- Ideally I would like Milliways to be the default writing choice; if the optimizations Marco suggests work nicely, it will be the natural choice. Basically the classic git backend cannot be used for large assets... I would even remove it from the choice of backends in the DCC app UI (but still leave it for conversion purposes).
- personally I think it would be nice to bypass JSON/msgpack and use seriously™ (that is the name, no? :) ). One question though: would it still be possible to abcconvert to a classic git cache afterwards?

pberto commented 8 years ago

Further edits on my comments.

panta commented 8 years ago

Regarding the space utilisation, we can improve there too. I am focused on speed now, but after that I'll tackle this as well.

Yes, seriously is the name, seriously :) And yes, when converting to "classic" git, we have to use the old format, JSON and all (it's too useful to be able to edit a text format from time to time).

pberto commented 8 years ago

@panta according to https://github.com/libgit2/libgit2/issues/3566 we are not using any compression. My opinion is that right now zlib is not even used; we need to experiment with LZ4 or whichever compression makes sense, as this will affect performance.
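One way to start such an experiment is to measure size and time when compressing block by block. The sketch below uses only zlib from the Python standard library, since LZ4 needs a third-party binding; the payload is synthetic (half repetitive, half random bytes), so the numbers it prints say nothing about real Alembic data. The function names and the payload are assumptions for illustration.

```python
import os
import time
import zlib

BLOCK_SIZE = 4096  # the 4k block size discussed earlier in the thread


def compress_blocks(data, level):
    """Compress data one block at a time.

    Returns (total compressed bytes, elapsed seconds).
    """
    start = time.perf_counter()
    total = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        total += len(zlib.compress(data[offset:offset + BLOCK_SIZE], level))
    return total, time.perf_counter() - start


def sample_payload(n_blocks=256):
    """Synthetic stand-in payload: half repetitive, half incompressible."""
    repetitive = (b"\x00\x01\x02\x03" * (BLOCK_SIZE // 4)) * (n_blocks // 2)
    random_part = os.urandom(BLOCK_SIZE * (n_blocks // 2))
    return repetitive + random_part


if __name__ == "__main__":
    payload = sample_payload()
    for level in (0, 1, 6, 9):
        size, seconds = compress_blocks(payload, level)
        print(f"zlib level {level}: {size / len(payload):.2%} of original, "
              f"{seconds * 1000:.1f} ms")
```

Running the same loop over blocks taken from an actual store file, and swapping in an LZ4 binding, would give the comparison asked for here.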

j-cube / milliways

Benchmark #2
