Closed ashwinbhat closed 4 weeks ago
I've added a catch benchmark to repro here Each sample loads about 20 documents, validates them and generates shader for them. When --benchmark-samples 10, we are loading 200 documents.
MaterialXTest.exe "GenShader: Scalabilty Check" --benchmark-samples 10 Filters: GenShader: Scalabilty Check
MaterialXTest.exe is a Catch v2.13.10 host application.
Run with -? for options
-------------------------------------------------------------------------------
GenShader: Scalabilty Check
-------------------------------------------------------------------------------
F:\source\MaterialX-adsk-fork-current\source\MaterialXTest\MaterialXGenGlsl\GenGlsl.cpp(100)
...............................................................................
benchmark name samples iterations estimated
mean low mean high mean
std dev low std dev high std dev
-------------------------------------------------------------------------------
Load documents and ShaderGen 10 1 1.8764 s
196.507 ms 191.65 ms 205.391 ms
10.2555 ms 4.27542 ms 15.2796 ms
4.284 s: GenShader: Scalabilty Check
===============================================================================
All tests passed (88 assertions in 1 test case)
Similar test with 100 samples (2000 documents)
GenShader: Scalabilty Check
-------------------------------------------------------------------------------
benchmark name samples iterations estimated
mean low mean high mean
std dev low std dev high std dev
-------------------------------------------------------------------------------
Load documents, validate and
generate shader 100 1 4.52616 m
2.50697 s 2.49358 s 2.52304 s
74.617 ms 63.377 ms 90.9251 ms
256.122 s: GenShader: Scalabilty Check
===============================================================================
All tests passed (8888 assertions in 1 test case)
Clarifying questions:
These are great notes and observations, thanks @ashwinbhat!
As one initial proposal, perhaps we could add support for passing library documents directly to the shader generator, rather than requiring content and library documents to be merged in advance?
Yes, I believe we could improve many workflows by defining the notion of a library document that is referred to for validation, shader generation, baking and other workflows. I think this might also help with USD integrations where the library document will be registered and referred to as needed. Separating the library document and the material document will be a rather significant API change.
@jstone-lucasfilm I did a quick prototype by introducing an API to allow registering node library. This seems promising i.e. validation and shader generation seem be to ok. I have not run full suite of tests yet, but wanted to ask if there will be more interest in this approach. see https://github.com/autodesk-forks/MaterialX/commit/88cb02fa18ecfc0c638c576a52a8ff81ad453085#diff-446cebc815ec59703ca40bb07af4fe92e68aa357d5e4c3f3e53e3d90f1dc8f54
Running the benchmark test with 1000 samples i.e. 20 x 1000 documents.
benchmark name samples iterations estimated
mean low mean high mean
std dev low std dev high std dev
-------------------------------------------------------------------------------
Load documents, validate and
generate shader 1000 1 5.19299 m
288.903 ms 286.796 ms 291.134 ms
34.9009 ms 33.2086 ms 36.8215 ms
297.767 s: GenShader: Scalabilty Check
Profile report shows the hotspot are no longer the issue.
Wow, that looks very promising, @ashwinbhat! I would probably recommend the name "dataLibrary" instead of "nodeLibrary", but otherwise this definitely looks like a worthwhile proposal for the main repository.
Hey @ashwinbhat , Very interesting what you've found here. Seems the big gain is to reuse the same in memory document for reference definitions.
Note that you will probably want to watch out that definitions which reside in other documents must also be found, and that it may be useful to cache a set / list of libraries instead of just one. As such the search logic could be made more robust by searching in these reference libraries first and if not found then try the local document. Does this sound reasonable ?
Some other thoughts pop to mind: who owns "the" reference librarie(s) ? They are referenced by 1 or more documents but they don't own it. Also if it can be dynamically settable this can cause problems if cleared or set to another library file. I can see it's clean to have a Node API hide the logic but wondering if it should be something more explicitly passable instead? Also, I can see the workflow where a library can be embedded into a saved document would need to be considered as it's no longer intrinsically part of the document -- which is actually probably a good thing.
My final 2.5 cents is that for previous work, Niklas and I added a "scene" concept which encaspulated 1 or more documents + reference libraries but this of concept of course does not even exist here...
Hey @bernardkwok Modifying the search logic to search Data library first and fallback to local sound reasonable. It will probably help with the embedded workflow, though I think we should try to define more clearer workflows and understand why embedding a library in document would be necessary.
Ownership of the data library is undefined. In the current workflow an application can load a set of libraries, import it into a document. It can then load another set of libraries for another document. Though this is flexible I'm not sure how practical this workflow is. By requiring an API that ensures the datalibrary is set once, will help setting a clear state. Is there a clear use-case where applications might want to dynamically set datalibrary between document loads? I thought about the option to pass around the datalibrary but then it would be up to the application to keep track of them. But I guess my key question here is do we foresee a use-case where there are multiple instances dataLibraries?
The latest functionality that @ashwinbhat has added in #2054 should address this performance issue, so I'm closing out this original report.
Thanks for all of your excellent work on this, Ashwin!
We notice scalability issues with scenes containing 500+ MaterialX documents. MaterialX load, validation and code generation takes longer than expected. Current bottleneck seems to be in Document::importLibary which is required for every mtlx file load.