sofwerx / cdb2-concept

CDB modernization

Thoughts on 3D Models in CDB #5

Open · christianmorrow opened this issue 3 years ago

christianmorrow commented 3 years ago

High Level Objective: 3D models should be easy to create, share and maintain

Current CDB is lacking as a transmittal format for models. Why?

Therefore, once models are inside a CDB, they become essentially impossible to edit and maintain.

CDB 2.x Possibilities:

ryanfranz commented 3 years ago

I agree with some of this. I want to add some comments to think about.

CDB 2.0 Possibilities

One thing missing here is the distinction between CDB as a runtime simulation format and CDB as a source repository, which was discussed on the Miro board. For editing models, keeping them together is really important. For vis/sim, splitting them apart or optimizing the layout to help provide determinism in loading is really important.

christianmorrow commented 3 years ago

Ryan - I agree with your review, almost entirely. Here are a few follow-up comments...

PresagisHermann commented 3 years ago

Excellent exchange with already some convergence on ideas. Let me try to recap the high level ideas and suggest a way forward.

Agreement on the High Level Objective: 3D models should be easy to create, share and maintain

Some of the main points at play are:

• Encoding: Can we replace OpenFlight by glTF?

• Better support the editing case

• Can we optimize rendering performance by grouping data better?

• Do we still want vectors to point to models (the CDB 1.x way), or do we want model geometry with attribution (the OWT way)?

• Tiling

• Do we need batching/instancing support?

Suggesting a way forward by splitting into three investigation areas:

jerstlouis commented 3 years ago

• Zip is a possible container – compressed or not. In GeoPackage or outside?

If a GeoPackage is used, could it not itself be the container, with models nicely organized in a models table? Individual models could be compressed (e.g. Draco extension, or zipped glTF).
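
As a concrete illustration of this idea, here is a minimal sketch assuming a hypothetical `models` table; neither the table name nor its columns are defined by GeoPackage or CDB today. A GeoPackage is just an SQLite database, so plain `sqlite3` is enough to show glTF binaries stored as blobs alongside a few descriptive columns:

```python
import sqlite3
from pathlib import Path

# Hypothetical "models" table inside a GeoPackage (an SQLite database).
conn = sqlite3.connect("sample.gpkg")
conn.execute("""
    CREATE TABLE IF NOT EXISTS models (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,      -- model identifier
        lod        INTEGER,            -- level of detail, if stored per LOD
        media_type TEXT NOT NULL,      -- e.g. 'model/gltf-binary'
        data       BLOB NOT NULL       -- GLB payload (optionally Draco-compressed)
    )
""")

# Store one binary glTF (GLB) model as a blob; the file name is a made-up example.
glb_bytes = Path("building_042.glb").read_bytes()
conn.execute(
    "INSERT INTO models (name, lod, media_type, data) VALUES (?, ?, ?, ?)",
    ("building_042", 0, "model/gltf-binary", glb_bytes),
)
conn.commit()
conn.close()
```

Individual blobs could equally hold zipped glTF or Draco-compressed meshes, as suggested above.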

cnreediii commented 3 years ago

@PresagisHermann - Thanks for this excellent summary of the various CDB 2 discussions! This will help frame ongoing discussions.

christianmorrow commented 3 years ago

Follow up:

While we've agreed that backwards compatibility is not a requirement for CDB 2.0, it does make sense to accommodate our accumulated assets: existing 3D models and their textures, the investment in tools (like Creator), and decades of human skills. I DO NOT advocate removing OpenFlight from future CDB; rather, I advocate carefully deciding on an additional format or two to see what the uptake might be.

I foresee an overhaul of the 3D model metadata file, expanding both its contents and its importance:

A runtime may optionally scan metadata files (during initialization or as a low-priority background task) to quickly inventory discovered models and inform itself of platform optimization opportunities - or not.
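
To make that concrete, here is a minimal sketch of such an inventory pass. It assumes hypothetical JSON sidecar files named `<model>.meta.json` with made-up fields (`lod_count`, `vertex_count`, `textures`); the real metadata schema would be whatever the spec defines.

```python
import json
from pathlib import Path

def inventory_models(root: str) -> list[dict]:
    """Walk a model tree and summarize each hypothetical '*.meta.json' sidecar."""
    inventory = []
    for meta_path in Path(root).rglob("*.meta.json"):
        with open(meta_path, "r", encoding="utf-8") as f:
            meta = json.load(f)
        inventory.append({
            "path": str(meta_path),
            "name": meta.get("name"),
            "lod_count": meta.get("lod_count"),        # hypothetical field
            "vertex_count": meta.get("vertex_count"),  # hypothetical field
            "textures": meta.get("textures", []),      # hypothetical field
        })
    return inventory

# A runtime could run this during initialization or as a low-priority background
# task, then use the summary to plan platform-specific optimizations - or skip it.
if __name__ == "__main__":
    for entry in inventory_models("./GTModel"):
        print(entry["name"], entry["lod_count"], entry["vertex_count"])
```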

I oppose using a GeoPackage container for 3D models -- no 3D modeling software reads or writes GeoPackage.

I think I will build up a prototype metadata file, put it together with a bunch of sample files into a zip file, and post it as a detailed illustration of how I am imagining this concept!

ccbrianf commented 3 years ago

@christianmorrow While I agree with your high level objective, it also has to be carefully balanced against this one: 3D models (and all CDB data) must be deterministic to retrieve and process with very low latency on both constrained (ex. handheld) and less constrained (ex. PC server) devices. I am confident we can make strides toward your objectives without totally sacrificing some of the design principles that led to CDB 1.x's (admittedly not always optimal) choices. With respect to this somewhat competing objective, I would like to make the following comments on your points:

CDB 2.X Possibilities comments:

Replies to later comments:

@PresagisHermann I never completely understood the one-file-format-per-dataset rule; it seems more idealistic than utilitarian. I think it would be fine to combine related data, even in differing formats if necessary.

@jerstlouis The container discussion is more akin to what the blob type is for a GeoPackage table entry, which may or may not be a glTF variant. Yes, GeoPackage is a superset container.

kevinbentley commented 3 years ago

@ccbrianf wrote: "3D Models (and all CDB data) must be deterministic to retrieve and process with very low latency on both constrained (ex. handheld) and less constrained (ex. PC server) devices."

In my opinion, that shouldn't be a CDB 2.0 requirement. Nice to have, maybe, but I don't think we'll be able to modernize CDB and do what you want simultaneously.

For your use case, I think you should pre-process the CDB to make it fit your idea of determinism. To me, that's a specialization problem, and the optimizations your IG needs don't necessarily apply to anyone else. There's no reason you can't tile/chop/convert data to fit your needs.

How many IGs are being used that really fly directly off of CDB with nothing in between the visuals and the data? I'm not saying that there are none, but I believe that's a very small percentage of the overall CDB user base.

ccbrianf commented 3 years ago

@kevinbentley You might want to review CDB volume 0 as Carl suggested we all do prior to this effort: https://github.com/opengeospatial/cdb-volume-0/blob/master/clause_6_BackgroundAndInformativeMaterial.adoc

In my opinion, which is really the opinion of the CDB originators (of which I was not a part), as evidenced there (primarily an expression of SOCOM interests at the time), it's not CDB without the determinism requirement. Every application benefits from this if it is at all interested in performance or resource constraints, as most are at some level, short of supercomputers. In the mod/sim arena, without it CDB might as well be NPSI, because all it becomes is standardized source data attribution and metadata arbitrarily wrapped up in a modern OGC container. It will no longer be suitable for the performance use cases for which it was originally intended.

I do, however, continually concede (to be reasonable) that if we think this is somehow too restrictive for the data repository (which it is) or other use cases, and we are content to forgo plug-and-play interoperability (I'm sure @PresagisHermann could provide you with a moderately complete list of CDB runtime capable devices), then we can confine this constraint to applicable usage domains (I actually think it will eventually become most of them). But if we don't have a standardized profile for that which we can mostly agree on, or if every application has to reformat the data into its own proprietary optimized structure, then we have completely reverted and might as well go back to proprietary formats with long offline data preparation times, because that is what it really becomes (but it will all be called CDB, sic). (Trust me, if I were trying to optimize for my use case or IG, I would be representing a completely different viewpoint that would only vaguely resemble the CAE legacy. I'm only trying to defend the original CDB design principles and user base idealistically while still supporting future growth and adaptability.)

I honestly don't see why modernization is opposed to determinism. To me, modernization is about supporting ever increasing content capacity and fidelity, modern/new data formats, and fixing or improving historical design choices, while still maintaining performance and expanding usage domains without excluding existing ones. Can you enumerate how you would define CDB 2.0 in a way that differs from that viewpoint? Back in the SWIG, I stressed that we can't move CDB forward without deciding what CDB is, or needs to become. I'm pretty sure I still haven't heard a clear and concise definition for that, which is partially why it continues to fracture into "profiles" and/or dissensions. Most of the new players have specific desires, but a lot of them are not articulated at all, or are poorly articulated, IMHO. We all tend to see the end we want rather than the principle we are trying to achieve.

Tiling and regular LOD resolutions are not in any way an IG/simulation specific need, but I grant you that having only that representation available is not well suited for maintenance and some other use cases.

jerstlouis commented 3 years ago

I strongly agree with @ccbrianf on this, and disagree that we can't both modernize and maintain determinism. Determinism is the strength of CDB, and consistent tiling of all types of data goes a long way towards achieving it, which in turn facilitates handling the data across the entire spectrum of use cases: production, storage, and dissemination, as well as the constrained edge use cases.

kevinbentley commented 3 years ago

Maybe we should discuss what determinism means. @ccbrianf, when we have talked about GeoPackage, you have raised the concern that you didn't feel that spatially indexed GeoPackage files could be accessed deterministically unless you could guarantee that a single disk IO could give you the features you wanted. So when you talk about determinism, I'm interpreting that as "I always can predict how long it will take to read N features".

If determinism means you have to limit features (whether models or vectors) to a set size/vertex limit, I don't think that is compatible with the modernization we need.

When it comes to consistent tiling, data models, etc. I absolutely agree that we need that. If we're calling that determinism, I think it is a good thing.

One more thing in reference to CDB Volume 0: this is a discussion about CDB 2.0, so I think it's perfectly valid to revisit or ignore past assumptions. We are not looking for backwards compatibility. I believe it's still an open issue whether we should call what we're doing CDB 2.0, or whether it should get an entirely new name. The important part is to meet the modern requirements of SOCOM/NGA/etc.

jerstlouis commented 3 years ago

@kevinbentley Consider multi-resolution tiling, with the LOD of vector / models / imagery selected based on the size those tiles project to on a rendered image. Beyond a certain amount of detail for that LOD, additional vertices or texture resolution do not contribute to a better rendering.

In my opinion the fixed limit should be set at that point. Regarding the 128 vs. 129 vertices argument: at 128 vertices you likely already have far too many vertices, so spilling over to 129 is not a problem in itself; the limit is simply intended to trigger detection of a severe break of the determinism rules.

Also, implementation-wise, OpenGL ES (and WebGL, which is based on it) has fairly low limits, e.g. 16-bit vertex indices, so limiting the amount of data somewhere can help the rendering engine cope with those limits.
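
As a rough illustration of that constraint (my own sketch, not tied to any particular engine or to CDB): 16-bit element indices can address at most 2^16 = 65,536 vertices, so a renderer targeting baseline OpenGL ES / WebGL limits has to split larger meshes into multiple indexed ranges or draw calls.

```python
# Illustrative only: how a 16-bit index budget forces a mesh to be split.
MAX_INDEXED_VERTICES = 2 ** 16  # 65,536 addressable vertices with 16-bit indices

def split_vertex_ranges(vertex_count: int, limit: int = MAX_INDEXED_VERTICES):
    """Yield (start, end) vertex ranges that each fit within a 16-bit index space."""
    start = 0
    while start < vertex_count:
        end = min(start + limit, vertex_count)
        yield (start, end)
        start = end

# Example: a 200,000-vertex model needs four ranges under a 16-bit limit.
print(list(split_vertex_ranges(200_000)))
# [(0, 65536), (65536, 131072), (131072, 196608), (196608, 200000)]
```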

ccbrianf commented 3 years ago

@kevinbentley Aspects of determinism: an application always knows the following:

I do not require a single IO to retrieve the features an application wants, but that is certainly the optimal case. I do want to make sure we consciously decide whenever we significantly increase the number of IOs in CDB 2.0 with respect to what would have been required in CDB 1.x to retrieve the same data, because that heavily impacts application performance. We should be sure the value gained by the compromise is significant enough to warrant the penalty.

It is perfectly valid to revisit everything in CDB as long as we generally agree on the impacts of those changes, both with respect to current usage domains and with respect to how the "core personality" of CDB will differ from its 1.x predecessor. Any changes to the latter, though, should be clearly articulated, with general group consensus, as fundamental design philosophy changes about what CDB is. I don't think it's OK to just ignore those things without such a discussion, some of which has been had, and some not so much.

What is backwards compatibility? We all agree CDB 2.0 will have new data formats and structures, both on disk and likely in its tiling structure, thus breaking backward compatibility. Do we all agree that we don't have to preserve existing application domains in some form, even if that form is a standardized CDB "profile" deviation for that domain? To me, that is an important part of backward compatibility that I don't think CDB 2.0 should sacrifice; otherwise it really should have a new name, because it would no longer be related to the old thing called CDB. Have SOCOM's needs so radically changed that these domains are no longer of concern for CDB 2.0, or have they only added new domains and use cases to the prior list?

It would be helpful if the SOCOM and NGA requirements were spelled out more rather than just assumed, because it might lead to some surprising conclusions. I'm not sure I understand all the NGA use cases, but if it is primarily data warehousing, quality control, enhancement/temporal currency maintenance, etc. and less online streaming access, I wouldn't necessarily require the determinism aspects of CDB for that use case. But given the sheer amount of data being stored and managed, I can certainly see why it might still be worth paying the application complexity cost to access it that way.

@jerstlouis It would be best if we try to avoid discussion of rendering and stick to real world resolution, which is a point of commonality for all usages (and still illustrates your point I believe).

Yes, the vertex limits are already intended to be just as you describe. For 3D model geometry LODs I believe they already are. For GS 3D model tile bundles, I believe CDB is off by approximately 2 quadtree depth levels of tile size subdivision for any given resolution of data (each quadtree level halves the tile edge, so two levels correspond to roughly a factor of four in tile extent). This has been previously and extensively discussed as something to address in the next CDB version.

There are many practical applications for well-chosen determinism limits in data structures; the one Jerome discusses here is only one of a near-infinite number.

cnreediii commented 3 years ago

I have created two new issues: "What is CDB 2.0?" and "Determinism and CDB 2.0".

Please move your discussion on those topics to the new issues (7 and 8):

https://github.com/sofwerx/cdb2-concept/issues/8 https://github.com/sofwerx/cdb2-concept/issues/7

These are both important discussions and we need consensus. Easier to track the discussion by using these two new issues.

cnreediii commented 3 years ago

On this statement: The important part is to meet the modern requirements of SOCOM/NGA/etc.

Well, sure, SOCOM and NGA have $$ and requirements. While the focus of the current work is to meet SOF etc. requirements, there is also an existing large CDB user community that cannot be ignored. Their requirements and use cases will help us frame CDB 2.0 design objectives and requirements. I suspect many of those use cases/requirements are consistent with SOCOM/NGA requirements, but there will also be other requirements, or classes of requirements, that are mandatory for the current user community and that NGA does not care about.

vwhisker commented 3 years ago

The 3D modeling sub-group met yesterday, with some experimentation results presented by Presagis (Anthony and Herm). There is a document on GitHub (https://github.com/sofwerx/cdb2-concept/blob/3Dmodels_prototype/3Dmodel_Prototype/CDB%20and%20OpenFlight_glTF_analysis.docx) that describes the cross-walk between current OpenFlight functionality and glTF. There are a number of existing glTF extensions that mimic some of the features of OpenFlight, but a CDB extension will probably be required. The experimentation performed by Presagis shows a simple OpenFlight model with 3 LODs converted to glTF using the MSFT_lod extension to encode the LOD nodes. This model was successfully opened in Blender; however, Blender cannot interpret the extension, so the hierarchy is incorrect. This experimentation reveals some potential issues with encoding models and then editing them with standard 3D model editors (Blender, Max, Maya, etc.), since the tools are not mature yet. This may pose some challenges when models are edited/changed/updated. Here are some shots from the Miro board:

[Miro board screenshots]
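
For reference, here is an illustrative (hand-written) fragment showing how the public MSFT_lod extension chains LOD nodes in glTF, roughly matching the 3-LOD conversion described above; the node names, mesh indices, and screen-coverage values are made up and are not Presagis's actual output.

```python
import json

# Hypothetical fragment: node 0 carries the MSFT_lod extension pointing at the
# lower-detail nodes (1 and 2); screen-coverage thresholds go in "extras".
gltf_fragment = {
    "extensionsUsed": ["MSFT_lod"],
    "nodes": [
        {
            "name": "model_LOD0",  # highest detail
            "mesh": 0,
            "extensions": {"MSFT_lod": {"ids": [1, 2]}},
            "extras": {"MSFT_screencoverage": [0.5, 0.2, 0.05]},
        },
        {"name": "model_LOD1", "mesh": 1},  # medium detail
        {"name": "model_LOD2", "mesh": 2},  # lowest detail
    ],
}

print(json.dumps(gltf_fragment, indent=2))
```

A tool that does not understand the extension just sees loosely related nodes rather than an LOD chain, which is consistent with the incorrect hierarchy observed in Blender.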