KhronosGroup / ANARI-Docs

ANARI Documentation

Massive spatial fields #119

Open LDeakin opened 8 months ago

LDeakin commented 8 months ago

ANARI requires the data for a structuredRegular spatial field to be supplied as an array. This precludes implementations that render massive volumes by streaming data from disk.

I could see this being addressed by allowing the data parameter to also accept a STRING holding a path to the data, leaving it to the implementation to retrieve it. EDIT: Or a new type of spatial field. Of course, there are many volumetric data file formats, and realistically there would be little overlap between the formats supported by different implementations (if they support any at all), so I don't know whether this is a good fit for the ANARI API. Thoughts?

jeffamstutz commented 8 months ago

Hi, thanks for starting this thread! Out-of-core rendering is something that ANARI can express through new extension(s), but I want to shape the perspective a bit about how that could (should?) be approached.

As a preface -- it's worth noting that any ANARI back end is always free to ship "vendor extensions", which just means an extension specified and shipped by a single vendor that doesn't (yet) live in the official spec. There are plenty of examples of this; I only bring it up to say that if an ANARI back end and application agree on an extension, the "work can get done" without the spec standing in the way. In fact, existing practice is the fastest way to get something standardized, so getting that working first is very much encouraged! But I suspect the question is aimed more at understanding how such an extension might be defined, and what other implementations might think about it, such that it's defined in a way that motivates them to implement it...so I'll roll with that assumption for this discussion.

In general, I think the best place to start is with understanding the context of why a feature ought to exist -- not from a perspective of saying "yes" or "no" to an idea, but rather making sure existing features are truly exhausted such that the necessary extensions (or enhancements to existing ones) emerge and thus have a describable context in which they ought to be used.

Out-of-core volume rendering is a very real use case, though I don't observe users relying on it to be common (no examples come to mind, though I acknowledge such renderers exist). I think out-of-core rendering is the key to scoping disk streaming, as the spirit of ANARI's design is for renderers to avoid file I/O. In other words, this must be strictly differentiated from users who have come to ANARI saying "I expect my renderer to parse data files" -- that expectation comes from a misunderstanding of the level of abstraction ANARI implements compared to a full-on scene management library (i.e. a scene graph or data management library), because parsing things like OBJ files is truly independent of the system that renders triangle meshes. Thus if an extension brings file I/O into the API, I argue it ought to do so only because there is no other option for the particular use case -- namely out-of-core rendering.

The next question this begs is "where is the right place to express a file-based resource?". I personally tend to start with the most general place, then move to a more specific place if it doesn't truly fit in the general spot. In this case, I think a "file stream array" would be quite interesting -- if you had something like `ANARIArray3D anariNewFileStreamArray3D(ANARIDevice, FILE *, size_t offset, /*etc...*/)`, that would give you an array handle which could be plugged into the existing places `ANARIArray3D` would go. This avoids duplicating the definition of structuredRegular spatial fields, and (hopefully?) even stays out of the business of dealing with different file formats, because it's ultimately just looking for a brick of bytes (on disk instead of in memory). I think this has the strong advantage of accomplishing the desired feature (out-of-core structured volume rendering) without moving ANARI toward "scene file parsing" objects that push implementations into the fragmented problem of parsing a zillion data formats (those are best to live above the API and use ANARI instead).

I'll stop there for now, but I think it's a good technical discussion to have with an interesting use case behind it!

Do you have an existing renderer that does out-of-core structured volume rendering and want to possibly use ANARI as its interface?

LDeakin commented 8 months ago

Thanks for the insight; a new way to create an ANARIArray3D streamed from disk makes sense. Although, I am not sure simply having the bytes on disk instead of in memory would be very practical. Suitable array storage formats for out-of-core rendering support things like ...

Also, the data may be split over many files. I don't see a path outside of letting the back end handle the I/O freely.

Do you have an existing renderer that does out-of-core structured volume rendering and want to possibly use ANARI as its interface?

Yes, I have one and I've been thinking about adding ANARI support before I release it publicly.

jeffamstutz commented 7 months ago

Yes, I have one and I've been thinking about adding ANARI support before I release it publicly.

Great! I (and others) would love to help make that happen. Even if you initially start with some custom extensions that only your renderer supports, it's still very helpful to get into the "ANARI ecosystem".

Although, I am not sure simply having the bytes on disk instead of memory would be very practical. Suitable array storage formats for out-of-core rendering support things like ...

I think it would be good for me (and others on the WG) to learn more details about the problems your use case(s) currently face. There are a number of ways to solve the problem of rendering huge data, and we'd love to explore them in depth within the context of ANARI's abstractions, as this overlaps with the related problem of data-parallel distributed rendering (several implementations are already working on this).

It would be best to invite you to join our Advisory Panel so that any deeper technical discussions happen within Khronos's protected IP framework. That framework really only governs what text makes it into the spec, but ideally whatever we do in practice in software (where we are totally free to do whatever we want) would eventually make it into the spec as well. I'll start an email thread with you on those details.