ashawkey / stable-dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Apache License 2.0
7.99k stars 710 forks

What's the state of the art on Nerf compression? #185

Open LifeIsStrange opened 1 year ago

LifeIsStrange commented 1 year ago

@ashawkey friendly ping. We are reaching a point where text/2D-to-3D generated assets are of sufficiently high quality to be used in indie video games. The practicality issue is at least twofold: 1) the performance of the renderer, which I believe is a mostly solved problem (at least I've seen many papers claiming > 30 FPS); 2) the size of the 3D asset. I believe that currently generated 3D assets of relatively simple objects are considerably large (MB? GB?), but I also believe this is highly compressible. For example, in many games one can afford to have "hollow" solid objects where only the surface is drawn and the points inside the object are left empty, since they are not visible. NeRF 3D generators fill the interior of objects, but I believe those mostly useless points could be removed as a post-processing step.
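The hollowing idea above can be sketched as a post-processing pass over a sampled occupancy grid: keep only voxels that are occupied but adjacent to free space, and drop fully enclosed interior voxels. A minimal NumPy illustration, assuming the NeRF density field has already been thresholded into a binary grid (the grid resolution and the solid-cube example are hypothetical, just to show the count of removed voxels):

```python
import numpy as np

def hollow_out(occ: np.ndarray) -> np.ndarray:
    """Keep only surface voxels of a binary 3D occupancy grid.

    A voxel is 'surface' if it is occupied and at least one of its
    six face neighbours is empty; fully enclosed interior voxels
    are discarded.
    """
    # Pad with empty space so grid-boundary voxels count as surface.
    p = np.pad(occ, 1, constant_values=False)
    interior = (
        occ
        & p[:-2, 1:-1, 1:-1] & p[2:, 1:-1, 1:-1]   # x neighbours
        & p[1:-1, :-2, 1:-1] & p[1:-1, 2:, 1:-1]   # y neighbours
        & p[1:-1, 1:-1, :-2] & p[1:-1, 1:-1, 2:]   # z neighbours
    )
    return occ & ~interior

# Toy example: a solid 8^3 cube sampled on a 16^3 grid.
occ = np.zeros((16, 16, 16), dtype=bool)
occ[4:12, 4:12, 4:12] = True
surface = hollow_out(occ)
print(occ.sum(), surface.sum())  # 512 -> 296 (the inner 6^3 block is dropped)
```

A real pipeline would more likely run marching cubes on the density field to get a surface mesh directly, but the voxel version shows how much of a "solid" NeRF is invisible interior.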

There are other kinds of optimisations that might be possible, such as smoothing artifacts and conditional rendering of pixels based on clipping/occlusion. One might even try to train a neural network for the task of compressing a NeRF 3D asset without reducing its perceived accuracy or external 3D boundaries, or simply develop a classical optimizer.

So what's the state of the research on making generated 3D assets actually usable in video games at a reasonable size?

Ainaemaet commented 1 year ago

There are plenty of tools available that will hollow out a model if you can't do it manually. Personally though, unless for some reason you would rather spend long periods of time letting the computer generate background assets than bang them out yourself, I would wait for Nvidia Picasso.

Who knows tho? This is all happening so very fast.

LifeIsStrange commented 1 year ago

@Ainaemaet Thank you! I had not heard of Picasso and it looks very promising from a user experience standpoint, though I wonder if the accuracy of https://fantasia3d.github.io/ is better.

unless for some reason you would rather spend long periods of time letting the computer generate

Fantasia3D currently takes 30-35 minutes "on average" to generate a 3D asset on an 8× RTX 3090 GPU setup; I wonder how fast Picasso will be. I also believe that researchers have not yet attempted to improve inference/generation time; for example, they might benefit significantly from DeepSpeed, including INT8 or maybe even INT4 precision, and from LoRA or NeRF-specific optimisations.
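For intuition on the INT8 idea: post-training quantization of MLP weights boils down to storing an 8-bit integer tensor plus a scale. A hedged NumPy sketch of generic symmetric per-tensor quantization (this is a textbook scheme for illustration, not what DeepSpeed actually does internally; the 256×256 layer is hypothetical):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor INT8 quantization: w ≈ scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
# A hypothetical 256x256 NeRF MLP weight matrix in fp32.
w = rng.normal(scale=0.1, size=(256, 256)).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(dequantize(q, s) - w).max()
print(f"4x smaller (fp32 -> int8), max abs error {err:.5f}")
```

Storage drops 4× versus fp32, and the rounding error per weight is bounded by half the scale; whether that (or INT4) hurts rendered quality is exactly the kind of thing that would need measuring on real NeRF checkpoints.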

Note: as for tools for hollowing out, it seems Meshmixer is the most popular, although the process is not fully automatic: https://www.youtube.com/watch?v=gAPqYrmheV0