yisuanwang / Ultraman

Ultraman: Single Image 3D Human Reconstruction with Ultra Speed and Detail
https://air-discover.github.io/Ultraman/

Release date #1

Open NeuroLord opened 3 months ago

NeuroLord commented 3 months ago

I'd like to know at least an approximate release time: a week, a month, a year? The demos look very good, and I would very much like to try the model on my own data.

By the way, what are the requirements of your code? Would it be possible to generate an avatar on a CPU, or would a Tesla with 40 or 80 GB of memory be required?

yisuanwang commented 3 months ago

Thank you for your interest. Due to the heavy academic commitments of our team members, we expect to upload the full code and a working Colab within two weeks. @tomorrow1238

We are using a 24 GB 3090 for mesh reconstruction and an 80 GB A100 for generating multi-view images and textures. The GPU requirement for the latter comes from the SD-XL image generation step.

To add a problem we encountered: the SD-XL multi-view image generation ran out of memory (OOM) on a 24 GB 3090, so we moved that step and the subsequent ones to an 80 GB A100. However, based on our experiments with the intermediate steps, we recommend a GPU with 32 GB of memory or more.

Also, because access to GPT-4V is restricted in some regions, we will use BLIP for VQA in the open-source code.

Thanks again for your interest :)

NeuroLord commented 3 months ago

> We are using a 24G 3090 for mesh reconstruction tasks and an 80G A100 for generating multi-view images and textures. The GPU limitation of the latter is a result of the SD-XL step of generating images.
>
> To add to a problem we encountered: since the part about sdxl generating multi-view images would show oom when run on a 24G 3090, we moved the subsequent steps to an 80G A100. However, based on the experiments in the intermediate steps, it is recommended that a GPU of 32GB or more is required.

Is it possible to perform all the steps on a CPU in a reasonable amount of time, perhaps overnight or within 24 hours?

yisuanwang commented 3 months ago

Reconstructing the body mesh and generating the images do not seem to be feasible on a CPU, unless you implement them through an API. The subsequent texturing step can be done on a CPU.
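
For readers who want to check their own hardware, the guidance in this thread can be summarized as a small lookup. This is only a sketch, not part of the Ultraman code: the stage names, the `plan_stages` helper, and the rule for when texturing uses the GPU are hypothetical, and the thresholds are simply the figures quoted above (24 GB for mesh reconstruction, 32 GB or more recommended for the SD-XL multi-view step, CPU possible for texturing).

```python
from typing import Dict, Optional

# Figures quoted in the thread; rough guidance, not measured minimums.
MESH_RECON_GB = 24   # mesh reconstruction was run on a 24 GB 3090
MULTIVIEW_GB = 32    # SD-XL multi-view generation: 32 GB or more recommended


def plan_stages(gpu_memory_gb: Optional[float]) -> Dict[str, str]:
    """Return where each (hypothetically named) stage could run.

    gpu_memory_gb is the card's nominal memory in GB, or None for no GPU.
    Values are "gpu", "cpu", or "unsupported".
    """
    mem = gpu_memory_gb or 0.0
    return {
        # Reported as not feasible on a CPU.
        "mesh_reconstruction": "gpu" if mem >= MESH_RECON_GB else "unsupported",
        # Reported to run out of memory on a 24 GB card.
        "multiview_generation": "gpu" if mem >= MULTIVIEW_GB else "unsupported",
        # Assumption: use the GPU only if it also fits the multi-view step;
        # otherwise fall back to the CPU, which the thread says is possible.
        "texturing": "gpu" if mem >= MULTIVIEW_GB else "cpu",
    }


if __name__ == "__main__":
    for memory in (None, 24, 80):
        print(memory, plan_stages(memory))
```

Under these assumptions, a 24 GB card covers only mesh reconstruction on the GPU, and a machine without a GPU is limited to the texturing step.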