threedle / text2mesh

3D mesh stylization driven by a text input in PyTorch
https://threedle.github.io/text2mesh/
MIT License
925 stars, 129 forks

Load different clip models, jit option #24

Closed: factoryofthesun closed this 2 years ago

factoryofthesun commented 2 years ago

Update the parser options to accept any CLIP model, with an optional JIT flag. The render image resolution is now set according to the selected model's input specification. Note: as new CLIP models become public, it will probably be more efficient to replace the hand-written render preprocessing (e.g. setting the render resolution) with the preprocess function returned when the CLIP model is loaded. We can then set the default render resolution to the current maximum across the models, to prevent too much blurring from upsampling.
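A minimal sketch of the idea, not the code in this PR: the flag names `--clipmodel` and `--jit`, the table `CLIP_INPUT_RES`, and the helpers `build_parser`, `render_resolution`, and `load_clip` are assumptions made for illustration. The resolutions listed are the input sizes of the publicly released OpenAI CLIP checkpoints, and `clip.load(name, device=..., jit=...)` is the loader from the `clip` package, which returns the model together with its preprocess function.

```python
import argparse

# Input resolution (pixels per side) of each public OpenAI CLIP checkpoint.
# Hypothetical lookup table for illustration; the PR may derive this differently.
CLIP_INPUT_RES = {
    "RN50": 224,
    "RN101": 224,
    "RN50x4": 288,
    "RN50x16": 384,
    "RN50x64": 448,
    "ViT-B/32": 224,
    "ViT-B/16": 224,
    "ViT-L/14": 224,
    "ViT-L/14@336px": 336,
}

# Default render resolution: the current maximum across the models, so that
# renders are downsampled rather than upsampled for any known model.
DEFAULT_RENDER_RES = max(CLIP_INPUT_RES.values())


def build_parser():
    """Parser exposing the CLIP model name and a JIT switch (assumed flag names)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--clipmodel", choices=sorted(CLIP_INPUT_RES),
                        default="ViT-B/32")
    parser.add_argument("--jit", action="store_true")
    return parser


def render_resolution(clip_name):
    """Resolution to render at for a model; falls back to the default maximum."""
    return CLIP_INPUT_RES.get(clip_name, DEFAULT_RENDER_RES)


def load_clip(clip_name, jit=False, device="cpu"):
    """Load a CLIP model; requires the `clip` package to be installed."""
    import clip  # imported lazily so the rest of the sketch runs without it
    # clip.load returns (model, preprocess); preprocess resizes and normalizes
    # images to the resolution the model expects.
    return clip.load(clip_name, device=device, jit=jit)
```

With this sketch, `build_parser().parse_args(["--clipmodel", "RN50x16", "--jit"])` selects RN50x16 with JIT enabled, and `render_resolution("RN50x16")` gives 384. An unrecognized model name falls back to `DEFAULT_RENDER_RES`, which is 448 for the table above.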

Results on the ninja mesh for each model, using the same options and seed:

RN50

Screen Shot 2022-08-26 at 12 00 09 PM

RN101

Screen Shot 2022-08-26 at 12 00 23 PM

RN50x4

Screen Shot 2022-08-26 at 12 00 32 PM

RN50x16

Screen Shot 2022-08-26 at 12 00 47 PM

RN50x64

Screen Shot 2022-08-26 at 12 00 57 PM

ViT-B/32

Screen Shot 2022-08-26 at 12 01 08 PM

ViT-B/16

Screen Shot 2022-08-26 at 12 01 16 PM

ViT-L/14

Screen Shot 2022-08-26 at 12 01 24 PM

ViT-L/14@336px

Screen Shot 2022-08-26 at 12 01 31 PM