Thanks for your wonderful work! I have a question: you mention that the first-stage Hi3D model is trained from the pre-trained SVD. However, the original SVD uses FPS and bucket_id as additional conditions together with the timestep embedding, and the dimension is 768. In the configuration file, I see that you replaced these two conditions with elevation and aesthetic conditions but still start from the same label embedding. Does this work well in practice? Thanks!
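To make the question concrete, here is a minimal NumPy sketch of how I understand this kind of conditioning. It is not taken from the Hi3D or SVD code. The function names, the 256-dim-per-scalar split, the third scalar slot, and all numeric values are my assumptions for illustration. The point is that swapping the scalars keeps the 768-dim shape the label embedding expects, but changes the meaning and value range of each slot.

```python
import numpy as np

def sinusoidal_embedding(values, dim=256, max_period=10000.0):
    """Sinusoidal embedding of scalar values, shape (batch, dim).

    Hypothetical helper: mirrors the usual timestep-embedding recipe,
    not the exact SVD or Hi3D implementation.
    """
    values = np.asarray(values, dtype=np.float64).reshape(-1)
    half = dim // 2
    # Geometrically spaced frequencies from 1 down to roughly 1 / max_period.
    freqs = np.exp(-np.log(max_period) * np.arange(half) / half)
    args = values[:, None] * freqs[None, :]
    return np.concatenate([np.cos(args), np.sin(args)], axis=-1)

def build_vector_condition(scalars, dim_per_scalar=256):
    """Embed each scalar condition separately and concatenate the results."""
    parts = [sinusoidal_embedding(v, dim_per_scalar) for v in scalars]
    return np.concatenate(parts, axis=-1)

# Assumed SVD-style slots: fps, motion bucket id, plus one more scalar.
svd_style = build_vector_condition([[6.0], [127.0], [0.02]])

# Assumed swapped slots: elevation and aesthetic score replace the first two.
# The values 10.0 and 6.0 are made up for illustration.
swapped = build_vector_condition([[10.0], [6.0], [0.02]])

print(svd_style.shape, swapped.shape)
```

Both vectors have the same width, so the pre-trained label embedding can consume either one without a shape change. My question is whether reusing those pre-trained weights is still a good initialization when the input slots now carry different quantities.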