EnVision-Research / LucidDreamer

Official implementation of "LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching"
MIT License
749 stars 32 forks

Some questions about the algorithm. #2

Closed ty625911724 closed 11 months ago

ty625911724 commented 11 months ago

Hi, excellent work with impressive results.

While reading the paper, I had some questions.

In Algorithm 1, lines 6 and 7 use the notation $j$, but it is never defined. I guess $j = i \delta_S$. Is that right?

Besides, line 3 claims that $t \sim U(1,1000)$. However, it seems that $t$ should be $n \delta_S + \delta_T$. For example, if $\delta_S=200$ and $\delta_T=50$, then $t$ can only be 50, 250, 450, 650, or 850, rather than any value from $U(1,1000)$. Is this a problem?
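To make the concern concrete, here is a minimal sketch that enumerates the values $t = n \delta_S + \delta_T$ can take within $[1, 1000]$. The function name `reachable_timesteps` and the `t_max` parameter are hypothetical names chosen for illustration; they are not taken from the paper or the repository.

```python
def reachable_timesteps(delta_s: int, delta_t: int, t_max: int = 1000) -> list[int]:
    """Return every t = n * delta_s + delta_t with n >= 0 and t <= t_max.

    Illustrative helper only (hypothetical name, not from the LucidDreamer code).
    """
    # range() starts at delta_t (n = 0) and steps by delta_s until t_max is exceeded.
    return list(range(delta_t, t_max + 1, delta_s))


print(reachable_timesteps(200, 50))  # [50, 250, 450, 650, 850]
```

With $\delta_S=200$ and $\delta_T=50$ only five timesteps are reachable, which is why a fixed grid looks inconsistent with $t \sim U(1,1000)$.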

Thanks! :)

AbnerVictor commented 11 months ago

Thank you for pointing out the problems.

Indeed, there are some typos in Algorithm 1. And yes, $j = i \delta_S$; we will fix that ASAP.

For question 2, in our implementation we actually sample $t \sim U(1, 1000)$ first, then derive $n$ from $\delta_S$ and $\delta_T$. With $\delta_S=200$ and $\delta_T = 50$, we randomly sample $t \sim U(1, 1000)$; for example, when $t = 123$, then $s = 73$. Since $\delta_T = 200 > 73$, here we dynamically adapt $\delta_T = 73$ in practice. In the extreme case when $t < \delta_T$, we adjust $\delta_T = 0$ and $\delta_T = t$.

I hope this answers your questions. Feel free to ask anything about our work.

ty625911724 commented 11 months ago

Thanks a lot for your help!

ty625911724 commented 11 months ago

Maybe there are some typos in the answer. I guess the right answer is: "When $\delta_S=200$ and $\delta_T=50$, we randomly sample $t \sim U(1,1000)$, for example, when $t=123$, then $s=73$. Since $\delta_S=200>73$, here we dynamically adapt $\delta_S=73$ in practice."
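The corrected rule can be sketched in a few lines. This is only one reading of the thread, not the repository's actual code. It assumes $s = t - \delta_T$ (so $t = 123$ gives $s = 73$) and that the extreme case $t < \delta_T$ is handled by shrinking $\delta_T$ to $t$, which leaves $s = 0$ and $\delta_S = 0$. The function name `adapt_intervals` and the returned tuple layout are hypothetical.

```python
import random


def adapt_intervals(t: int, delta_s: int = 200, delta_t: int = 50) -> tuple[int, int, int]:
    """Return (s, effective delta_s, effective delta_t) for a sampled timestep t.

    Assumptions (not confirmed by the repository):
      * s = t - delta_t
      * delta_t is clamped to t when t < delta_t
      * delta_s is clamped to s when s < delta_s
    """
    delta_t_eff = min(delta_t, t)   # extreme case: t < delta_t
    s = t - delta_t_eff             # never negative after the clamp above
    delta_s_eff = min(delta_s, s)   # e.g. 200 > 73, so use 73
    return s, delta_s_eff, delta_t_eff


# Sample t uniformly from the integers 1..1000, then derive the intervals.
t = random.randint(1, 1000)
print(t, adapt_intervals(t))

print(adapt_intervals(123))  # (73, 73, 50)
```

Under these assumptions every $t$ in $[1, 1000]$ yields a valid pair of intervals, so $t$ does not need to lie on the grid $n \delta_S + \delta_T$.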

Thanks for your patient answer.

AbnerVictor commented 11 months ago

Yes, you’re right; sorry for the typo. Thank you very much for pointing it out.