Closed jongjyh closed 1 year ago
Hi,
Both PPLM and ITI operate on the model's internal representation space. However, PPLM obtains its gradient at decoding time by backpropagating from a text classifier, whereas ITI uses a learnt, fixed direction $\theta$.
$\theta$ is normalized to lie on the unit sphere, so we use $\alpha$ to calibrate the strength of the intervention.
Unlike PPLM, ITI is a one-pass update: no iterative refinement is required.
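To make the one-pass update concrete, here is a minimal numpy sketch of the intervention described above: the (unit-norm) direction $\theta$ is added to a head's activation, scaled by $\alpha \sigma$. The function name, the toy dimensions, and the placeholder values for `theta`, `sigma`, and `alpha` are illustrative, not taken from the ITI codebase.

```python
import numpy as np

def iti_intervention(head_activation, theta, sigma, alpha):
    """One-pass ITI-style edit (illustrative sketch).

    Shifts the head activation along the learnt direction theta,
    scaled by alpha * sigma. theta is re-normalized to unit length,
    so alpha alone calibrates the strength of the intervention.
    """
    theta = theta / np.linalg.norm(theta)  # project onto unit sphere
    return head_activation + alpha * sigma * theta

# Toy example with a 4-dimensional head activation.
rng = np.random.default_rng(0)
x = rng.standard_normal(4)          # original activation
theta = rng.standard_normal(4)      # stand-in for the learnt direction
sigma = 1.5                         # stand-in scale statistic
alpha = 15.0                        # intervention strength

x_edited = iti_intervention(x, theta, sigma, alpha)
```

Because the shift is a fixed vector added once per forward pass, there is no inner optimization loop, unlike PPLM's per-step gradient ascent.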
Hi,
Excellent work! I'd like to ask about the intuition behind the ITI update (i.e., Equation (2) in the paper). Does it have any connection to PPLM, which updates the hidden state according to the gradient of the log-likelihood from a classifier? Why does it use $\sigma \cdot \theta$ for the update? I couldn't find an explicit explanation in the paper.
Also, does ITI need iterative updates like PPLM does?
Thanks in advance. :)