adws2 closed this issue 1 month ago
@TaoSunVoyage could you please take a look?
@adws2 For this part, I followed the implementation in the original polytropon repository: https://github.com/McGill-NLP/polytropon/blob/d567ea838cb8b76b75c5e3135ac1a132ec77ebbc/src/polytropon/adapters.py#L159
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
I don't understand the implementation of the Gumbel-Sigmoid part.
How does Eq. 2 in the Poly paper turn into the line below?
skill_logits = RelaxedBernoulli(temperature=1., logits=skill_logits).rsample()
Also, isn't z_{ij} binary? It doesn't seem to be binary in the PEFT code.
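
For anyone landing here with the same two questions, a rough sketch that may help. It assumes Eq. 2 is the usual Gumbel-sigmoid (binary Concrete) relaxation, i.e. sigmoid((logit + log u - log(1 - u)) / temperature) with u ~ Uniform(0, 1), which is, as far as I can tell, also what `RelaxedBernoulli(...).rsample()` computes. The helper names below (`gumbel_sigmoid_sample`, `hard_sample`) are made up for illustration and are not taken from PEFT or polytropon. The sketch also shows why the sampled value is not binary: a relaxed sample lies strictly between 0 and 1, and only becomes 0/1 if you threshold it or push the temperature towards 0.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gumbel_sigmoid_sample(logit: float, temperature: float = 1.0, rng=random) -> float:
    """One relaxed Bernoulli draw (hypothetical helper, not PEFT code)."""
    # u ~ Uniform(0, 1), clamped away from 0 and 1 so both logs stay finite.
    u = min(max(rng.random(), 1e-6), 1.0 - 1e-6)
    # log(u) - log(1 - u) is Logistic(0, 1) noise, i.e. the difference
    # of two independent Gumbel(0, 1) variables.
    logistic_noise = math.log(u) - math.log1p(-u)
    return sigmoid((logit + logistic_noise) / temperature)


def hard_sample(logit: float, temperature: float = 1.0, rng=random) -> float:
    """Threshold the relaxed draw at 0.5 to get an actual 0/1 value."""
    return 1.0 if gumbel_sigmoid_sample(logit, temperature, rng) > 0.5 else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    # temperature = 1: continuous values in (0, 1), not binary.
    print([round(gumbel_sigmoid_sample(0.5, 1.0, rng), 3) for _ in range(5)])
    # Small temperature: values are pushed close to 0 or 1.
    print([round(gumbel_sigmoid_sample(0.5, 0.05, rng), 3) for _ in range(5)])
```

Thresholding at 0.5 gives a 1 with probability sigmoid(logit), so the hard version is an ordinary Bernoulli draw; the relaxed version is what keeps the sample differentiable with respect to the logits.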