yumeng5 / TopClus

[WWW 2022] Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations

Question on AE pretraining part #3

Closed: yjyoo3312 closed this issue 1 year ago

yjyoo3312 commented 1 year ago

https://github.com/yumeng5/TopClus/blob/01e22fb73262bc45d361ec9165bdadbd929ac9a5/src/trainer.py#L79

Thank you for sharing the code. As far as I understand, this stage only trains the autoencoder parameters, so it seems that

```python
Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=self.args.lr)
```

should be changed to

```python
Adam(filter(lambda p: p.requires_grad, model.ae.parameters()), lr=self.args.lr)
```

If that is not the case, please let me know.

yumeng5 commented 1 year ago

Hi,

Your understanding is correct (only the autoencoder part is trained). However, the two versions are actually equivalent: the BERT backbone parameters are frozen (`requires_grad` is set to `False`, as shown below), so they will not be updated by the optimizer anyway.

https://github.com/yumeng5/TopClus/blob/01e22fb73262bc45d361ec9165bdadbd929ac9a5/src/model.py#L59-L60
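To make the equivalence concrete, here is a minimal sketch (using a toy stand-in model, not the actual TopClus classes) showing that once the backbone is frozen, filtering on `requires_grad` over all parameters selects exactly the autoencoder parameters:

```python
import torch.nn as nn
from torch.optim import Adam

# Toy stand-in for the model (hypothetical classes for illustration):
# a frozen backbone plus a trainable autoencoder head registered as `ae`.
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(768, 768)  # plays the role of BERT
        self.ae = nn.Sequential(nn.Linear(768, 100), nn.Linear(100, 768))
        # Freeze the backbone, as done in model.py lines 59-60
        for p in self.backbone.parameters():
            p.requires_grad = False

model = ToyModel()

# The requires_grad filter over all parameters keeps exactly the
# autoencoder parameters, because the frozen backbone fails the check.
trainable_all = [p for p in model.parameters() if p.requires_grad]
trainable_ae = list(model.ae.parameters())
assert len(trainable_all) == len(trainable_ae)
assert all(a is b for a, b in zip(trainable_all, trainable_ae))

# So this optimizer updates only the autoencoder, the same as passing
# model.ae.parameters() directly.
optimizer = Adam(trainable_all, lr=1e-3)
```

And even if the frozen parameters were passed to the optimizer, PyTorch optimizers skip parameters whose `.grad` is `None`, so they would still never be updated.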

Thanks, Yu

yjyoo3312 commented 1 year ago

Ah, I missed lines 59-60. Thank you for the fast reply!