moon-hotel / TransformerTranslation

A Transformer Framework Based Translation Task
135 stars, 37 forks

Why is the learning rate 0 in the source code but not in the documentation? Why was it removed? #10

Closed gzzyyxh closed 7 months ago

gzzyyxh commented 7 months ago

At this location:

    loss_fn = torch.nn.CrossEntropyLoss(ignore_index=data_loader.PAD_IDX)

    optimizer = torch.optim.Adam(translation_model.parameters(),
                                 lr=0.,
                                 betas=(config.beta1, config.beta2), eps=config.epsilon)

Lix1993 commented 7 months ago

https://github.com/moon-hotel/TransformerTranslation/blob/b265526093d4c96fc859dde0c5cfebda94dd1563/train.py#L106

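This is what the source code actually uses. The learning rate is not 0; the value you quoted is only the one the optimizer is initialized with.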
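
To illustrate the point, here is a minimal sketch of how an optimizer created with `lr=0.` can still train with a non-zero learning rate. It assumes a warmup-then-decay schedule of the kind described in "Attention Is All You Need"; the function name `noam_lr` and its defaults (`d_model=512`, `warmup_steps=4000`, `factor=1.0`) are hypothetical and are not taken from this repository. The list `param_groups` is a plain-Python stand-in for `optimizer.param_groups`, which PyTorch optimizers expose as a list of dicts with an `'lr'` key.

```python
def noam_lr(step, d_model=512, warmup_steps=4000, factor=1.0):
    """Hypothetical warmup-then-decay schedule (assumed, not the repo's code)."""
    step = max(step, 1)  # guard against 0 ** -0.5 on the first call
    return factor * d_model ** -0.5 * min(step ** -0.5,
                                          step * warmup_steps ** -1.5)


# Stand-in for optimizer.param_groups right after construction with lr=0.
param_groups = [{"lr": 0.0}]

lr_history = []
for step in range(1, 6):
    lr = noam_lr(step)
    # Overwrite the placeholder value before each parameter update,
    # as one would do with `for g in optimizer.param_groups: g["lr"] = lr`.
    for group in param_groups:
        group["lr"] = lr
    lr_history.append(param_groups[0]["lr"])
```

With a pattern like this, the `lr=0.` passed to `torch.optim.Adam` is only a placeholder: it is replaced before every optimizer step, so the value seen at construction never drives an update.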