lucidrains / routing-transformer

Fully featured implementation of Routing Transformer
MIT License

Report an error in the training example of enwik8_simple #16

Closed JunZhan2000 closed 3 years ago

JunZhan2000 commented 3 years ago

I think the model initialization on line 39 is missing a local_attn_window_size parameter. Passing n_local_attn_heads without local_attn_window_size raises an error. I also suggest that the author re-run the other examples, since some of them still seem to have problems. I believe this would improve the project.
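For illustration, here is a minimal sketch of the kind of check that would catch this before the model is built. The helper `check_local_attn_kwargs` is hypothetical and not part of routing_transformer; the two parameter names are the ones mentioned above, and the values are placeholders rather than the settings used in the example script. It also assumes n_local_attn_heads is a plain integer.

```python
# Hypothetical helper (not part of routing_transformer): validates the
# keyword arguments before they are handed to the model constructor.
def check_local_attn_kwargs(kwargs):
    # Assumption: n_local_attn_heads is a plain int in this sketch.
    n_local = kwargs.get("n_local_attn_heads", 0)
    if n_local > 0 and kwargs.get("local_attn_window_size") is None:
        raise ValueError(
            "n_local_attn_heads is set but local_attn_window_size is missing"
        )
    return kwargs


# Placeholder values, chosen only for illustration.
model_kwargs = dict(
    n_local_attn_heads=4,
    local_attn_window_size=256,
)

# Passes, because both parameters are supplied together.
check_local_attn_kwargs(model_kwargs)
```

With a check like this, leaving out local_attn_window_size fails early with a clear message instead of erroring inside the model.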

lucidrains commented 3 years ago

@guokr233 oops! it's been fixed, sorry about that!