Closed leejason closed 5 years ago
If a smaller model is preferred for easier experiments and faster iterations, what model sizes would you recommend? Is the following the only place to adjust? Thank you for the great work and for shedding more light on this.
class Encoder(torch.nn.Module):
    def __init__(self, num_layers=48, d_model_size=1280, num_heads=16, dff=8192, input_vocab_size=50000, rate=0.1, **kwargs)
Yeah, I think the num_layers is the only thing that needs to change. We also released a 36-layer version of the model that's in the same GCS bucket.
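To get a rough feel for how much smaller the model becomes as num_layers drops, here is a minimal sketch that estimates parameter counts from the defaults in the signature quoted above. It assumes a standard transformer layer layout (four attention projections with biases, a two-layer feed-forward block, two layer norms) and a single embedding matrix shared with the output layer. The helper names `per_layer_params` and `estimate_params` are hypothetical and not part of the repository, so treat the numbers as approximations rather than exact checkpoint sizes.

```python
def per_layer_params(d_model_size=1280, dff=8192):
    """Approximate parameter count of one layer (assumed standard layout)."""
    # Self-attention: query, key, value and output projections, each with a bias.
    attention = 4 * (d_model_size * d_model_size + d_model_size)
    # Feed-forward block: d_model_size -> dff -> d_model_size, each with a bias.
    feed_forward = (d_model_size * dff + dff) + (dff * d_model_size + d_model_size)
    # Two layer norms per layer, each with a scale and a shift vector.
    layer_norms = 2 * 2 * d_model_size
    return attention + feed_forward + layer_norms


def estimate_params(num_layers=48, d_model_size=1280, dff=8192,
                    input_vocab_size=50000):
    """Approximate total parameter count (assumes one shared embedding matrix)."""
    embedding = input_vocab_size * d_model_size
    final_norm = 2 * d_model_size  # assumed final layer norm
    return embedding + num_layers * per_layer_params(d_model_size, dff) + final_norm


if __name__ == "__main__":
    # Compare a few candidate depths; only num_layers changes between runs.
    for n in (48, 36, 24, 12):
        millions = estimate_params(num_layers=n) / 1e6
        print(f"num_layers={n}: ~{millions:.0f}M parameters")
```

Under these assumptions num_heads does not affect the count, and the layer stack dominates the total, so the model size shrinks nearly in proportion to num_layers.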
Closing for now, reopen as necessary.