Elsaam2y / DINet_optimized

An optimized pipeline for DINet, reducing inference latency by up to 60% 🚀. Kudos to the authors of the original repo for this amazing work.

Can you tell why 4 fully connected layers are used to output features? #6

Closed: pgyilun closed this issue 9 months ago

pgyilun commented 10 months ago

    import torch
    import torch.nn as nn

    class _Wav2vecDS(nn.Module):
        """Four-layer MLP that maps features of size input_dim back to
        size input_dim through hidden layers of size hidden_dim."""

        def __init__(self, input_dim, hidden_dim):
            super(_Wav2vecDS, self).__init__()
            # Project up to hidden_dim, pass through two hidden layers,
            # then project back down to input_dim.
            self.fc1 = nn.Linear(input_dim, hidden_dim)
            self.fc2 = nn.Linear(hidden_dim, hidden_dim)
            self.fc3 = nn.Linear(hidden_dim, hidden_dim)
            self.fc4 = nn.Linear(hidden_dim, input_dim)

        def forward(self, x):
            # ReLU after the first three layers; the final layer is linear,
            # so the output is not constrained to be non-negative.
            x = torch.relu(self.fc1(x))
            x = torch.relu(self.fc2(x))
            x = torch.relu(self.fc3(x))
            x = self.fc4(x)
            return x
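For reference, a quick shape check of the module above (the dimensions here are placeholders for illustration, not values taken from the repo):

    model = _Wav2vecDS(input_dim=29, hidden_dim=128)
    features = torch.randn(8, 29)  # batch of 8 feature vectors (placeholder sizes)
    out = model(features)
    print(out.shape)               # torch.Size([8, 29]): input_dim is preserved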
Elsaam2y commented 10 months ago

This is purely experimental. You can try changing the architecture to make it more complex, or even simpler, and monitor the losses. I tried a few architectures, and since the problem here is not very complex, this one worked fine for me. But feel free to open a PR if you have any other ideas. I will share the training script for comparison.
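For anyone who wants to experiment along these lines, a minimal sketch of a simpler two-layer variant (hypothetical: the class name and structure below are illustrative, not from the repo) could look like:

    class _Wav2vecDSSimple(nn.Module):
        """Hypothetical two-layer baseline for comparing losses
        against the four-layer version above."""

        def __init__(self, input_dim, hidden_dim):
            super(_Wav2vecDSSimple, self).__init__()
            self.fc1 = nn.Linear(input_dim, hidden_dim)
            self.fc2 = nn.Linear(hidden_dim, input_dim)

        def forward(self, x):
            x = torch.relu(self.fc1(x))
            return self.fc2(x)  # linear output, same shape as the input

Training both variants on the same data and comparing validation losses would show whether the extra depth actually helps for this mapping task.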