Open josh-segal opened 2 months ago
Initially training with hypothesis and premise sentences concatenated. Experimenting with separate embeddings currently
Splitting sequences to separate embeddings is a model architecture design choice, not hyperparameter tuning