A JAX implementation of large language models. You can train a GPT-2-like model on 青空文庫 (the aozora-bunko-clean dataset) or any other text dataset.
Issue #8 (open, by speed1313): Check if the token embedding indicates the nature of word embeddings
Calculate inner products between token embeddings to check whether related tokens get high similarity scores.
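A minimal sketch of that check, not the repository's actual API: it assumes you have loaded the trained token-embedding matrix (here a random placeholder of hypothetical shape `(vocab_size, d_model)`) and computes inner products or cosine similarities against a query token id, returning its nearest neighbours.

```python
import jax
import jax.numpy as jnp

vocab_size, d_model = 1000, 64  # hypothetical sizes; use your model's config
key = jax.random.PRNGKey(0)
# Placeholder: in practice, load the trained token-embedding weights here.
embeddings = jax.random.normal(key, (vocab_size, d_model))

def nearest_tokens(embeddings, token_id, k=5, cosine=True):
    """Return the k token ids most similar to `token_id` under inner product
    (or cosine similarity when `cosine=True`)."""
    e = embeddings
    if cosine:
        e = e / jnp.linalg.norm(e, axis=-1, keepdims=True)
    scores = e @ e[token_id]                    # similarity of every token to the query
    scores = scores.at[token_id].set(-jnp.inf)  # exclude the query token itself
    top = jnp.argsort(-scores)[:k]
    return top, scores[top]

ids, sims = nearest_tokens(embeddings, token_id=42, k=5)
print(ids, sims)
```

With a trained embedding matrix, mapping the returned ids back through the tokenizer should show whether semantically related tokens cluster together, which is what the issue asks to verify.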