JAX implementation of Large Language Models. You can train a GPT-2-like model on 青空文庫 (Aozora Bunko; the aozora bunko-clean dataset) or any other text dataset.
Check whether the token embeddings show word-embedding properties #8
Open
speed1313 opened 4 months ago
Calculate inner products between tokens
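One possible way to approach this, sketched under assumptions: the token embedding table is taken to be a 2-D array of shape `(vocab_size, d_model)`, and `wte` below is a random stand-in for it, not the repository's actual parameter name. The helper names (`pairwise_inner_products`, `pairwise_cosine_similarity`, `nearest_tokens`) are hypothetical. Cosine similarity is included next to the raw inner product because row norms can otherwise dominate the comparison.

```python
import jax
import jax.numpy as jnp


def pairwise_inner_products(embedding: jax.Array) -> jax.Array:
    """Return the (vocab, vocab) matrix of raw inner products between rows."""
    return embedding @ embedding.T


def pairwise_cosine_similarity(embedding: jax.Array, eps: float = 1e-8) -> jax.Array:
    """Inner products after scaling every row to unit length."""
    norms = jnp.linalg.norm(embedding, axis=-1, keepdims=True)
    unit = embedding / jnp.maximum(norms, eps)  # guard against zero rows
    return unit @ unit.T


def nearest_tokens(similarity: jax.Array, token_id: int, k: int = 5) -> jax.Array:
    """Ids of the k tokens most similar to token_id, excluding the token itself."""
    row = similarity[token_id].at[token_id].set(-jnp.inf)
    _, ids = jax.lax.top_k(row, k)
    return ids


# Hypothetical stand-in for the trained embedding table (vocab_size=16, d_model=8).
# In practice this would be read from the trained model's parameters.
key = jax.random.PRNGKey(0)
wte = jax.random.normal(key, (16, 8))

gram = pairwise_inner_products(wte)
cos = pairwise_cosine_similarity(wte)
neighbors = nearest_tokens(cos, token_id=3, k=5)
```

If the embeddings do capture word-level meaning, decoding `neighbors` back to text with the tokenizer should give tokens related to the query token. With a random table like the one above, the neighbors carry no meaning and only show the mechanics.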