nlp-research / bilm-tf


System 3 performance experiment #5

Closed bart2001 closed 6 years ago

bart2001 commented 6 years ago

Model name: sejong_unroll_steps_40

options.json

{
 "all_clip_norm_val": 10.0,
 "batch_size": 128,
 "bidirectional": true,
 "char_cnn": {
  "activation": "relu",
  "embedding": {
   "dim": 16
  },
  "filters": [
   [
    1,
    32
   ],
   [
    2,
    32
   ],
   [
    3,
    64
   ],
   [
    4,
    128
   ]
  ],
  "max_characters_per_token": 4,
  "n_characters": 261,
  "n_highway": 2
 },
 "dropout": 0.1,
 "lstm": {
  "cell_clip": 3,
  "dim": 4096,
  "n_layers": 2,
  "proj_clip": 3,
  "projection_dim": 256,
  "use_skip_connections": true
 },
 "n_epochs": 10,
 "n_negative_samples_batch": 44,
 "n_tokens_vocab": 4488,
 "n_train_tokens": 32119740,
 "unroll_steps": 40
}
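For reference, a minimal sketch of how an options.json like the one above is fed into bilm-tf's training entry point, in the style of bin/train_elmo.py. The vocabulary file, training-data prefix, and save directory below are placeholders, not paths from this experiment.

import json

from bilm.training import train, load_vocab
from bilm.data import BidirectionalLMDataset

# Placeholder paths -- the actual Sejong corpus locations are not given in this issue.
vocab_file = 'vocab.txt'
train_prefix = 'train/*'
save_dir = 'checkpoint/sejong_unroll_steps_40'

# Load the options used for this experiment.
with open('options.json') as f:
    options = json.load(f)

# The vocabulary must be loaded with the same max_characters_per_token (4 here).
vocab = load_vocab(vocab_file, options['char_cnn']['max_characters_per_token'])

# Stream the training shards as a bidirectional LM dataset.
data = BidirectionalLMDataset(train_prefix, vocab, test=False, shuffle_on_load=True)

# Train on a single GPU; checkpoints and TensorBoard logs go to save_dir.
train(options, data, n_gpus=1, tf_save_dir=save_dir, tf_log_dir=save_dir)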
bart2001 commented 6 years ago

Final performance

Batch 20900, train_perplexity=18.38961
Total time: 12808.957140684128
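As a sanity check on the reported numbers: assuming train_perplexity is the exponential of the mean per-token negative log-likelihood (the usual definition), a perplexity of about 18.39 corresponds to roughly 2.91 nats per token, and the total time (presumably seconds) is about 3.6 hours. A minimal conversion sketch:

import math

# Reported final training perplexity for sejong_unroll_steps_40.
train_perplexity = 18.38961

# Assuming perplexity = exp(mean negative log-likelihood per token):
loss_nats = math.log(train_perplexity)   # ~2.91 nats/token
loss_bits = loss_nats / math.log(2)      # ~4.20 bits/token

print(f"loss ~ {loss_nats:.3f} nats/token ({loss_bits:.2f} bits/token)")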