flairNLP / flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)
https://flairnlp.github.io/flair/

Loss decreasing but the metrics are not changing #2743

Closed myeghaneh closed 1 year ago

myeghaneh commented 2 years ago

I would be thankful if you could help me with my questions.

I have a sequence tagger, as follows:

from sklearn.model_selection import KFold
from flair.data import Corpus
from flair.datasets import ColumnCorpus
from flair.embeddings import WordEmbeddings, OneHotEmbeddings, StackedEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

data_folder = '../data/'
columns = {0: 'text', 1: 'pos', 2: 'BIO'}
n = 3
kf = KFold(n_splits=n, random_state=1, shuffle=True)
results = []
for train_index, val_index in kf.split(DATA):
    train_df = DATA.iloc[train_index]
    test_df = DATA.iloc[val_index]

    # write each fold to disk in the three-column format (token POS BIO),
    # one token per line, sentences separated by a blank line
    with open('../data/trainCMTPersianF01.txt', 'w', encoding="utf-8-sig") as f:
        for sentence in train_df:
            for tpl in sentence:
                f.write('{} {} {}\n'.format(tpl[0], tpl[1], tpl[2]))
            f.write('\n')
    with open('../data/testCMTPersianF01.txt', 'w', encoding="utf-8-sig") as f:
        for sentence in test_df:
            for tpl in sentence:
                f.write('{} {} {}\n'.format(tpl[0], tpl[1], tpl[2]))
            f.write('\n')

    corpus: Corpus = ColumnCorpus(data_folder, columns,
                                  train_file='trainCMTPersianF01.txt',
                                  test_file='testCMTPersianF01.txt')
    print(corpus)

    tag_type = 'BIO'
    tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)
    print(tag_dictionary)

    farsi_embedding = WordEmbeddings('fa-crawl')
    onehot_embeddings = OneHotEmbeddings(corpus, field="pos")

    tagger: SequenceTagger = SequenceTagger(
        hidden_size=512,
        embeddings=StackedEmbeddings([farsi_embedding, onehot_embeddings]),
        tag_dictionary=tag_dictionary,
        tag_type=tag_type,
    )

    trainer: ModelTrainer = ModelTrainer(tagger, corpus)
    trainer.train('resources/taggers/fa-crawl_lr_.1',
                  embeddings_storage_mode='none',
                  learning_rate=0.15,
                  mini_batch_size=32,
                  max_epochs=10)
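(Side note on the loop: the results list is never filled. A minimal sketch for collecting per-fold scores, under the assumption that this flair version's trainer.train returns a dict containing a test_score entry:)

fold_result = trainer.train('resources/taggers/fa-crawl_lr_.1',
                            embeddings_storage_mode='none',
                            learning_rate=0.15,
                            mini_batch_size=32,
                            max_epochs=10)
# assumption: train() returns a dict like {'test_score': ...} in this version
results.append(fold_result['test_score'])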

It is supposed to use the (token, POS, BIO tag) columns to recognize P spans versus C spans in this kind of data (which is in Farsi):


بله؛ N B-P
تفکیک N I-P
درست ADV I-P
و CON I-P
همیشه ADV I-P
زباله N I-P
هایمان N I-P
آزار N I-P
دهنده N I-P
و CON I-P
دست N I-P
و CON I-P
پا N I-P
گیر V_PR I-P
است. DELM I-P
سه N B-P
کیسه N I-P
....

با PO B-C
این DET I-C
وجود N I-C
همه PRO I-C
باید V_PR I-C
به PO I-C
طور N I-C
مساوی ADJ I-C
برای PO I-C
برنامه N I-C
های N I-C
بخش N I-C
عمومی N I-C
مشارکت N I-C
کنیم، ADJ I-C
وبرای PO B-P
این DET I-P
ما PRO I-P
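As a quick sanity check that flair actually parses this BIO column into P and C spans, here is a small sketch (using the corpus object built in the loop above):

# inspect the spans flair recovered from the 'BIO' column of the first
# training sentence; each span should be one P or C segment of the text
sentence = corpus.train[0]
for span in sentence.get_spans('BIO'):
    print(span)

If this prints nothing, the tag column is not being read as spans, which by itself would explain a zero span F1.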

I used the same approach in English and it was working. Here I tried different hyperparameters (learning rates) and embeddings (fa or fa-crawl), but the metric results are always zero:

- F1-score (micro) 0.0000
- F1-score (macro) 0.0000

By class:
C          tp: 0 - fp: 0 - fn: 13 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
P          tp: 0 - fp: 13 - fn: 52 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
2022-04-26 09:01:38,532 -----
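(For reference, these scores are computed over predicted spans: precision = tp / (tp + fp) and recall = tp / (tp + fn). With tp = 0 for both classes, precision, recall, and hence F1 are all 0, which is why the span metrics can stay at zero even while the token-level loss keeps dropping.)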

However, I did manage to get the loss to decrease; you can see some of the epochs (first 5):

2022-04-26 09:51:26,340 Reading data from ..\data
2022-04-26 09:51:26,340 Train: ..\data\trainCMTPersianVV03.txt
2022-04-26 09:51:26,341 Dev: None
2022-04-26 09:51:26,341 Test: ..\data\testCMTPersianVV03.txt
Corpus: 23 train + 3 dev + 14 test sentences
Dictionary with 8 tags: <unk>, O, B-P, I-P, B-C, I-C, <START>, <STOP>
[b'<unk>', b'N', b'PO', b'ADJ', b'CON', b'V_PR', b'ADV', b'PRO', b'DELM', b'DET', b'AR', b'V_PA', b'INT', b'MORP']
vocabulary size of 14
2022-04-26 09:51:27,428 ----------------------------------------------------------------------------------------------------
2022-04-26 09:51:27,429 Model: "SequenceTagger(
  (embeddings): StackedEmbeddings(
    (list_embedding_0): WordEmbeddings('fa-crawl')
    (list_embedding_1): OneHotEmbeddings(
      min_freq=3
      (embedding_layer): Embedding(14, 300)
    )
  )
  (word_dropout): WordDropout(p=0.05)
  (locked_dropout): LockedDropout(p=0.5)
  (embedding2nn): Linear(in_features=600, out_features=600, bias=True)
  (rnn): LSTM(600, 256, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=512, out_features=8, bias=True)
  (beta): 1.0
  (weights): None
  (weight_tensor) None
)"
2022-04-26 09:51:27,430 ----------------------------------------------------------------------------------------------------
2022-04-26 09:51:27,431 Corpus: "Corpus: 23 train + 3 dev + 14 test sentences"
2022-04-26 09:51:27,431 ----------------------------------------------------------------------------------------------------
2022-04-26 09:51:27,432 Parameters:
2022-04-26 09:51:27,433  - learning_rate: "0.1"
2022-04-26 09:51:27,434  - mini_batch_size: "32"
2022-04-26 09:51:27,434  - patience: "3"
2022-04-26 09:51:27,435  - anneal_factor: "0.5"
2022-04-26 09:51:27,436  - max_epochs: "150"
2022-04-26 09:51:27,436  - shuffle: "True"
2022-04-26 09:51:27,437  - train_with_dev: "False"
2022-04-26 09:51:27,438  - batch_growth_annealing: "False"
2022-04-26 09:51:27,438 ----------------------------------------------------------------------------------------------------
2022-04-26 09:51:27,438 Model training base path: "resources\taggers\example-ParsiV02lr02"
2022-04-26 09:51:27,439 ----------------------------------------------------------------------------------------------------
2022-04-26 09:51:27,440 Device: cpu
2022-04-26 09:51:27,440 ----------------------------------------------------------------------------------------------------
2022-04-26 09:51:27,441 Embeddings storage mode: none
2022-04-26 09:51:27,444 ----------------------------------------------------------------------------------------------------
2022-04-26 09:51:31,090 epoch 1 - iter 1/1 - loss 112.33976746 - samples/sec: 8.78 - lr: 0.100000
2022-04-26 09:51:31,090 ----------------------------------------------------------------------------------------------------
2022-04-26 09:51:31,090 EPOCH 1 done: loss 112.3398 - lr 0.1000000
2022-04-26 09:51:31,134 DEV : loss 87.24784088134766 - score 0.0
2022-04-26 09:51:31,134 BAD EPOCHS (no improvement): 0
saving best model
2022-04-26 09:51:48,371 ----------------------------------------------------------------------------------------------------
2022-04-26 09:51:51,906 epoch 2 - iter 1/1 - loss 85.84092712 - samples/sec: 9.06 - lr: 0.100000
2022-04-26 09:51:51,907 ----------------------------------------------------------------------------------------------------
2022-04-26 09:51:51,908 EPOCH 2 done: loss 85.8409 - lr 0.1000000
2022-04-26 09:51:51,950 DEV : loss 76.52790069580078 - score 0.0
2022-04-26 09:51:51,951 BAD EPOCHS (no improvement): 0
saving best model
2022-04-26 09:52:07,066 ----------------------------------------------------------------------------------------------------
2022-04-26 09:52:10,698 epoch 3 - iter 1/1 - loss 74.69778442 - samples/sec: 8.82 - lr: 0.100000
2022-04-26 09:52:10,699 ----------------------------------------------------------------------------------------------------
2022-04-26 09:52:10,700 EPOCH 3 done: loss 74.6978 - lr 0.1000000
2022-04-26 09:52:10,751 DEV : loss 61.97947311401367 - score 0.0
2022-04-26 09:52:10,753 BAD EPOCHS (no improvement): 0
saving best model
2022-04-26 09:52:23,445 ----------------------------------------------------------------------------------------------------
2022-04-26 09:52:27,055 epoch 4 - iter 1/1 - loss 64.92542267 - samples/sec: 8.87 - lr: 0.100000
2022-04-26 09:52:27,055 ----------------------------------------------------------------------------------------------------
2022-04-26 09:52:27,056 EPOCH 4 done: loss 64.9254 - lr 0.1000000
2022-04-26 09:52:27,096 DEV : loss 57.63637161254883 - score 0.0
2022-04-26 09:52:27,097 BAD EPOCHS (no improvement): 0
saving best model
2022-04-26 09:52:41,785 ----------------------------------------------------------------------------------------------------
2022-04-26 09:52:45,454 epoch 5 - iter 1/1 - loss 58.10253525 - samples/sec: 8.72 - lr: 0.100000
2022-04-26 09:52:45,455 ----------------------------------------------------------------------------------------------------
2022-04-26 09:52:45,455 EPOCH 5 done: loss 58.1025 - lr 0.1000000
2022-04-26 09:52:45,496 DEV : loss 46.229339599609375 - score 0.0
2022-04-26 09:52:45,496 BAD EPOCHS (no improvement): 0
saving best model
2022-04-26 09:53:05,509 ------------------------------------------------------

or

2022-04-26 11:13:59,745 epoch 75 - iter 1/1 - loss 32.36521530 - samples/sec: 9.14 - lr: 0.100000
2022-04-26 11:13:59,745 ----------------------------------------------------------------------------------------------------
2022-04-26 11:13:59,746 EPOCH 75 done: loss 32.3652 - lr 0.1000000
2022-04-26 11:13:59,787 DEV : loss 19.553163528442383 - score 0.0
2022-04-26 11:13:59,788 BAD EPOCHS (no improvement): 0
saving best model
2022-04-26 11:14:16,013 ----------------------------------------------------------------------------------------------------
2022-04-26 11:14:19,419 epoch 76 - iter 1/1 - loss 22.18959618 - samples/sec: 9.40 - lr: 0.100000
2022-04-26 11:14:19,420 ----------------------------------------------------------------------------------------------------
2022-04-26 11:14:19,420 EPOCH 76 done: loss 22.1896 - lr 0.1000000
2022-04-26 11:14:19,465 DEV : loss 32.55488204956055 - score 0.0
2022-04-26 11:14:19,466 BAD EPOCHS (no improvement): 1
2022-04-26 11:14:19,467 ----------------------------------------------------------------------------------------------------
2022-04-26 11:14:22,622 epoch 77 - iter 1/1 - loss 32.34597397 - samples/sec: 10.15 - lr: 0.100000
2022-04-26 11:14:22,623 ----------------------------------------------------------------------------------------------------
2022-04-26 11:14:22,624 EPOCH 77 done: loss 32.3460 - lr 0.1000000
2022-04-26 11:14:22,661 DEV : loss 19.384613037109375 - score 0.0
2022-04-26 11:14:22,662 BAD EPOCHS (no improvement): 0
saving best model
2022-04-26 11:14:57,656 ----------------------------------------------------------------------------------------------------
2022-04-26 11:15:00,994 epoch 78 - iter 1/1 - loss 22.02996635 - samples/sec: 9.59 - lr: 0.100000
2022-04-26 11:15:00,995 ----------------------------------------------------------------------------------------------------
2022-04-26 11:15:00,995 EPOCH 78 done: loss 22.0300 - lr 0.1000000
2022-04-26 11:15:01,046 DEV : loss 32.37981033325195 - score 0.0
2022-04-26 11:15:01,046 BAD EPOCHS (no improvement): 1
2022-04-26 11:15:01,048 ----------------------------------------------------------------------------------------------------
2022-04-26 11:15:04,331 epoch 79 - iter 1/1 - loss 31.91044426 - samples/sec: 9.75 - lr: 0.100000
2022-04-26 11:15:04,332 ----------------------------------------------------------------------------------------------------
2022-04-26 11:15:04,332 EPOCH 79 done: loss 31.9104 - lr 0.1000000
2022-04-26 11:15:04,374 DEV : loss 19.234619140625 - score 0.0
2022-04-26 11:15:04,375 BAD EPOCHS (no improvement): 0
saving best model

I have the feeling something is wrong (my guess is an inconsistency between the POS tags and the embedding, or in the BIO tags). What would you suggest to solve it? On the other hand, this model works for English (with cross-validation); my Persian corpus is half the size of the English one, so might that be the reason? Even if this is not the best model, I think it should at least give some non-zero metric. I appreciate your support.
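One way to look into this (a sketch; the checkpoint path is taken from the training log above): load the saved best model and print its raw predictions. If every token comes out as O, the span evaluation reports tp = 0 and a zero F1 no matter how far the loss has fallen.

from flair.models import SequenceTagger

# load the best checkpoint written during training (path from the log above)
model = SequenceTagger.load('resources/taggers/example-ParsiV02lr02/best-model.pt')

# predict on one held-out sentence and print the predicted tags
sentence = corpus.test[0]
model.predict(sentence)
print(sentence.to_tagged_string())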

myeghaneh commented 2 years ago

An update: by changing some hyperparameters (increasing the hidden size), I was able to move the numbers away from zero, but the F1 score is still not good. Any suggestion would be very welcome.

Results:
- F1-score (micro) 0.0741
- F1-score (macro) 0.0545

By class:
C          tp: 0 - fp: 13 - fn: 13 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
P          tp: 3 - fp: 0 - fn: 49 - precision: 1.0000 - recall: 0.0577 - f1-score: 0.1091
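(Reading these numbers: all 13 predicted C spans are wrong (tp = 0, fp = 13), and only 3 of the 52 gold P spans are found (recall 3/52 ≈ 0.0577), so the model has started predicting spans but still barely matches the gold annotation.)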

Here is my model:

data_folder = '../data/'
columns = {0: 'text', 1: 'pos', 2: 'BIO'}
n = 3
kf = KFold(n_splits=n, random_state=1, shuffle=True)
results = []
for train_index, val_index in kf.split(DATA):
    train_df = DATA.iloc[train_index]
    test_df = DATA.iloc[val_index]
    with open('../data/trainCMTPersianF01.txt', 'w', encoding="utf-8-sig") as f:
        for sentence in train_df:
            for tpl in sentence:
                f.write('{} {} {}\n'.format(tpl[0], tpl[1], tpl[2]))
            f.write('\n')
    with open('../data/testCMTPersianF01.txt', 'w', encoding="utf-8-sig") as f:
        for sentence in test_df:
            for tpl in sentence:
                f.write('{} {} {}\n'.format(tpl[0], tpl[1], tpl[2]))
            f.write('\n')

    corpus: Corpus = ColumnCorpus(data_folder, columns,
                                  train_file='trainCMTPersianF01.txt',
                                  test_file='testCMTPersianF01.txt')
    print(corpus)

    tag_type = 'BIO'
    tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)
    print(tag_dictionary)

    farsi_embedding = WordEmbeddings('fa-crawl')
    onehot_embeddings = OneHotEmbeddings(corpus, field="pos")

    tagger: SequenceTagger = SequenceTagger(
        hidden_size=1024,
        embeddings=StackedEmbeddings([farsi_embedding, onehot_embeddings]),
        tag_dictionary=tag_dictionary,
        tag_type=tag_type,
    )

    trainer: ModelTrainer = ModelTrainer(tagger, corpus)

    # 7. start training
    trainer.train('resources/taggers/fa-crawl_lr_.1',
                  embeddings_storage_mode='none',
                  learning_rate=0.1,
                  mini_batch_size=32,
                  max_epochs=300)

Here are the first epochs:


  )
  (word_dropout): WordDropout(p=0.05)
  (locked_dropout): LockedDropout(p=0.5)
  (embedding2nn): Linear(in_features=600, out_features=600, bias=True)
  (rnn): LSTM(600, 1024, batch_first=True, bidirectional=True)
  (linear): Linear(in_features=2048, out_features=8, bias=True)
  (beta): 1.0
  (weights): None
  (weight_tensor) None
)"
2022-04-28 08:40:36,130 ----------------------------------------------------------------------------------------------------
2022-04-28 08:40:36,130 Corpus: "Corpus: 23 train + 3 dev + 14 test sentences"
2022-04-28 08:40:36,131 ----------------------------------------------------------------------------------------------------
2022-04-28 08:40:36,132 Parameters:
2022-04-28 08:40:36,133  - learning_rate: "0.1"
2022-04-28 08:40:36,133  - mini_batch_size: "32"
2022-04-28 08:40:36,134  - patience: "3"
2022-04-28 08:40:36,134  - anneal_factor: "0.5"
2022-04-28 08:40:36,135  - max_epochs: "300"
2022-04-28 08:40:36,136  - shuffle: "True"
2022-04-28 08:40:36,137  - train_with_dev: "False"
2022-04-28 08:40:36,137  - batch_growth_annealing: "False"
2022-04-28 08:40:36,138 ----------------------------------------------------------------------------------------------------
2022-04-28 08:40:36,138 Model training base path: "resources\taggers\fa-crawl_lr_.1"
2022-04-28 08:40:36,139 ----------------------------------------------------------------------------------------------------
2022-04-28 08:40:36,140 Device: cpu
2022-04-28 08:40:36,141 ----------------------------------------------------------------------------------------------------
2022-04-28 08:40:36,142 Embeddings storage mode: none
2022-04-28 08:40:36,145 ----------------------------------------------------------------------------------------------------
2022-04-28 08:40:45,352 epoch 1 - iter 1/1 - loss 200.49084473 - samples/sec: 3.48 - lr: 0.100000
2022-04-28 08:40:45,352 ----------------------------------------------------------------------------------------------------
2022-04-28 08:40:45,353 EPOCH 1 done: loss 200.4908 - lr 0.1000000
2022-04-28 08:40:45,672 DEV : loss 154.44911193847656 - score 0.0
2022-04-28 08:40:45,673 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 08:40:55,980 ----------------------------------------------------------------------------------------------------
2022-04-28 08:41:04,973 epoch 2 - iter 1/1 - loss 152.66317749 - samples/sec: 3.56 - lr: 0.100000
2022-04-28 08:41:04,974 ----------------------------------------------------------------------------------------------------
2022-04-28 08:41:04,974 EPOCH 2 done: loss 152.6632 - lr 0.1000000
2022-04-28 08:41:05,238 DEV : loss 114.63671875 - score 0.0
2022-04-28 08:41:05,239 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 08:41:15,191 ----------------------------------------------------------------------------------------------------
2022-04-28 08:41:24,484 epoch 3 - iter 1/1 - loss 110.99184418 - samples/sec: 3.44 - lr: 0.100000
2022-04-28 08:41:24,485 ----------------------------------------------------------------------------------------------------
2022-04-28 08:41:24,486 EPOCH 3 done: loss 110.9918 - lr 0.1000000
2022-04-28 08:41:24,760 DEV : loss 84.06620025634766 - score 0.0
2022-04-28 08:41:24,760 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 08:41:39,668 ----------------------------------------------------------------------------------------------------
2022-04-28 08:41:49,653 epoch 4 - iter 1/1 - loss 78.02548218 - samples/sec: 3.21 - lr: 0.100000
2022-04-28 08:41:49,654 ----------------------------------------------------------------------------------------------------
2022-04-28 08:41:49,655 EPOCH 4 done: loss 78.0255 - lr 0.1000000
2022-04-28 08:41:49,924 DEV : loss 68.85929107666016 - score 0.0
2022-04-28 08:41:49,925 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 08:42:05,855 ----------------------------------------------------------------------------------------------------
2022-04-28 08:42:15,385 epoch 5 - iter 1/1 - loss 61.02209091 - samples/sec: 3.36 - lr: 0.100000
2022-04-28 08:42:15,386 ----------------------------------------------------------------------------------------------------
2022-04-28 08:42:15,387 EPOCH 5 done: loss 61.0221 - lr 0.1000000
2022-04-28 08:42:15,671 DEV : loss 55.140933990478516 - score 0.0
2022-04-28 08:42:15,672 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 08:42:41,768 ----------------------------------------------------------------------------------------------------
2022-04-28 08:42:51,304 epoch 6 - iter 1/1 - loss 50.04317474 - samples/sec: 3.36 - lr: 0.100000
2022-04-28 08:42:51,304 ----------------------------------------------------------------------------------------------------
2022-04-28 08:42:51,305 EPOCH 6 done: loss 50.0432 - lr 0.1000000
2022-04-28 08:42:51,592 DEV : loss 45.13371658325195 - score 0.0
2022-04-28 08:42:51,592 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 08:43:15,581 ----------------------------------------------------------------------------------------------------
2022-04-28 08:43:24,683 epoch 7 - iter 1/1 - loss 42.07927704 - samples/sec: 3.52 - lr: 0.100000
2022-04-28 08:43:24,683 ----------------------------------------------------------------------------------------------------
2022-04-28 08:43:24,684 EPOCH 7 done: loss 42.0793 - lr 0.1000000
2022-04-28 08:43:25,041 DEV : loss 48.79399490356445 - score 0.0
2022-04-28 08:43:25,041 BAD EPOCHS (no improvement): 1
2022-04-28 08:43:25,042 ----------------------------------------------------------------------------------------------------
2022-0

Here are the last epochs:

2022-04-28 12:46:51,893 EPOCH 296 done: loss 23.5334 - lr 0.1000000
2022-04-28 12:46:52,132 DEV : loss 15.599080085754395 - score 0.0
2022-04-28 12:46:52,133 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 12:47:04,179 ----------------------------------------------------------------------------------------------------
2022-04-28 12:47:12,810 epoch 297 - iter 1/1 - loss 14.92063141 - samples/sec: 3.71 - lr: 0.100000
2022-04-28 12:47:12,811 ----------------------------------------------------------------------------------------------------
2022-04-28 12:47:12,812 EPOCH 297 done: loss 14.9206 - lr 0.1000000
2022-04-28 12:47:13,055 DEV : loss 25.045541763305664 - score 0.0
2022-04-28 12:47:13,056 BAD EPOCHS (no improvement): 1
2022-04-28 12:47:13,057 ----------------------------------------------------------------------------------------------------
2022-04-28 12:47:21,433 epoch 298 - iter 1/1 - loss 23.38024330 - samples/sec: 3.82 - lr: 0.100000
2022-04-28 12:47:21,433 ----------------------------------------------------------------------------------------------------
2022-04-28 12:47:21,434 EPOCH 298 done: loss 23.3802 - lr 0.1000000
2022-04-28 12:47:21,670 DEV : loss 15.59068775177002 - score 0.0
2022-04-28 12:47:21,671 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 12:47:33,563 ----------------------------------------------------------------------------------------------------
2022-04-28 12:47:42,191 epoch 299 - iter 1/1 - loss 14.68082809 - samples/sec: 3.71 - lr: 0.100000
2022-04-28 12:47:42,192 ----------------------------------------------------------------------------------------------------
2022-04-28 12:47:42,192 EPOCH 299 done: loss 14.6808 - lr 0.1000000
2022-04-28 12:47:42,428 DEV : loss 24.841140747070312 - score 0.0
2022-04-28 12:47:42,428 BAD EPOCHS (no improvement): 1
2022-04-28 12:47:42,430 ----------------------------------------------------------------------------------------------------
2022-04-28 12:47:50,791 epoch 300 - iter 1/1 - loss 23.45882988 - samples/sec: 3.83 - lr: 0.100000
2022-04-28 12:47:50,791 ----------------------------------------------------------------------------------------------------
2022-04-28 12:47:50,792 EPOCH 300 done: loss 23.4588 - lr 0.1000000
2022-04-28 12:47:51,034 DEV : loss 15.518315315246582 - score 0.0
2022-04-28 12:47:51,034 BAD EPOCHS (no improvement): 0
saving best model
2022-04-28 12:48:24,854 ----------------------------------------------------------------------------------------------------
2022-04-28 12:48:24,854 Testing using best model ...
2022-04-28 12:48:24,855 loading file resources\taggers\fa-crawl_lr_.1\best-model.pt
2022-04-28 12:48:28,122 0.1875  0.0462  0.0741
2022-04-28 12:48:28,123 
Results:
- F1-score (micro) 0.0741
- F1-score (macro) 0.0545

By class:
C          tp: 0 - fp: 13 - fn: 13 - precision: 0.0000 - recall: 0.0000 - f1-score: 0.0000
P          tp: 3 - fp: 0 - fn: 49 - precision: 1.0000 - recall: 0.0577 - f1-score: 0.1091
2022-04-28 12:48:28,124 ----------------------------------------------------------------------------------------------------
myeghaneh commented 2 years ago

I could solve the issue by using these embeddings, increasing the hidden size, and so on:

from typing import List
from flair.embeddings import FlairEmbeddings, TokenEmbeddings

# note: onehot_embeddings is created here but not included in the stack below
onehot_embeddings = OneHotEmbeddings(corpus, field="pos")

embedding_types: List[TokenEmbeddings] = [
    WordEmbeddings('fa-crawl'),
    FlairEmbeddings('fa-forward'),
    FlairEmbeddings('fa-backward'),
]
embeddings: StackedEmbeddings = StackedEmbeddings(embeddings=embedding_types)

tagger: SequenceTagger = SequenceTagger(hidden_size=2048,
                                        embeddings=embeddings,
                                        tag_dictionary=tag_dictionary,
                                        tag_type=tag_type,
                                        use_crf=True)

trainer: ModelTrainer = ModelTrainer(tagger, corpus)

# 7. start training
trainer.train('resources/taggers/PersianSeqTaggerFACrawllr2',
              embeddings_storage_mode='none',
              learning_rate=0.2,
              mini_batch_size=32,
              max_epochs=150)
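(Two things plausibly make the difference here: the character-level FlairEmbeddings ('fa-forward'/'fa-backward') add sub-word information, which helps on a small Farsi corpus where many word forms never occur in training, and use_crf=True adds a CRF output layer that models tag transitions and keeps B-/I- sequences consistent.)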

Result from one of the cross-validation splits :)

- F1-score (micro) 0.6718
- F1-score (macro) 0.6347

By class:
C          - precision: 0.7200 - recall: 0.4865 - f1-score: 0.5806
P           - precision: 0.6706 - recall: 0.7081 - f1-score: 0.6888
2022-05-06 05:26:38,210 ------------------------------------------------------
alanakbik commented 2 years ago

Great! You should also try decreasing the mini-batch size - if I read your output correctly, you have only around 20 training examples and a mini-batch size of 32 set. This means that each epoch you only do one update step, which is very little. Try setting your mini-batch size to 2 or 4 to get more updates per epoch.
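For example, a sketch with everything else unchanged (the output path here is hypothetical):

# with ~23 training sentences, mini_batch_size=4 yields roughly 6 gradient
# updates per epoch instead of the single 'iter 1/1' step in the logs above
trainer.train('resources/taggers/fa-crawl_small_batch',  # hypothetical path
              embeddings_storage_mode='none',
              learning_rate=0.1,
              mini_batch_size=4,
              max_epochs=150)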

myeghaneh commented 2 years ago

Greetings from HU Berlin :) Sure, I'll do that. That was just for the beginning (to test); I now have

"Corpus: 67 train + 7 dev + 38 test sentences"

I was planning to do that. Thank you, Alan; looking forward to seeing you one day at HU.

alanakbik commented 2 years ago

Ah greetings back - did not realize you were at HU ;)

JohnXXDoe commented 2 years ago

Hey @myeghaneh, thanks for the follow-ups. Could you let me know what GPU you are using for your current training environment, or is it on your CPU?

myeghaneh commented 2 years ago

Hey @myeghaneh, thanks for the follow-ups. Could you let me know what GPU you are using for your current training environment, or is it on your CPU?

@JohnXXDoe Sure. Unfortunately, it is a CPU: Intel Core(TM) i7, 32 GB RAM.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.