Open getChan opened 5 years ago
sampling | Recall % | Accuracy % | F1-Score % |
---|---|---|---|
RandomOverSample | 40 | 93 | 37 |
RandomUnderSample | 98 | 80 | 33 |
SMOTE | 61 | 78 | 22 |
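The three strategies above balance the class ratio before training. As a reference, here is a minimal pure-NumPy sketch of random over-/under-sampling (SMOTE additionally interpolates synthetic minority samples and needs a library such as imbalanced-learn; the function name below is illustrative, not from this repo):

```python
import numpy as np

def resample_to_balance(X, y, mode="over", seed=42):
    """Randomly over- or under-sample to a 5:5 class ratio (binary labels 0/1)."""
    rng = np.random.default_rng(seed)
    idx0 = np.flatnonzero(y == 0)
    idx1 = np.flatnonzero(y == 1)
    minority, majority = (idx1, idx0) if len(idx1) < len(idx0) else (idx0, idx1)
    if mode == "over":
        # duplicate random minority samples until both classes match
        extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
        keep = np.concatenate([majority, minority, extra])
    else:
        # drop random majority samples down to the minority size
        keep = np.concatenate([minority, rng.choice(majority, size=len(minority), replace=False)])
    rng.shuffle(keep)
    return X[keep], y[keep]

# toy imbalanced data: 90 negatives, 10 positives
X = np.arange(100).reshape(-1, 1)
y = np.array([0] * 90 + [1] * 10)
Xb, yb = resample_to_balance(X, y, mode="over")
print(len(yb), yb.mean())  # → 180 0.5
```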
Layer | Recall % | Accuracy % | F1-Score % |
---|---|---|---|
oversample | |||
c32-c32-pool | 40 | 93 | 37 |
c64-c32-pool | 44 | 92 | 37 |
c64-c64-pool | 39 | 92 | 37 |
c64-c32-c32-pool | 41 | 93 | 39 |
c64-c32(×6)-pool | 39 | 94 | 39 |
undersample | |||
c64-c32-pool | 95 | 77 | 30 |
c64-c32-c32-pool | 95 | 77 | 29 |
c64-c32(×6)-pool | 78 | 87 | 37 |
Based on c64-c32(×6)-pool:
Epoch | Recall % | Accuracy % | F1-Score % |
---|---|---|---|
5 | 92 | 83 | 34 |
10 | 89 | 85 | 37 |
20 | 95 | 82 | 35 |
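These tables report recall, accuracy, and F1 but omit precision; since F1 is the harmonic mean 2PR/(P+R), precision can be recovered from the other two. A small sanity-check helper, using the epoch-5 row above as an example:

```python
def implied_precision(recall, f1_score):
    """Invert F1 = 2*P*R / (P + R) to recover precision P from recall and F1."""
    return f1_score * recall / (2 * recall - f1_score)

# epoch-5 row above: recall 92%, F1 34%  ->  precision ~21%
p = implied_precision(0.92, 0.34)
print(round(p, 3))  # → 0.209
```

The very low implied precision shows the model is flagging many clean sentences as profanity, which is consistent with training on aggressively rebalanced data.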
Epoch | Recall % | Accuracy % | F1-Score % |
---|---|---|---|
6 | 82 | 82 | 82 |
Data Augmentation using the Random Swap (RS) technique
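Random Swap (from the EDA paper by Wei & Zou) simply exchanges two random token positions a few times. A minimal sketch, assuming tokenized input is already available (the function name is illustrative, not from this repo):

```python
import random

def random_swap(tokens, n_swaps=2, seed=None):
    """EDA-style Random Swap: swap two random token positions n_swaps times."""
    rng = random.Random(seed)
    out = list(tokens)
    for _ in range(n_swaps):
        if len(out) < 2:
            break
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

sent = ["이", "영화", "정말", "재미", "있다"]
print(random_swap(sent, n_swaps=2, seed=0))
```

The swapped sentence keeps exactly the same tokens, only reordered, so the label is assumed to stay valid.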
```python
model_up = keras.Sequential([
    keras.layers.Embedding(vocab_size, 64, input_length=10),
    keras.layers.Conv1D(32, 3, padding="same", activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(32, 3, padding="same", activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(128, 3, padding="same", activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(128, 3, padding="same", activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(256, 3, padding="same", activation=tf.nn.relu),
    keras.layers.GlobalMaxPool1D(),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(64, activation=tf.nn.relu),
    keras.layers.Dense(2, activation=tf.nn.sigmoid),
])
model_up.summary()
```
```python
model_up = keras.Sequential([
    keras.layers.Embedding(vocab_size, 64, input_length=10),
    keras.layers.Conv1D(32, 5, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Conv1D(32, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Conv1D(128, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Conv1D(128, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Conv1D(256, 3, padding="same", activation=tf.nn.relu),
    keras.layers.GlobalMaxPool1D(),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(64, activation=tf.nn.relu),
    keras.layers.Dense(2, activation=tf.nn.sigmoid),
])
model_up.summary()
```
```python
model_up = keras.Sequential([
    keras.layers.Embedding(vocab_size, 64, input_length=10),
    keras.layers.Conv1D(32, 5, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(32, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(128, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(128, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(256, 3, padding="same", activation=tf.nn.relu),
    keras.layers.GlobalMaxPool1D(),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(64, activation=tf.nn.relu),
    keras.layers.Dense(2, activation=tf.nn.sigmoid),
])
model_up.summary()
```
```python
model_up = keras.Sequential([
    keras.layers.Embedding(vocab_size, 64, input_length=10),
    keras.layers.Conv1D(64, 5, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=5, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(64, 5, padding="same", activation=tf.nn.relu),
    keras.layers.GlobalMaxPool1D(),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(64, activation=tf.nn.relu),
    keras.layers.Dense(2, activation=tf.nn.sigmoid),
])
model_up.summary()
```
```python
model_up = keras.Sequential([
    keras.layers.Embedding(vocab_size, 64, input_length=10),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(128, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(128, 3, padding="same", activation=tf.nn.relu),
    keras.layers.GlobalMaxPool1D(),
    keras.layers.Dropout(0.5),
    keras.layers.Flatten(),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(2, activation=tf.nn.sigmoid),
])
model_up.summary()
```
```python
model_up = keras.Sequential([
    keras.layers.Embedding(vocab_size, 16, input_length=10),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(128, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(128, 3, padding="same", activation=tf.nn.relu),
    keras.layers.MaxPool1D(pool_size=2, padding="same"),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(256, 3, padding="same", activation=tf.nn.relu),
    keras.layers.GlobalMaxPool1D(),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(2, activation=tf.nn.sigmoid),
])
model_up.summary()
```
Epoch | Recall % | Accuracy % | F1-Score % |
---|---|---|---|
5 | 34 | 93 | 33 |
5 | 35 | 93 | 31 |
5 | 38 | 93 | 35 |
5 | 37 | 93 | 34 |
5 | 41 | 92 | 32 |
5 | 35 | 94 | 35 |
Data Augmentation using the Random Swap (RS) technique
Per model:
train:test 5:5 split -> train (9:1), test (9:1) -> oversample train to 5:5 -> augmentation on train only
```python
model_up = keras.Sequential([
    keras.layers.Embedding(vocab_size, 64, input_length=10),
    keras.layers.Conv1D(32, 3, padding="same", activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(32, 3, padding="same", activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(64, 3, padding="same", activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(128, 3, padding="same", activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(128, 3, padding="same", activation=tf.nn.relu),
    keras.layers.Dropout(0.5),
    keras.layers.Conv1D(256, 3, padding="same", activation=tf.nn.relu),
    keras.layers.GlobalMaxPool1D(),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(64, activation=tf.nn.relu),
    keras.layers.Dense(2, activation=tf.nn.sigmoid),
])
model_up.summary()
```
Epoch | Recall % | Accuracy % | F1-Score % |
---|---|---|---|
6 | 34 | 93 | 33 |
Please also run the experiment with the test data undersampled to 5:5! Teacher Kigon says that is a valid test too.
With the same RS-augmented setup, the updated results:

Epoch | Recall % | Accuracy % | F1-Score % | Note |
---|---|---|---|---|
6 | 34 | 93 | 33 | |
6 | 34 | 93 | 35 | same model, different data (augmentation helps) |
5 | 37 | 92 | 34 | augmentation + relabeling |
3 | 38 | 92 | 37 | augmentation + relabeling |
2 | 38 | 94 | 38 | augmentation + relabeling |
OK!! Thanks!
Epoch | Recall % | Accuracy % | F1-Score % |
---|---|---|---|
40 | 67 | 76 | 74 |
model | Recall % | Accuracy % | F1-Score % |
---|---|---|---|
CNN_Okt | 71 | 75 | 73 |
CNN_BPE | 75 | 75 | 75 |
Attention_Okt | 82 | 75 | 77 |
Attention_BPE | 71 | 73 | 71 |
model | Recall % | Accuracy % | F1-Score % |
---|---|---|---|
Okt | 88 | 91 | 91 |
Nkt | 90 | 94 | 92 |
BPE | 93 | 93 | 93 |
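For reference on the BPE row: byte-pair encoding builds subword units by repeatedly merging the most frequent adjacent symbol pair, which is what lets it cover out-of-vocabulary slang that a morpheme dictionary like Okt's misses. A toy sketch (not the actual tokenizer used here):

```python
from collections import Counter

def bpe_merges(words, n_merges):
    """Toy BPE: repeatedly merge the most frequent adjacent symbol pair."""
    seqs = [list(w) for w in words]
    merges = []
    for _ in range(n_merges):
        pairs = Counter()
        for s in seqs:
            pairs.update(zip(s, s[1:]))  # count adjacent pairs
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]
        merges.append(a + b)
        for s in seqs:  # apply the merge in place
            i = 0
            while i < len(s) - 1:
                if s[i] == a and s[i + 1] == b:
                    s[i:i + 2] = [a + b]
                i += 1
    return merges

print(bpe_merges(["low", "low", "lowest"], 2))  # → ['lo', 'low']
```

A real implementation (e.g. SentencePiece) additionally handles word frequencies, special tokens, and encoding of new text with the learned merges.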
class ratio | Accuracy % | Recall % | F1-Score % |
---|---|---|---|
attention model | |||
95:5 | 90 | 40 | 29 |
5:5 | 66 | 40 | 54 |
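The 95:5 vs 5:5 comparison above matters because accuracy is nearly meaningless at 95:5: a degenerate classifier that never flags profanity already scores 95% accuracy with zero recall. A quick illustration from confusion-matrix counts:

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    acc = (tp + tn) / total
    rec = tp / (tp + fn) if tp + fn else 0.0
    prec = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, rec, f1

# "always predict clean" on 1000 samples at a 95:5 class ratio
print(metrics(tp=0, fp=0, fn=50, tn=950))  # → (0.95, 0.0, 0.0)
```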
Our model catches profanity written with separated or reordered jamo far better than the rule-based approach!!
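On the "separated jamo" point: a precomposed Hangul syllable can be expanded into its jamo with standard Unicode arithmetic (syllables start at U+AC00; 21 medials × 28 finals = 588 codepoints per initial). A minimal sketch of that decomposition (illustrative, not necessarily this repo's actual preprocessing):

```python
def to_jamo(ch):
    """Decompose a precomposed Hangul syllable into initial + medial + final jamo."""
    CHO = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
    JUNG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
    JONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")
    code = ord(ch) - 0xAC00
    cho, rest = divmod(code, 588)    # 588 = 21 medials * 28 finals
    jung, jong = divmod(rest, 28)
    return CHO[cho] + JUNG[jung] + JONG[jong]

print(to_jamo("한"))  # → "ㅎㅏㄴ"
```

Normalizing both training data and input through a decomposition like this is one way a model can match obfuscated spellings that byte-for-byte rule matching misses.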
model | Accuracy % | Recall % | F1-Score % |
---|---|---|---|
our model | 89 | 90 | 89 |
existing (rule-based) | 77 | 56 | 71 |
2019-08-19