wuyifan18 / DeepLog

Pytorch Implementation of DeepLog.
MIT License

One-hot encoding #14

Closed amineebenamor closed 5 years ago

amineebenamor commented 5 years ago

Hello @wuyifan18, first, thank you for your implementation! I'm wondering why you're not using one-hot encoding for the log keys in the input, since these are categorical variables (in addition, the DeepLog paper says that they did so). What you are doing here is an ordinal encoding, mapping each log key to an integer. Thank you for your answer!
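
For illustration, a minimal sketch of the two encodings being contrasted here; the vocabulary size and key values below are invented for the example, not taken from this repo:

import torch
import torch.nn.functional as F

# Hypothetical vocabulary of 4 log keys mapped to the integers 0..3 (ordinal encoding).
num_keys = 4
sequence = torch.tensor([2, 0, 3, 1])

# One-hot encoding: each key becomes a 0/1 vector of length num_keys,
# so no spurious order or magnitude is implied between the keys.
one_hot_sequence = F.one_hot(sequence, num_classes=num_keys).float()
# tensor([[0., 0., 1., 0.],
#         [1., 0., 0., 0.],
#         [0., 0., 0., 1.],
#         [0., 1., 0., 0.]])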

wuyifan18 commented 5 years ago

Hello @amineebenamor, actually, the output of the model is a probability distribution P(m_i = k_i | w), k_i ∈ K (i = 1, ⋯, n), computed from the final hidden state, and the label is a one-hot vector, as below.

criterion = nn.CrossEntropyLoss()  # takes raw logits and integer class indices
loss = criterion(output, label.to(device))


But when you calculate the CrossEntropyLoss, the criterion computes

loss(x, class) = -log( exp(x[class]) / Σ_j exp(x[j]) )

which, for a one-hot label y, is equivalent to -Σ_i y_i · log(softmax(x)_i). So only the position where the one-hot vector is 1 contributes to the loss; every other term is multiplied by 0 when the loss is calculated. You can also refer to the API docs.
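
To make that concrete, here is a small self-contained sketch (with arbitrary example values, not from this repo) showing that nn.CrossEntropyLoss on an integer target computes exactly the negative log-softmax picked out by the corresponding one-hot vector:

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.tensor([[1.0, 2.0, 0.5]])  # model output (raw scores) for one sample
target = torch.tensor([1])                # integer class index

# CrossEntropyLoss applied to the integer target...
ce = nn.CrossEntropyLoss()(logits, target)

# ...equals -log(softmax(logits)) at the position where the one-hot label is 1.
one_hot = F.one_hot(target, num_classes=3).float()
manual = -(one_hot * F.log_softmax(logits, dim=1)).sum()

assert torch.allclose(ce, manual)  # terms where the one-hot entry is 0 contribute nothing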

amineebenamor commented 5 years ago

Thank you for your answer!

hayhan commented 2 years ago

Hello @wuyifan18 & @amineebenamor,

It looks like @amineebenamor really wanted to figure out why the INPUT data is not in one-hot format in the code; it is not related to the output. Using the integers mapped from the log keys suggests to the model that large values (keys) are more important than small values (keys), i.e., it imposes an ordering that categorical log keys do not have. Obviously this is not what we want.

In my revised version of the implementation, input data in one-hot format gives me much better results than integer input on my training data set (logs from a specialized embedded system). For example, the loss converges more quickly and to a much smaller value, and the model easily reaches zero false positives on the training data.
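
For reference, a minimal sketch of what feeding one-hot inputs into the LSTM could look like; the layer sizes and variable names below are assumptions for illustration, not code from this repo or from the revision described above:

import torch
import torch.nn as nn
import torch.nn.functional as F

num_keys, hidden_size, window = 28, 64, 10  # hypothetical sizes

# With one-hot input the LSTM feature dimension is num_keys instead of 1.
lstm = nn.LSTM(input_size=num_keys, hidden_size=hidden_size, batch_first=True)
fc = nn.Linear(hidden_size, num_keys)

seq = torch.randint(0, num_keys, (1, window))     # a window of integer log keys
x = F.one_hot(seq, num_classes=num_keys).float()  # shape: (1, window, num_keys)

out, _ = lstm(x)
logits = fc(out[:, -1, :])  # scores over the next log key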