kk7nc / HDLTex

HDLTex: Hierarchical Deep Learning for Text Classification
https://hdltex.readthedocs.io/
MIT License

Evaluation procedure used in HDLTex #4

Closed: gourango01 closed this issue 6 years ago

gourango01 commented 6 years ago

During training, HDLTex prints the classification accuracy after every epoch for Level 1 and Level 2. However, I think this is different from the evaluation procedure described in Equation 21 of the paper, which is used for reporting the results. Can you please explain Equation 21 in more detail?

kk7nc commented 6 years ago

The evaluation of HDLTex depends on both levels, as you can see in Eq. 21. For example, if we have 2 levels (L1 and L2), with 2 classes in L1 and, under them, 3 classes and 4 classes in L2, then the dataset has 7 classes in total. So the accuracy is calculated from the joint probability of L1 and L2.

chengdezhi commented 6 years ago

I am also confused about Equation 21. Could you give a more detailed description of the symbols in Equation 21, like Acc, n, and so on? Thanks.

chengdezhi commented 6 years ago

Is the baseline classification accuracy also computed in the same way as Equation 21?

kk7nc commented 6 years ago

Thank you for your question. Let me explain with an example: suppose you have two levels of labels (L1 and L2), such as L1 = {Computer Science (CS), Electrical Engineering (EE)}, i.e. two classes in L1, and each class in L1 contains 2 classes: L2 of CS = {Artificial Intelligence (AI), Machine Learning (ML)}, and L2 of EE = {Signal Processing (SP), Microelectronic Engineering (ME)}.

Then we have 4 leaf classes {AI, ML, SP, and ME}, so we train three models: one model at the parent level and two models at the children level, as follows:

L1: CS vs. EE

L2: {AI vs. ML} and {SP vs. ME}
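As a minimal sketch (not the HDLTex source; the names below are illustrative), this structure is one classifier at the parent level plus one classifier per parent class at the child level, with the parent prediction routing each document to the matching child model:

```python
# Minimal sketch (not the HDLTex source): names are illustrative.
label_hierarchy = {
    "CS": ["AI", "ML"],   # children of Computer Science
    "EE": ["SP", "ME"],   # children of Electrical Engineering
}

def predict_hierarchical(doc, parent_model, child_models):
    """Chain the parent prediction into the matching child model."""
    parent = parent_model.predict(doc)         # e.g. "CS" or "EE"
    child = child_models[parent].predict(doc)  # e.g. "AI", "ML", "SP", "ME"
    return parent, child
```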

The final accuracy is calculated as follows.

E.g. if you have 100 documents: 60 belong to CS (30 for AI and 30 for ML) and 40 belong to EE (25 for SP and 15 for ME).

In L1 (CS vs. EE), 94 out of 100 documents are assigned to the correct class (e.g. 58 to CS and 42 to EE): ACC = 94%.

In L2 (AI vs. ML): 49 out of 58 documents are assigned to the correct class: ACC = 84.5%. In L2 (SP vs. ME): 40 out of 42 documents are assigned to the correct class: ACC = 95.2%.

Final accuracy: 0.94 * ((49 + 40)/100) = 0.94 * 0.89 = 0.8366, so the final accuracy is 83.66%.
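For concreteness, here is a quick back-of-the-envelope check of the numbers above (a minimal sketch, not the repository's evaluation script):

```python
# Back-of-the-envelope check of the example above (sketch only).
n_docs = 100

acc_l1 = 94 / n_docs          # Level 1: 94 of 100 parents correct -> 0.94

correct_cs_children = 49      # of the 58 documents routed to CS
correct_ee_children = 40      # of the 42 documents routed to EE

# Joint (final) accuracy: parent accuracy times the fraction of all
# documents whose child-level label is also correct.
final_acc = acc_l1 * (correct_cs_children + correct_ee_children) / n_docs
print(f"{final_acc:.4f}")     # 0.8366, i.e. 83.66%
```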

But if we have 3 levels or more, we should start by calculating ACC at the parent level and then at the children levels, as follows:

ACC of the children levels = (sum of the children's ACC / number of children) * (sum of correctly classified documents over all children levels) / N.

NOTE: the final accuracy is the joint probability over L1 and all of the L2s, because every document misclassified in L1 will also be counted as incorrect in L2 (e.g. if a document's true path is CS -> ML, i.e. (0, 1), and L1 misclassifies it as EE, then in L2 it ends up as, say, ME, i.e. (1, 1)); the second-level class is counted as incorrect regardless of whether the result is EE -> ME or EE -> SP.
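One way to read this for an arbitrary number of levels (a hedged sketch; the names `y_true_levels` and `y_pred_levels` are hypothetical, not from the repository) is to count a document as correct only when every level of its label path is predicted correctly:

```python
def hierarchical_accuracy(y_true_levels, y_pred_levels):
    """Joint accuracy over all levels of a label hierarchy (sketch).

    y_true_levels / y_pred_levels: one label sequence per level, e.g.
    [["CS", "EE", ...], ["ML", "SP", ...]] for a 2-level hierarchy.
    A document counts as correct only if every level is predicted
    correctly, so any L1 mistake automatically makes it wrong overall.
    """
    n = len(y_true_levels[0])
    correct = sum(
        all(t[i] == p[i] for t, p in zip(y_true_levels, y_pred_levels))
        for i in range(n)
    )
    return correct / n
```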

For your other question: the baseline classification accuracies are not computed in the same way as Equation 21, because the baselines do not have multi-level models arranged in a hierarchy.

Please let me know if you have any other questions.

chengdezhi commented 6 years ago

Thanks for your very clear explanations and examples. It helps me a lot.

So your baseline classification methods employ just one model to distinguish the classes in the last level (like the L2 classes {AI, ML} and {SP, ME} in your example above), and the accuracy is computed in the same way as for a general classification task without a hierarchy. Do I understand that correctly?

kk7nc commented 6 years ago

Yes, the baselines do not have mid-level labels, so they only employ a single-level model over {AI, ML, SP, and ME}.
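In other words, the baseline metric is just ordinary flat accuracy over the leaf labels; a minimal sketch with toy labels:

```python
from sklearn.metrics import accuracy_score

# Flat baseline: a single model predicts the leaf label directly.
y_true = ["AI", "ML", "SP", "ME", "AI"]   # toy ground-truth labels
y_pred = ["AI", "ML", "SP", "SP", "ML"]   # toy predictions
print(accuracy_score(y_true, y_pred))     # 0.6
```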

gourango01 commented 6 years ago

I have a few questions about the training of HDLTex.

  1. Have you used the same random seed (i.e. 7, as used in Data Helper.py) to split the available datasets (WOS_5736, WOS_11967, WOS_46985) into train and test?
  2. How have you done the hyper-parameter tuning?
kk7nc commented 6 years ago

Have you used the same random seed (i.e. 7, as used in Data Helper.py) to split the available datasets (WOS_5736, WOS_11967, WOS_46985) into train and test? Yes, but you can use other seeds.
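If it helps, here is a minimal sketch of a seeded split, assuming scikit-learn's train_test_split (the toy data and test_size below are illustrative only; the exact settings live in Data Helper.py):

```python
from sklearn.model_selection import train_test_split

# Hedged sketch of a seeded split; toy data and test_size are
# illustrative only -- check Data Helper.py for the exact settings.
seed = 7
documents = ["doc one", "doc two", "doc three", "doc four"]
labels_l1 = [0, 1, 0, 1]

X_train, X_test, y_train, y_test = train_test_split(
    documents, labels_l1, test_size=0.25, random_state=seed)
```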

Regarding hyperparameter tuning: we show that HDLTex, as a hierarchical deep learning model for text classification on hierarchical datasets, significantly improves accuracy, and this does not depend on the hyperparameters; so if you do some hyperparameter tuning, your results will be even better.

MarcosFP97 commented 2 years ago

Hi guys,

Does anyone know how many training epochs were used in the original study, in order to reproduce the reported results?

Best, Marcos