Closed akolonin closed 5 years ago
Simple math considerations:
If Average sentence parse = 68.81% ~ number_of_parsed_links / number_of_links_in_test_corpus:
Average_sentence_parse * Precision ~ 0.6881 * 0.7726 ~ 0.53.
This means that we have correctly learned 53 links of every 100 links present in the test corpus, all 100 presumed correct.
Why is recall 63.78% and not 53%?
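The estimate above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic, assuming (as the comment does) that average sentence parse approximates parsed links / links in test corpus. The variable names are illustrative and are not taken from the GT code.

```python
# Values reported in the .stat file quoted in this issue.
average_sentence_parse = 0.6881  # ASSUMED ~ parsed links / links in test corpus
precision = 0.7726               # matched links / parsed links
reported_recall = 0.6378         # matched links / links in test corpus

# If the assumption holds, recall should be close to this product,
# because (parsed / expected) * (matched / parsed) = matched / expected.
estimated_recall = average_sentence_parse * precision

print(f"estimated recall: {estimated_recall:.4f}")
print(f"reported recall:  {reported_recall:.4f}")
print(f"gap:              {reported_recall - estimated_recall:.4f}")
```

If the gap is real, either the assumption about average sentence parse is wrong or one of the reported metrics is computed over a different set of links.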
@OlegBaskov - can we close this in favor of #200?
"Alternative" F1 estimation -- Alternative_F1_for_ALE_ILE%20clustering_2019-04-12.html
Understood. Continuation in #200
There are doubts that PA/PQ/Precision/Recall are consistent with each other, given these results:
http://langlearn.singularitynet.io/data/clustering_2019/cALEd-500-GCB-LG-E-noQuotes-S94-2019-04-02/GCB_LG-E-noQuotes_cALWEd_no-gen_mwc=2/GC_LGEnglish_noQuotes_fullyParsed.ull.stat
Total sentences parsed in full: 44.80%
Total sentences not parsed at all: 12.09%
Average sentence parse: 68.81%
Total sentences: 68826.00
Skipped sentences: 0.00
Parse time: 0h 25m 44s 271ms
Parse quality: 63.78%
Average total links: 10.46
Average ignored links: 0.00
Average missing links: 4.00
Average extra links: 0.37
Recall: 63.78%
Precision: 77.26%
F1: 0.70
Total sentences: 68826.00
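As a quick sanity check, the reported F1 can be recomputed from the reported precision and recall. The sketch below assumes the standard harmonic-mean definition of F1; the stat file itself does not say which formula is used, and `f1_score` is a hypothetical helper name.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the stat file above.
recomputed_f1 = f1_score(precision=0.7726, recall=0.6378)
print(f"{recomputed_f1:.2f}")  # prints 0.70
```

Under that assumption the F1 value agrees with the reported precision and recall, so any mismatch would lie between those two figures and the parse-ability numbers.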
Need to create a unit test for this: an input corpus of 5 sentences with 6 words and 5 links each (25 expected links in total), against a test corpus of only 4 parsed sentences with 20 links (PA = 80%), with 1 wrong link in every sentence (Precision = 15/20 = 0.75, Recall = 15/25 = 0.6, F1 = 0.66666667).
Need to add statistics to the GT output:
Expected links:
Parsed links:
Matched links:
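A minimal sketch of such a check, computing the metrics directly from link counts. `LinkCounts` and the helper functions are hypothetical names, not part of the GT code. The scenario uses 15 matched links, as in the figures above, which corresponds to 5 wrong links among the 20 parsed; if exactly one link were wrong in each of the 4 parsed sentences, the matched count would be 16 (precision 0.80, recall 0.64).

```python
from dataclasses import dataclass


@dataclass
class LinkCounts:
    expected: int  # links in the reference (expected) corpus
    parsed: int    # links produced by the parser
    matched: int   # parsed links that also appear in the reference


def precision(c: LinkCounts) -> float:
    return c.matched / c.parsed if c.parsed else 0.0


def recall(c: LinkCounts) -> float:
    return c.matched / c.expected if c.expected else 0.0


def f1(c: LinkCounts) -> float:
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Scenario from this issue: 25 expected links, 20 parsed links, 15 matched.
scenario = LinkCounts(expected=25, parsed=20, matched=15)

assert abs(precision(scenario) - 0.75) < 1e-9
assert abs(recall(scenario) - 0.60) < 1e-9
assert abs(f1(scenario) - 2 / 3) < 1e-9
```

Reporting the three raw counts alongside the percentages makes it possible to verify each metric by hand.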
@OlegBaskov @alexei-gl - please check the math for the above.