Classify pipeline - Githubissues

What has been done

Further work on classify pipeline:

Pipelines for TF-IDF, XGBoost on all principal components, and XGBoost on the top n (3) principal components
The metrics extraction pipeline has been updated so that it can compute perplexity and entropy manually (not yet run, see #62)
Summary tables have been created with preliminary scores (precision, recall, etc.), e.g., clf_results/clf_results/dailydialog_temp1/all_results.html
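
For reference, the precision and recall scores reported in the tables follow the standard definitions. The snippet below is only a minimal sketch of those definitions, not the code used in the pipeline; the function name `precision_recall` and the labels `"human"` / `"model"` are hypothetical and chosen purely for illustration.

```python
from typing import Hashable, Sequence, Tuple


def precision_recall(
    y_true: Sequence[Hashable],
    y_pred: Sequence[Hashable],
    positive: Hashable,
) -> Tuple[float, float]:
    """Return (precision, recall) for one class treated as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Convention assumed here: report 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (hypothetical, not taken from the datasets).
y_true = ["human", "model", "model", "human", "model"]
y_pred = ["human", "model", "human", "human", "model"]
precision, recall = precision_recall(y_true, y_pred, positive="model")
```

In this toy case every text predicted as `"model"` is correct, but one `"model"` text is missed, so precision is higher than recall.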

Note that everything has been run on the temperature 1 datasets, but it has been set up so that we can easily run it on temperature 1.5 as well.

Decisions to be made

Two decisions are open: whether we need to make any changes to the PCA (following what is described in #65), and which model to use for computing perplexity/entropy outside of textdescriptives (see #62).
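
Whichever model ends up being chosen, the manual computation itself only needs the per-token probabilities that model assigns. The snippet below is a rough, model-agnostic sketch under assumptions: it takes perplexity as the exponential of the mean negative log-likelihood and entropy as the Shannon entropy (in nats) of each next-token distribution. These definitions may differ from what textdescriptives reports, and the function names are hypothetical.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """exp(mean negative log-likelihood), with natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def shannon_entropy(probs: Sequence[float]) -> float:
    """Entropy in nats of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_entropy(distributions: Sequence[Sequence[float]]) -> float:
    """Average entropy over the next-token distributions of a text."""
    if not distributions:
        raise ValueError("need at least one distribution")
    return sum(shannon_entropy(d) for d in distributions) / len(distributions)


# Toy values (assumed, not model output): four tokens, each with probability 0.25.
toy_logprobs = [math.log(0.25)] * 4
toy_ppl = perplexity(toy_logprobs)

# Two toy next-token distributions over a two-word vocabulary.
toy_entropy = mean_entropy([[0.5, 0.5], [1.0, 0.0]])
```

The log-probabilities would come from the chosen language model; keeping the computation separate from the model makes it straightforward to swap models once the decision in #62 is made.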

rbroc / echo

Classify pipeline #66
