Feedback for linear regression

iglee / ExplainaBoard-experiments

for keeping track of experiments done with ExplainaBoard

feedback:

add the input/output data and the problem statement to the description
reconsider the performance metrics (R²/MSE): what would be a reasonable performance to expect for these models? how should these performances be baselined? (refer to the LangRank paper)
consider comparing per-data-point prediction of the metric against a baseline that always predicts the mean value of the metric
correlation per bucketed features?
double check data processing
try other models like xgboost
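
A minimal sketch of the mean-baseline comparison suggested above, using only the Python standard library. The toy numbers and the helper names (`mse`, `r2`, `fit_line`) are made up for illustration and are not taken from the experiments; a single-feature least-squares fit stands in for whatever regression model is actually used.

```python
from statistics import mean

def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_line(x, y):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Toy data, invented for illustration only (feature -> metric value).
train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
train_y = [10.0, 14.0, 15.0, 21.0, 22.0, 27.0]
test_x = [7.0, 8.0]
test_y = [29.0, 33.0]

# Baseline: always predict the mean of the training targets.
baseline_pred = [mean(train_y)] * len(test_y)

# Model: single-feature linear regression fitted on the training split.
slope, intercept = fit_line(train_x, train_y)
model_pred = [slope * xi + intercept for xi in test_x]

baseline_mse = mse(test_y, baseline_pred)
model_mse = mse(test_y, model_pred)
print(f"baseline MSE={baseline_mse:.3f}  model MSE={model_mse:.3f}")
```

Predicting the mean of the same data being scored gives R² = 0 by definition, so R² already reads as "improvement over the mean baseline"; on a held-out split the baseline uses the training mean and its R² can go negative.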

still to do:

beyond predicting BLEU/mover_score/etc. from URIEL/input data/system output/reports, consider the following analyses:

system-by-system analysis: for which systems/buckets did a system do better/worse than expected?
which features of the language are correlated with over/under-performance on a particular phenomenon?
metric vs. metric analysis: does a system perform better on one metric than on another?
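
One possible way to frame "better/worse than expected" is as a residual: actual score minus the score the regression predicted. The sketch below groups mean residuals by system or by bucket. The record layout, system names, bucket names, and scores are all hypothetical placeholders, not results from these experiments.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (system, bucket, actual_score, predicted_score).
records = [
    ("sys_a", "short", 0.42, 0.38),
    ("sys_a", "long", 0.30, 0.28),
    ("sys_b", "short", 0.35, 0.40),
    ("sys_b", "long", 0.25, 0.26),
]

def mean_residual_by(records, key_index):
    """Average (actual - predicted) per group; key_index 0 = system, 1 = bucket."""
    groups = defaultdict(list)
    for rec in records:
        actual, predicted = rec[2], rec[3]
        groups[rec[key_index]].append(actual - predicted)
    return {key: mean(values) for key, values in groups.items()}

by_system = mean_residual_by(records, 0)
by_bucket = mean_residual_by(records, 1)

# Positive residual: the group did better than the regression expected.
ranked = sorted(by_system.items(), key=lambda kv: kv[1], reverse=True)
for name, residual in ranked:
    print(f"{name}: mean residual {residual:+.3f}")
```

The same residuals could then be correlated against language features to look for what drives over/under-performance.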