weight prediction experiments

alanyuchenhou / elephant

MIT License


weight prediction experiments #36

Closed alanyuchenhou closed 7 years ago

alanyuchenhou commented 7 years ago

experiment settings

testing set size: 20%
metric: MSE (mean squared error)
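A minimal sketch of these two settings, for illustration only. The helper names `train_test_split` and `mean_squared_error` and the placeholder samples are assumptions, not code from the repository, and the actual split procedure used in the experiments is not stated in this thread.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and hold out `test_fraction` of them for testing."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # First n_test shuffled items form the testing set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


def mean_squared_error(targets, predictions):
    """MSE: the average squared difference between targets and predictions."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Placeholder (edge id, weight) samples, made up for this sketch.
samples = [(i, i / 10) for i in range(10)]
train, test = train_test_split(samples)
```

Fixing the seed keeps the split reproducible, which matters later if two models are to be compared on identical testing sets.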

experiment results

data set	WSBM (weighted stochastic block model) MSE	Model R MSE
airport	0.0486	0.0136
collaboration	0.0407	0.0352
congress	0.0571	0.0560
forum	0.0726	0.0326

alanyuchenhou commented 7 years ago

The experiment needs multiple trials for each data set, with low variance in MSE across trials, in order to make a strong claim that Model R is better than WSBM and its variants.

alanyuchenhou commented 7 years ago

use a t-test to compare the models to make a stronger claim

alanyuchenhou commented 7 years ago

found the scipy implementation of the t-test: scipy.stats.ttest_ind_from_stats
confirmed from the docs' reference to the wiki page of Student's t-test that this is an independent two-sample t-test
decided to choose equal_var=False (i.e., does not assume equal population variance), which makes it Welch's variant of the test
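A rough sketch of that call, working from per-model summary statistics only. The wrapper name `welch_from_stats` and every number below are placeholders made up for illustration; they are not results from these experiments. The standard deviations are assumed to be sample standard deviations (ddof=1) over the trials.

```python
from scipy import stats


def welch_from_stats(mean1, std1, n1, mean2, std2, n2):
    """Two-sided unpaired t-test from summary statistics.

    equal_var=False selects Welch's t-test, which does not assume
    that the two populations share the same variance.
    """
    result = stats.ttest_ind_from_stats(
        mean1, std1, n1,
        mean2, std2, n2,
        equal_var=False,
    )
    return result.statistic, result.pvalue


# Placeholder summary statistics of per-trial MSE (hypothetical values).
t_stat, p_value = welch_from_stats(
    mean1=0.03, std1=0.004, n1=10,   # hypothetical "model 1" trials
    mean2=0.05, std2=0.006, n2=10,   # hypothetical "model 2" trials
)
print(t_stat, p_value)
```

A negative statistic here means the first model's mean MSE is lower. The p-value returned is two-sided by default.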

ghost commented 7 years ago

To be specific, this is a one-tailed, paired t-test. One-tailed, because we want to know if method 1 is better than method 2. A two-tailed test would be used if we just wanted to know whether the methods are different. Paired means that the data used for each trial is the same for method 1 and method 2. When you compute the t-statistic, make sure you use the correct formula (paired), and compare it to the right threshold (one-tailed).
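The paired, one-tailed computation described above could be sketched as follows. The function name `paired_one_tailed_ttest` and the per-trial MSE values are hypothetical placeholders, not data from this issue. The sketch assumes that entry i of both lists comes from the same trial (same train/test split) and that the per-trial differences are not all identical.

```python
import math

from scipy import stats


def paired_one_tailed_ttest(errors_1, errors_2):
    """Paired t-test of H1: mean(errors_1 - errors_2) < 0.

    A small p-value supports the claim that method 1 has lower error
    than method 2 on the same trials.
    """
    if len(errors_1) != len(errors_2):
        raise ValueError("paired samples must have the same length")
    n = len(errors_1)
    # Paired formula: work on the per-trial differences, not the two samples.
    diffs = [a - b for a, b in zip(errors_1, errors_2)]
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    t_stat = mean_d / math.sqrt(var_d / n)
    # One-tailed: probability mass in the lower tail only, with n - 1 degrees of freedom.
    p_value = stats.t.cdf(t_stat, df=n - 1)
    return t_stat, p_value


# Placeholder per-trial MSE values (hypothetical).
method_1 = [0.031, 0.029, 0.035, 0.030, 0.033]
method_2 = [0.050, 0.047, 0.055, 0.049, 0.052]
t_stat, p_value = paired_one_tailed_ttest(method_1, method_2)
print(t_stat, p_value)
```

The statistic should agree with `scipy.stats.ttest_rel` on the same inputs; the difference is only in how the p-value is read off (one tail instead of two).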

alanyuchenhou commented 7 years ago

Got it. I noticed the difference. I'll keep working on the paired one, but meanwhile let me also do the unpaired one, because it's very cost-efficient: I already have all the data it needs. I think it can still help me make a stronger claim if the p-value is small enough, even if it's not as strong as the paired one.
