Closed by zachguo 10 years ago
Here are the confusion matrices for the midterm models: naive (the model that uses the first date encountered in the text as its prediction) and logistic regression.
(Test dataset size is 3653.)
Confusion matrix for naive model:
                 pre-1839    1840-1860    1861-1876    1877-1887    1888-1895    1896-1901    1902-1906    1907-1910    1911-1914    1915-1918    1919-1922 1923-present
pre-1839              267            5            2            1            2            3            0            0            0            1            0            8
1840-1860              54          228            5            1            3            3            0            0            0            0            0            7
1861-1876              45           14          268            8            2            2            1            1            1            0            0            6
1877-1887              43            7           18          253           10            4            1            1            1            0            0            4
1888-1895              41            9           10           15          243            7            3            1            0            0            0            5
1896-1901              41            8            7            7           15          228            7            4            1            2            1            9
1902-1906              41            6            8            5            5           15          236            4            1            1            0            8
1907-1910              33            7            8            4            5            8           13          208            5            1            2            6
1911-1914              30            4            7            2            3            4            7           14          165            2            0            6
1915-1918              20            3            7            3            3            4            4            6            7          241            2           16
1919-1922              19            2            4            3            4            4            6            5            7           18          240           18
1923-present           10            6            3            1            3            5            2            3            8            3            0           70
Confusion matrix for logistic regression model:
                 pre-1839    1840-1860    1861-1876    1877-1887    1888-1895    1896-1901    1902-1906    1907-1910    1911-1914    1915-1918    1919-1922 1923-present
pre-1839              268            6            3            1            2            3            0            1            1            1            0            5
1840-1860              38          247            6            2            3            2            0            1            0            0            0            3
1861-1876              28           14          286            9            3            2            1            1            1            0            0            4
1877-1887              24            6           23          266           11            4            2            2            1            0            0            3
1888-1895              26            7           10           15          259            8            4            1            0            0            0            4
1896-1901              26            6            8            8           17          240            8            4            2            2            2            8
1902-1906              23            6            8            3            6           17          250            6            1            1            1            7
1907-1910              16            4            7            3            4            8           15          225            6            2            3            5
1911-1914              19            3            5            2            3            4            5           19          175            3            1            6
1915-1918              11            2            5            2            1            3            3            6            9          256            7           11
1919-1922              10            1            3            2            2            3            4            6            6           16          271            5
1923-present            8            4            3            2            3            5            2            4           11            5            1           65
At first glance, neither model does very well at distinguishing pre-1839
from the other time slices. What do you guys think?
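To check that impression against the numbers, here is a minimal sketch that lists the largest off-diagonal cells of the naive matrix. The names `LABELS`, `NAIVE`, `top_confusions` and `diagonal_share` are made up for illustration and are not from the project code. The post does not say which axis is actual and which is predicted, so the sketch only reports row and column labels as they appear above.

```python
# Time-slice labels, in the order used by the matrices in this post.
LABELS = [
    "pre-1839", "1840-1860", "1861-1876", "1877-1887", "1888-1895", "1896-1901",
    "1902-1906", "1907-1910", "1911-1914", "1915-1918", "1919-1922", "1923-present",
]

# Naive-model confusion matrix, copied from the post (same row/column order).
NAIVE = [
    [267, 5, 2, 1, 2, 3, 0, 0, 0, 1, 0, 8],
    [54, 228, 5, 1, 3, 3, 0, 0, 0, 0, 0, 7],
    [45, 14, 268, 8, 2, 2, 1, 1, 1, 0, 0, 6],
    [43, 7, 18, 253, 10, 4, 1, 1, 1, 0, 0, 4],
    [41, 9, 10, 15, 243, 7, 3, 1, 0, 0, 0, 5],
    [41, 8, 7, 7, 15, 228, 7, 4, 1, 2, 1, 9],
    [41, 6, 8, 5, 5, 15, 236, 4, 1, 1, 0, 8],
    [33, 7, 8, 4, 5, 8, 13, 208, 5, 1, 2, 6],
    [30, 4, 7, 2, 3, 4, 7, 14, 165, 2, 0, 6],
    [20, 3, 7, 3, 3, 4, 4, 6, 7, 241, 2, 16],
    [19, 2, 4, 3, 4, 4, 6, 5, 7, 18, 240, 18],
    [10, 6, 3, 1, 3, 5, 2, 3, 8, 3, 0, 70],
]


def top_confusions(matrix, labels, k=5):
    """Return the k largest off-diagonal cells as (row_label, col_label, count)."""
    cells = [
        (labels[i], labels[j], count)
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
        if i != j
    ]
    return sorted(cells, key=lambda cell: cell[2], reverse=True)[:k]


def diagonal_share(matrix):
    """Fraction of all counts in the matrix that sit on the diagonal."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


if __name__ == "__main__":
    for row_label, col_label, count in top_confusions(NAIVE, LABELS):
        print(f"{row_label:>12} / {col_label:<12} {count}")
    print(f"diagonal share: {diagonal_share(NAIVE):.3f}")
```

Swapping in the logistic regression matrix gives the same kind of listing, which makes it easy to compare where the two models leak into the pre-1839 column.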
Hi Trevor, do you have some time tomorrow night in the lab? I'd like to discuss the MapReduce function. Sorry, I am sick today and lost my voice, so it is better to ask you by email.
Thank you.
From: Zach Guo notifications@github.com
Sent: Monday, March 31, 2014 3:58 PM
To: zachguo/Z604-Project
Subject: Re: [Z604-Project] Generate confusion matrixes of midterm models (#31)
Closed #31 (https://github.com/zachguo/Z604-Project/issues/31) via 4273e8f (https://github.com/zachguo/Z604-Project/commit/4273e8f724089d2029ff0f6a3810302f5c4c81af).
What time would you like to meet?
That way we can see which time slices get confused with each other.
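One way to read "confused with each other" is to add each cell to its mirror image, so a pair of slices gets a single count regardless of direction. The sketch below does that for the logistic regression matrix. `LOGREG` and `mutual_confusions` are hypothetical names used only here, and summing the two directions is an assumption about what is useful, not something the project code is known to do.

```python
from itertools import combinations

# Time-slice labels, in the order used by the matrices in this thread.
LABELS = [
    "pre-1839", "1840-1860", "1861-1876", "1877-1887", "1888-1895", "1896-1901",
    "1902-1906", "1907-1910", "1911-1914", "1915-1918", "1919-1922", "1923-present",
]

# Logistic regression confusion matrix, copied from the first post.
LOGREG = [
    [268, 6, 3, 1, 2, 3, 0, 1, 1, 1, 0, 5],
    [38, 247, 6, 2, 3, 2, 0, 1, 0, 0, 0, 3],
    [28, 14, 286, 9, 3, 2, 1, 1, 1, 0, 0, 4],
    [24, 6, 23, 266, 11, 4, 2, 2, 1, 0, 0, 3],
    [26, 7, 10, 15, 259, 8, 4, 1, 0, 0, 0, 4],
    [26, 6, 8, 8, 17, 240, 8, 4, 2, 2, 2, 8],
    [23, 6, 8, 3, 6, 17, 250, 6, 1, 1, 1, 7],
    [16, 4, 7, 3, 4, 8, 15, 225, 6, 2, 3, 5],
    [19, 3, 5, 2, 3, 4, 5, 19, 175, 3, 1, 6],
    [11, 2, 5, 2, 1, 3, 3, 6, 9, 256, 7, 11],
    [10, 1, 3, 2, 2, 3, 4, 6, 6, 16, 271, 5],
    [8, 4, 3, 2, 3, 5, 2, 4, 11, 5, 1, 65],
]


def mutual_confusions(matrix, labels):
    """Sum cell (i, j) and cell (j, i) for every unordered pair of slices.

    Returns (label_i, label_j, count) tuples, largest count first.
    """
    pairs = [
        (labels[i], labels[j], matrix[i][j] + matrix[j][i])
        for i, j in combinations(range(len(labels)), 2)
    ]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


if __name__ == "__main__":
    for first, second, count in mutual_confusions(LOGREG, LABELS)[:5]:
        print(f"{first} <-> {second}: {count}")
```

Raw counts favor the larger slices, so dividing each pair's count by the combined row totals of the two slices may give a fairer ranking.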