Closed: jichaoz closed this 4 years ago
Add a public API to Foreshadow that returns a summary of the training data as a DataFrame. Also fix or suppress a couple of warnings. The data summary looks like the following:
| | Pclass | Sex | SibSp | Parch | Cabin | Embarked | PassengerId | Age | Ticket |
|---|---|---|---|---|---|---|---|---|---|
| intent | Categorical | Categorical | Categorical | Categorical | Categorical | Categorical | Droppable | Numeric | Numeric |
| count | 712 | 712 | 712 | 712 | 712 | 712 | 712 | 712 | 712 |
| nan_pct | 0 | 0 | 0 | 0 | 77.6685 | 0.280899 | 0 | 19.6629 | 25.8427 |
| unique | 3 | 2 | 7 | 7 | 117 | 3 | 712 | 83 | 425 |
| #1_value | 3 55.90% | male 65.59% | 0 67.98% | 0 75.98% | C23 C25 C27 0.56% | S 73.74% | 891 0.14% | 24.0 3.65% | 1601.0 0.84% |
| #2_value | 1 78.79% | female 100.00% | 1 91.01% | 1 89.19% | B96 B98 0.98% | C 91.29% | 277 0.28% | 22.0 6.88% | 347082.0 1.69% |
| #3_value | 2 100.00% | | 2 94.24% | 2 98.60% | G6 1.40% | Q 99.72% | 308 0.42% | 25.0 9.83% | 3101295.0 2.39% |
| #4_value | | | 4 96.49% | 5 99.02% | C22 C26 1.83% | | 306 0.56% | 28.0 12.78% | 19950.0 2.95% |
| #5_value | | | 3 98.31% | 4 99.44% | F2 2.25% | | 305 0.70% | 18.0 15.59% | 113781.0 3.51% |
| #6_value | | | 8 99.30% | 3 99.86% | E101 2.67% | | 304 0.84% | 30.0 18.40% | 349909.0 4.07% |
| #7_value | | | 5 100.00% | 6 100.00% | B28 2.95% | | 303 0.98% | 21.0 21.07% | 382652.0 4.63% |
| #8_value | | | | | D26 3.23% | | 302 1.12% | 19.0 23.74% | 29106.0 5.06% |
| #9_value | | | | | B35 3.51% | | 299 1.26% | 29.0 25.98% | 4133.0 5.48% |
| #10_value | | | | | C78 3.79% | | 298 1.40% | 27.0 28.09% | 110152.0 5.90% |
| invalid_pct | | | | | | | | 0 | 0 |
| mean | | | | | | | | 29.4988 | 274769 |
| std | | | | | | | | 14.5001 | 505561 |
| min | | | | | | | | 0.42 | 695 |
| 25% | | | | | | | | 21 | 27703.5 |
| 50% | | | | | | | | 28 | 236852 |
| 75% | | | | | | | | 38 | 348123 |
| max | | | | | | | | 80 | 3.1013e+06 |
| 5_outliers | | | | | | | | [80.0, 74.0] | [3101298.0, 3101296.0, 3101295.0, 3101295.0, 3... |
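
For readers who want a feel for how a summary of this shape can be assembled, here is a minimal sketch in plain pandas. It is not Foreshadow's implementation or API: the function name `summarize_columns` and the `top_n` parameter are hypothetical, and the `intent`, `invalid_pct`, and outlier rows are left out because they depend on Foreshadow internals not shown in this description. The sketch assumes the percentages in the `#N_value` rows are cumulative shares of all rows, which is what the sample output suggests.

```python
import pandas as pd


def summarize_columns(df: pd.DataFrame, top_n: int = 10) -> pd.DataFrame:
    """Hypothetical helper (not Foreshadow's API): one summary column per input column."""
    n_rows = len(df)
    summary = {}
    for name in df.columns:
        col = df[name]
        stats = {
            "count": n_rows,
            "nan_pct": col.isna().mean() * 100,
            "unique": col.nunique(dropna=True),
        }
        # Most frequent values, each paired with the cumulative share
        # of all rows (NaN rows included in the denominator).
        counts = col.value_counts(dropna=True).head(top_n)
        cumulative = counts.cumsum() / n_rows * 100
        for rank, (value, pct) in enumerate(cumulative.items(), start=1):
            stats[f"#{rank}_value"] = f"{value} {pct:.2f}%"
        # Distribution statistics only make sense for numeric columns.
        if pd.api.types.is_numeric_dtype(col):
            stats.update(col.describe().drop("count").to_dict())
        summary[name] = stats
    # Outer keys become columns, inner keys become rows; missing cells are NaN.
    return pd.DataFrame(summary)
```

Because the denominator includes missing values, the last `#N_value` percentage of a column plus its `nan_pct` adds up to 100 whenever every distinct value is listed. The `Embarked` column in the sample behaves this way: 99.72% plus 0.28% missing.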