jeff1evesque / ist-687

Syracuse IST687 final project with Jesse Warren (team member)

Create basic Access vs page views visualization(s) #24

Closed jeff1evesque closed 6 years ago

jeff1evesque commented 6 years ago

We will create visualization(s) of page views against our Access column, within our basic.R.

jeff1evesque commented 6 years ago

We attempted to create a boxplot:

## boxplot: device vs page views
ggplot(meltdf, aes(x=Access, y=as.numeric(value), fill=Access)) +
    geom_boxplot()

However, our visualization is only showing points:

[image: boxplot-device-page-views]

jeff1evesque commented 6 years ago

We've verified that every entry of meltdf$value is numeric:

> all.is.numeric(meltdf$value)
[1] TRUE
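
For reference, all.is.numeric appears to be the Hmisc helper, which reports whether the entries of a vector can be converted to numbers; it does not by itself say how the column is stored. A minimal base R sketch of both checks follows, using a small hypothetical vector in place of meltdf$value:

```r
# Hypothetical stand-in for meltdf$value; the real column is not reproduced here.
value <- c("0", "400", "2276", "1.294e+09")

# TRUE only if the vector is already stored as a numeric type.
stored_as_numeric <- is.numeric(value)

# TRUE if every non-missing entry can be coerced to a number without producing NA.
coerced <- suppressWarnings(as.numeric(value))
all_coercible <- !any(is.na(coerced) & !is.na(value))

print(stored_as_numeric)
print(all_coercible)
```

If the first check fails while the second passes, the as.numeric(value) conversion in the plot call is still needed.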

jeff1evesque commented 6 years ago

The following are our descriptive statistics:

> summary(meltdf[meltdf$Access == 'all-access',])
    Access             variable          value          
 Length:1071126     2015.07: 59507   Min.   :0.000e+00  
 Class :character   2015.08: 59507   1st Qu.:4.000e+02  
 Mode  :character   2015.09: 59507   Median :2.276e+03  
                    2015.10: 59507   Mean   :4.714e+04  
                    2015.11: 59507   3rd Qu.:2.195e+04  
                    2015.12: 59507   Max.   :1.294e+09  
                    (Other):714084                      
> summary(meltdf[meltdf$Access == 'mobile-web',])
    Access             variable          value          
 Length:532944      2015.07: 29608   Min.   :        0  
 Class :character   2015.08: 29608   1st Qu.:     3171  
 Mode  :character   2015.09: 29608   Median :    10631  
                    2015.10: 29608   Mean   :    41651  
                    2015.11: 29608   3rd Qu.:    28329  
                    2015.12: 29608   Max.   :231402053  
                    (Other):355296                      
> summary(meltdf[meltdf$Access == 'desktop',])
    Access             variable          value          
 Length:506916      2015.07: 28162   Min.   :1.500e+01  
 Class :character   2015.08: 28162   1st Qu.:3.565e+03  
 Mode  :character   2015.09: 28162   Median :1.124e+04  
                    2015.10: 28162   Mean   :5.323e+04  
                    2015.11: 28162   3rd Qu.:2.715e+04  
                    2015.12: 28162   Max.   :1.144e+09  
                    (Other):337944    

Additionally, we verified the mean of the value column (column 3) for the desktop rows, which matches the summary above:

> mean(meltdf[meltdf$Access == 'desktop',3])
[1] 53229.77
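
The per-group statistics above could also be produced in one pass rather than one subset per Access level. The following is a sketch using base R's tapply on a hypothetical miniature of meltdf (the name meltdf_small and its values are illustrative only):

```r
# Hypothetical miniature of meltdf with the same three columns.
meltdf_small <- data.frame(
  Access = c("all-access", "all-access", "mobile-web",
             "mobile-web", "desktop", "desktop"),
  variable = factor(rep(c("2015.07", "2015.08"), times = 3)),
  value = c(0, 400, 3171, 10631, 15, 3565),
  stringsAsFactors = FALSE
)

# One summary() per Access group.
by_access <- tapply(meltdf_small$value, meltdf_small$Access, summary)
print(by_access)

# Group means only, comparable to the mean() call above.
group_means <- tapply(meltdf_small$value, meltdf_small$Access, mean)
print(group_means)
```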

jeff1evesque commented 6 years ago

a9e04c2: we've switched from boxplots to points, since the range of each group is so large (maxima above 1e+08 against third quartiles below 3e+04) that the interquartile range is visually negligible. It may be possible to apply a transformation to the values (perhaps logarithmic). However, the boxplot would then describe the transformed values rather than the original data points.
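
As a rough illustration of the logarithmic option, the sketch below draws the boxplot on log10(value + 1), which keeps the zero counts finite. The demo data frame is synthetic and only stands in for meltdf; this is not what was committed. Because the transform is monotone, order-based statistics such as the median and quartiles carry over, but the whisker lengths and the points flagged as outliers are computed on the transformed scale and will differ from the untransformed plot.

```r
library(ggplot2)

# Hypothetical heavy-tailed sample standing in for meltdf.
set.seed(1)
demo <- data.frame(
  Access = rep(c("all-access", "desktop", "mobile-web"), each = 200),
  value = c(rlnorm(200, 8, 2), rlnorm(200, 9, 2), rlnorm(200, 9, 1.5))
)
demo$value[1] <- 0  # the real data contains zero page views

# log10(value + 1) avoids -Inf for the zero counts.
p <- ggplot(demo, aes(x = Access, y = log10(value + 1), fill = Access)) +
  geom_boxplot() +
  labs(y = "log10(page views + 1)")
print(p)
```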

jeff1evesque commented 6 years ago

We need to adjust various legend titles.
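
One way to do this in ggplot2 is labs(), which sets the legend title through the name of the mapped aesthetic. A small sketch, where the data frame and the title text "Access type" are placeholders rather than what basic.R uses:

```r
library(ggplot2)

# Hypothetical three-row example; values are illustrative only.
demo <- data.frame(
  Access = c("all-access", "desktop", "mobile-web"),
  value = c(2276, 11240, 10631)
)

# The legend title follows the aesthetic that produces the legend (here, color).
p <- ggplot(demo, aes(x = Access, y = value, color = Access)) +
  geom_point() +
  labs(color = "Access type", y = "Page views")
print(p)
```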

jeff1evesque commented 6 years ago

It seems we also forgot to commit some additional changes prior to merging the earlier PR.

jeff1evesque commented 6 years ago

We need to map color to Access in the visualization/points-total-access.png case, since color coding by the value already shown on the y-axis is redundant.
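
A sketch of the difference, assuming the points plot puts the month (variable) on the x-axis and page views on the y-axis; the actual mapping in basic.R may differ, and the demo data frame is hypothetical:

```r
library(ggplot2)

# Hypothetical miniature of meltdf.
demo <- data.frame(
  Access = rep(c("all-access", "desktop", "mobile-web"), each = 2),
  variable = rep(c("2015.07", "2015.08"), times = 3),
  value = c(400, 2276, 3565, 11240, 3171, 10631)
)

# Redundant: color repeats what the y-axis already encodes.
p_redundant <- ggplot(demo, aes(x = variable, y = value, color = value)) +
  geom_point()

# Color now carries the access type, which is not otherwise visible.
p_access <- ggplot(demo, aes(x = variable, y = value, color = Access)) +
  geom_point()
print(p_access)
```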