Would you mind checking what the histograms look like for those variables if you run hist(df$variablename, breaks = 8)? Or could you share a small sample data set that does this? I'm puzzled by the 4th and 8th images since they don't do that.
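A minimal sketch of that check, using made-up data since the original data set is not shown here. The data frame df and its columns d and h are stand-ins (assumptions), not the reporter's values. It prints the bin counts instead of drawing the plot, so the numbers can be pasted into a reply.

```r
# Hypothetical stand-in data; replace df with your own data frame.
set.seed(1)
df <- data.frame(d = rnorm(200), h = rexp(200))

# breaks = 8 is only a suggestion: hist() passes it to pretty(),
# so the actual number of bins can differ from 8 and between variables.
for (v in names(df)) {
  h <- hist(df[[v]], breaks = 8, plot = FALSE)
  cat(v, ": ", length(h$counts), " bins, counts = ",
      paste(h$counts, collapse = " "), "\n", sep = "")
}
```

Dropping plot = FALSE draws the usual base R histogram for the same variable.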
Above are the histograms of the first two variables, although the breaks argument works differently for each variable. I'm also attaching a subset of my data including the 4th and 8th variables (named d and h) below. Interestingly, when I rerun skim() on the subset, I can see the complete histograms for all variables. But for the 4th and 8th, it's the same image. My original data has 80 variables, all numeric, but I don't think that counts as too big.
Are you knitting to HTML? I also see all four completely when I knit to html with your csv.
Or are you running interactively?
I just ran my code in the Rmarkdown and viewed the results in the preview window (not through the viewer).
I tested my csv file again, and I got the dots back in the first two histograms. This is so weird. Do you think it could be related to my R environment? I'm pasting the results from sessionInfo() below.
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] here_1.0.1 dplyr_1.1.0 phyloseq_1.42.0
loaded via a namespace (and not attached):
[1] nlme_3.1-160 bitops_1.0-7 matrixStats_0.63.0 bit64_4.0.5
[5] doParallel_1.0.17 RColorBrewer_1.1-3 httr_1.4.5 rprojroot_2.0.3
[9] GenomeInfoDb_1.34.9 repr_1.1.6 dynamicTreeCut_1.63-1 tools_4.2.2
[13] backports_1.4.1 utf8_1.2.2 R6_2.5.1 vegan_2.6-4
[17] rpart_4.1.19 Hmisc_4.8-0 DBI_1.1.3 BiocGenerics_0.44.0
[21] mgcv_1.8-41 colorspace_2.1-0 permute_0.9-7 rhdf5filters_1.10.0
[25] ade4_1.7-20 nnet_7.3-18 withr_2.5.0 tidyselect_1.2.0
[29] gridExtra_2.3 preprocessCore_1.60.2 bit_4.0.5 compiler_4.2.2
[33] WGCNA_1.72-1 cli_3.4.1 Biobase_2.58.0 htmlTable_2.4.1
[37] scales_1.2.1 checkmate_2.1.0 digest_0.6.30 stringr_1.5.0
[41] foreign_0.8-83 XVector_0.38.0 htmltools_0.5.4 base64enc_0.1-3
[45] jpeg_0.1-10 pkgconfig_2.0.3 fastmap_1.1.0 htmlwidgets_1.6.1
[49] rlang_1.0.6 impute_1.72.3 rstudioapi_0.14 RSQLite_2.3.0
[53] generics_0.1.3 jsonlite_1.8.3 RCurl_1.98-1.9 magrittr_2.0.3
[57] GO.db_3.16.0 GenomeInfoDbData_1.2.9 Formula_1.2-5 biomformat_1.26.0
[61] interp_1.1-3 Matrix_1.5-3 Rcpp_1.0.9 munsell_0.5.0
[65] S4Vectors_0.36.1 Rhdf5lib_1.20.0 fansi_1.0.3 ape_5.6-2
[69] lifecycle_1.0.3 stringi_1.7.8 MASS_7.3-58.1 zlibbioc_1.44.0
[73] rhdf5_2.42.0 plyr_1.8.8 grid_4.2.2 blob_1.2.3
[77] parallel_4.2.2 crayon_1.5.2 deldir_1.0-6 lattice_0.20-45
[81] Biostrings_2.66.0 splines_4.2.2 multtest_2.54.0 KEGGREST_1.38.0
[85] knitr_1.42 pillar_1.8.1 igraph_1.3.5 fastcluster_1.2.3
[89] reshape2_1.4.4 codetools_0.2-18 stats4_4.2.2 glue_1.6.2
[93] latticeExtra_0.6-30 data.table_1.14.6 png_0.1-8 vctrs_0.5.2
[97] foreach_1.5.2 tidyr_1.3.0 purrr_1.0.1 gtable_0.3.1
[101] cachem_1.0.7 ggplot2_3.4.1 xfun_0.36 skimr_2.1.5
[105] survival_3.4-0 tibble_3.1.8 iterators_1.0.14 AnnotationDbi_1.60.0
[109] memoise_2.0.1 IRanges_2.32.0 cluster_2.1.4
What I noticed is that when I ran things notebook-style, I would sometimes get --- at the end because the window was not wide enough to accommodate the full width. Widening the window and rerunning fixed it. But I agree that this is not optimal.
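If widening the window by hand is inconvenient, one thing to try is checking and raising the print width from code. This is only a sketch: it assumes the truncation follows the base R width option, which has not been confirmed in this thread, and the value 200 and the use of the built-in iris data are arbitrary choices for illustration.

```r
# Assumption: the histogram column is cut off because the output area
# is narrower than the printed table.
cat("current print width:", getOption("width"), "\n")

# Base R: temporarily widen the line width used for printing.
old <- options(width = 200)

# Only runs if skimr is installed; iris is a stand-in data set.
if (requireNamespace("skimr", quietly = TRUE)) {
  print(skimr::skim(iris))
}

# Restore the previous width.
options(old)
```

If the histograms still end in dots at a larger width, the cause is probably something other than the window size.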
This is good to know. I can see the end bars without rerunning if I widen my window all the way to the right. Thanks a lot!
When I use skim() to check the distribution of my data, I get some "..." in the histogram. It only appears for some variables. What do the dots mean? I ran the code in RMarkdown with R 4.2.2 and skimr_2.1.5.