cells2numbers closed this issue 4 years ago
Fixed in #135
library(readr)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(magrittr)
df <- read_csv('~/Downloads/population_ge_test.csv') %>%
  filter(complete.cases(.))
#> Parsed with column specification:
#> cols(
#> .default = col_double(),
#> Metadata_broad_sample_simple = col_character()
#> )
#> See spec(...) for full column specifications.
# workaround for error:
# Error in parse(text = x) : <text>:1:7: unexpected input
# 1: 221227_
#           ^
colnames(df) <- c("Metadata_broad_sample_simple",1:977)
df %<>% mutate(strata_col = 1)
feature_columns <- setdiff(colnames(df),"Metadata_broad_sample_simple")
ge_normalized <- cytominer::normalize(
  population = df,
  variables = feature_columns,
  sample = df,
  strata = c("strata_col"),
  operation = "standardize"
)
ge_normalized %>% select(1:3) %>% slice(1:5) %>% knitr::kable()
Metadata_broad_sample_simple | 1 | 2 |
---|---|---|
BRD-A01528713 | -1.2871379 | 1.0886862 |
BRD-A02809788 | -0.0555724 | -0.7771730 |
BRD-A03182941 | 0.5495872 | -0.9621648 |
BRD-A04691170 | -0.4334830 | -0.3546694 |
BRD-A08759443 | -0.1129926 | -0.5845238 |
Created on 2020-03-20 by the reprex package (v0.3.0)
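The parse error quoted in the workaround comment can be reproduced in base R without cytominer. This is a minimal sketch, assuming the original column names begin with digits the way `221227_` does in the error message. The name `221227_x` below is hypothetical and is not taken from the CSV.

```r
# Hypothetical column name that starts with digits (not from the real data).
bad_name <- "221227_x"

# Parsing the bare name fails because an R identifier cannot start with a digit.
msg <- tryCatch(
  {
    parse(text = bad_name)
    "parsed without error"
  },
  error = function(e) conditionMessage(e)
)
cat(msg, "\n")

# Backtick-quoting lets the same name parse as a single symbol.
quoted <- parse(text = paste0("`", bad_name, "`"))
is.name(quoted[[1]])

# make.names() turns it into a syntactically valid name instead.
make.names(bad_name)
```

Renaming the columns, as the reprex does, avoids the problem by not passing such names through `parse()` unquoted. `make.names()` would be one alternative that keeps the original identifiers recognizable.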
Normalization fails in the simple case where cytominer normalizes a complete data set (no groups). The attached CSV contains the features used in the example code below.
Tested with dplyr > 0.8, so this issue could be related to #131 (not yet checked).
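For reference, here is roughly what the expected result looks like when a whole data set is standardized with no grouping. This is a sketch only: it assumes `operation = "standardize"` means conventional z-scoring (subtract the mean, divide by the standard deviation), it uses base R rather than cytominer, and the toy data frame and its column names are made up.

```r
# Toy data standing in for the real CSV (values are made up for illustration).
toy <- data.frame(
  Metadata_id = c("a", "b", "c", "d"),
  f1 = c(1, 2, 3, 4),
  f2 = c(10, 20, 40, 50)
)
feature_cols <- setdiff(colnames(toy), "Metadata_id")

# Z-score each feature over all rows: subtract the column mean, divide by the column SD.
standardized <- toy
standardized[feature_cols] <- lapply(
  toy[feature_cols],
  function(x) (x - mean(x)) / sd(x)
)

# Each standardized feature should have mean 0 and standard deviation 1
# (up to floating-point error).
sapply(standardized[feature_cols], mean)
sapply(standardized[feature_cols], sd)
```

A correctly normalized output from the real data should pass the same two checks for every feature column.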
Example file: population_ge_test.csv.tar.gz
Example:
Error:
sessionInfo: