maelstrom-research / Rmonize


harmonized_dossier_visualize() error: Faceting variables must have at least one value #49

Closed zchenmr closed 1 month ago

zchenmr commented 6 months ago

R CITF: Rmonize v1.0.1.0003, madshapR v1.0.3.0003

I got the error below while running harmonized_dossier_visualize on a harmonized dossier. The exact same script runs without errors when using other harmonized dossiers. Looking at variable 141 (where the error seems to occur), it is an open-text variable with a single value: "J&J".

image

GuiFabre commented 6 months ago

Hello, and thank you for your contribution. This bug still needs to be investigated, and it will be corrected in a future version.

Thank you :)

GuiFabre commented 3 months ago

Should be working now. The problem was that when the words in a character string are split apart, some of the chunks belong to the stop_words() list, which is used internally to exclude words such as "and", "&", "the", and single characters. So the expression "J&J" was removed entirely.

Now corrected.
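
The failure mode described above can be sketched in base R. This is only an illustration: `stop_words_demo`, `tokenize_and_filter()`, and `tokens_or_original()` are hypothetical names, the stop-word list is made up, and the guard at the end is one possible remedy, not necessarily how the package was actually fixed.

```r
# Hypothetical stop-word list for illustration; not the list the package uses.
stop_words_demo <- c("and", "&", "the", "a", "of")

# Split text into lowercase chunks (treating "&" as its own chunk), then drop
# stop words and single-character chunks.
tokenize_and_filter <- function(text, stop_words = stop_words_demo) {
  spaced <- gsub("&", " & ", tolower(text), fixed = TRUE)
  chunks <- unlist(strsplit(spaced, "[[:space:]]+"))
  chunks <- chunks[nzchar(chunks)]
  chunks[!(chunks %in% stop_words) & nchar(chunks) > 1]
}

# "J&J" splits into "j", "&", "j": every chunk is filtered out, so nothing
# is left to plot or facet on.
tokenize_and_filter("J&J")

# A longer value keeps its meaningful chunks.
tokenize_and_filter("Pfizer and Moderna")

# Assumed guard: fall back to the original value when filtering removes
# every chunk.
tokens_or_original <- function(text) {
  kept <- tokenize_and_filter(text)
  if (length(kept) == 0) text else kept
}

tokens_or_original("J&J")
```

With a guard of this kind, a variable whose only value is "J&J" still contributes one value instead of an empty set.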

image

zchenmr commented 3 months ago

Rmonize 1.0.2.1003, madshapR 1.0.4.1006

I'm getting the following (possibly unrelated) error when I try to run the code:

image

Traceback:

44: stop(fallback)
43: signal_abort(cnd, .file)
42: abort(message, class = c(class, "vctrs_error"), ..., call = call)
41: stop_vctrs(message, class = c(class, "vctrs_error_incompatible"), 
        x = x, y = y, details = details, ..., call = call)
40: stop_incompatible(x, y, x_size = x_size, y_size = y_size, ..., 
        x_arg = x_arg, y_arg = y_arg, details = details, message = message, 
        class = c(class, "vctrs_error_incompatible_size"), call = call)
39: stop_incompatible_size(x = .Primitive("quote")(structure(list(
        variable = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
     ...
38: vec_cbind(!!!dots, .name_repair = .name_repair, .error_call = current_env())
37: bind_cols(., preprocess_var[c("valid_class", "value_var")])
36: mutate(., variable = ifelse(.data$valid_class == "3_Valid other values", 
        .data$value_var, .data$variable))
35: select(., -c("valid_class", "value_var"))
34: rename_with(., .cols = c("variable", "group")[1:ncol(colset)], 
        .fn = ~names(colset))
33: colset %>% rename_with(.cols = any_of(names(colset)), .fn = ~c("variable", 
        "group")[1:ncol(colset)]) %>% bind_cols(preprocess_var[c("valid_class", 
        "value_var")]) %>% mutate(variable = ifelse(.data$valid_class == 
        "3_Valid other values", .data$value_var, .data$variable)) %>% 
        select(-c("valid_class", "value_var")) %>% rename_with(.cols = c("variable", 
        "group")[1:ncol(colset)], .fn = ~names(colset))
32: variable_visualize(dataset, col = "adm_proxy", data_dict = data_dict, 
        group_by = "Rmonize::harmonized_col_dataset", valueType_guess = "FALSE", 
        variable_summary = dataset_summary)
...

GuiFabre commented 3 months ago

Yes, it is related to the previous change that converts the text to lowercase. Good catch!

Can you test it again?
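
For readers following the traceback, the class of error raised at frame 37 (`bind_cols()` failing with an incompatible-size error) can be reproduced with a minimal sketch. The two tables below are hypothetical stand-ins that only borrow the names `colset` and `preprocess_var` from the traceback; their contents, and the assumption that the row counts diverged, are illustrative and not taken from the package source.

```r
library(dplyr)

# Hypothetical stand-ins for the two tables bound in the traceback.
colset <- tibble(variable = c("a", "b", "c"))
preprocess_var <- tibble(
  valid_class = c("3_Valid other values", "3_Valid other values"),
  value_var   = c("a", "b")
)

# bind_cols() pairs rows by position, so tables with 3 and 2 rows cannot be
# combined; the error is captured here instead of stopping the script.
result <- tryCatch(
  bind_cols(colset, preprocess_var),
  error = function(e) e
)
inherits(result, "error")

# A cheap check to run before binding: the row counts must agree.
nrow(colset) == nrow(preprocess_var)
```

If a preprocessing step changes how many rows one table has but not the other, this is the kind of error that surfaces.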

zchenmr commented 3 months ago

The report is now generated without errors:

image

GuiFabre commented 3 months ago

Perfect. Thank you.