maelstrom-research / Rmonize


Visual report on harmonized dossier "Error in `combine_vars()`" #53

Closed twey2 closed 1 month ago

twey2 commented 5 months ago

I get an error when trying to run a visual report on some harmonized dossiers, and it is hard to diagnose. The function gets through the assessment without issue, then stops partway through "Generate report". The message is "Faceting variables must have at least one value", so there might be a problem with a grouping variable, but I can't tell whether the error comes from one specific variable or from a more general issue with the whole dataset.

The error only happened in about 3 of the more than 20 harmonized dossiers I've run for this project (for the current DataSchema versions). They all follow the same process and DataSchemas, so I guess it is an issue with specific datasets and/or variable formats. Not sure if this is relevant, but there are two different DataSchemas for the project, so up to two harmonized dossiers per study, and the error can happen in one or the other (i.e., it is not specific to one DataSchema).

The error happens with a version of Rmonize installed from CRAN in Jan 2024 and also happened in a version from late 2023 (sorry, I don't know exact date/version).

Any help on how to diagnose the problem would be appreciated!


GuiFabre commented 4 months ago

Thank you, Tina, for reporting this. It seems to be a duplicate of this one. I'll make sure both of them are corrected soon.

GuiFabre commented 3 months ago

Should be working now. The problem was that when the words in a character string are split apart, some of the chunks belong to the list of stop_words(), which internally excludes words such as "and", "&", "the", and single characters. So the expression "a&the" was removed entirely.
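
For anyone trying to reproduce the behaviour outside the package, here is a minimal base-R sketch of the mechanism described above. It is not Rmonize's actual code: `stop_list` and `keep_words()` are hypothetical stand-ins, and the stop-word list is a tiny illustrative subset. It only shows how a value made up entirely of stop words and single characters ends up with nothing left, which would plausibly leave the report with no values to facet by.

```r
# Hypothetical stand-in for a stop-word list (NOT the package's real list).
stop_list <- c("a", "and", "the", "&")

# Split a string into lower-case chunks, treating "&" as its own chunk,
# then drop every chunk that is a stop word or a single character.
keep_words <- function(x, stops = stop_list) {
  spaced <- gsub("&", " & ", tolower(x), fixed = TRUE)
  chunks <- strsplit(spaced, "[[:space:]]+")[[1]]
  chunks <- chunks[nzchar(chunks)]
  chunks[!(chunks %in% stops) & nchar(chunks) > 1]
}

# Every chunk of "a&the" is filtered out, so nothing remains.
keep_words("a&the")

# Ordinary words survive; only the "&" chunk is dropped here.
keep_words("salt & pepper")
```

A check such as `length(keep_words(x)) > 0` on a character variable's values is one way to spot, before generating a report, entries that would be emptied by this kind of filtering.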

Now corrected.


a-trottier commented 2 months ago

Tested with a few stop words (some from the tidytext version) and it seems to work.