vusaverse / evalytics

Data pipeline for evalytics data.
MIT License

Add script to join tables on organisation #4

Open JornGitHub opened 3 months ago

JornGitHub commented 3 months ago

Linked to #3.

Add script to join tables on organisation.

Tomeriko96 commented 3 months ago

Hi @JornGitHub ,

Please rename the pull request to make its purpose clear.

Tomeriko96 commented 3 months ago

Hi @JornGitHub ,

A few comments:

  1. The joins produce column names with .x and .y suffixes. You can check these by running the function identify_join_suffixes from the vusa package:
> vusa::identify_join_suffixes(dfOrganisationJoined)
 [1] "name.x"         "code.x"         "externalId.x"   "id.y"           "name.y"         "description.x"  "externalId.y"  
 [8] "archived.x"     "id.y.y"         "name.x.x"       "code.y"         "externalId.x.x" "archived.y"     "id.y.y.y"      
[15] "name.y.y"       "description.y"  "externalId.y.y"

To prevent these suffixes, set the suffix argument explicitly in each join, as in the example below:

## Join the two data frames
dfQuestion <- dfQuestionScaleValue %>%
  left_join(dfQuestionScale, by = c("questionScaleHash" = "hash"), suffix = c(".value", ".scale"))
  2. You should consider changing the order of the joins.

Currently, the organisation table is used as the base. However, it looks more like a mapping table than a table of records. You should probably use one of the more granular tables as the base instead and join the organisation table onto it.
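
As a rough illustration of both points, here is a minimal sketch with made-up toy tables. The table dfEvaluation, all column names, and the values are hypothetical and not taken from the evalytics data. It starts from the granular table, sets explicit suffixes, and checks for leftover .x/.y names with plain base R (a stand-in for the check, not the vusa implementation):

```r
library(dplyr)

# Hypothetical mapping table: one row per organisation.
dfOrganisation <- data.frame(
  id   = c(1, 2),
  name = c("Faculty A", "Faculty B")
)

# Hypothetical granular table: one row per evaluation.
dfEvaluation <- data.frame(
  evaluationId   = c(10, 11, 12),
  name           = c("Course X", "Course Y", "Course Z"),
  organisationId = c(1, 1, 2)
)

# Use the granular table as base and attach the organisation attributes.
# The explicit suffix makes clear which table each "name" column came from.
dfEvaluationJoined <- dfEvaluation %>%
  left_join(
    dfOrganisation,
    by     = c("organisationId" = "id"),
    suffix = c(".evaluation", ".organisation")
  )

# Base R check for default join suffixes that slipped through.
leftoverSuffixes <- names(dfEvaluationJoined)[
  grepl("\\.(x|y)$", names(dfEvaluationJoined))
]
```

With the granular table as base, the result keeps one row per evaluation, and leftoverSuffixes should be empty as long as every join sets its own suffix.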