Closed signekb closed 5 months ago
In general, for creating data used within the package, the code to create it should be not included as part of the package (since it might include dependencies that aren't actually needed by the package for intended uses). That's why I moved the code over into data-raw/
(set up using the usethis::use_data_raw()
).
@lwjohnst86 Your comments seem to be mostly on the medication test data, which is actually not a part of this PR (test data for lab and health insurance). But I guess @Aastedet can take a look at the comments, since I don't really have an overview of what's happening in that part of the script either :) I will add @Aastedet as assignee for this PR (assignee = actively working on the PR and being responsible for getting it into a merge-ready state).
@signekb I didn't realize until later that it was added by Anders (?) earlier, since the PR showed it all coming from you :relaxed: though I did know that you hadn't added that code when I made the comments because of reading the #4 issue :relaxed:
Totally fine - I understand the confusion 👍
@Aastedet @lwjohnst86 Status on this? Anything I can do for this to become ready to merge?
@signekb We'll get to this when we start the focus period next week. More likely it will be me and you figuring things out and getting feedback from @Aastedet during meetings.
Are we waiting for me to add the fake diagnosis data (I will, I promise 😄 ) before merging or do I need to do more in terms of reviewing the PR?
@Aastedet there are still a lot of questions I have and the code needs a lot of work, that's why I haven't merged it in yet. I think it would be better to first do #32 before creating the example/test data, because it would make it easier to build those if we have the variable list well defined and set up first.
@lwjohnst86 Cool - I'll get #32 done ASAP.
This pr includes the following:
create_test_lab_df()
increate_test_data.R
lab_df
increate_test_data.R
using thecreate_test_lab_df()
.lab_df
has 100 rows (1 row per individual)create_test_health_insurance_df()
function increate_test_data.R
health_insurance_df
in increate_test_data.R
using thecreate_test_health_insurance_df()
.health_insurance_df
has 100 rows (1 row per individual)Can you check whether these dfs follow your descriptions in #4, @Aastedet? From the description, I'm not sure whether there should be multiple rows per individual? Currently, both functions sample from 001-100 with replacement, meaning that with 100 samples some ID's will appear multiple times and some ID's in the range will not appear. Is that what you imagined?
Do we maybe want to keep the test data functions and creation of the test data in separate scripts?