nhs-r-community / NHSRwaitinglist

R-package to implement the waiting list management approach described in this paper by Fong et al: https://www.medrxiv.org/content/10.1101/2022.08.23.22279117v1.full-text
https://nhs-r-community.github.io/NHSRwaitinglist/

Fixed synthetic data set #51

Closed Lextuga007 closed 2 months ago

Lextuga007 commented 5 months ago

At the catch-up on 12th April 2024 we discussed a fixed synthetic dataset, which would complement issue #23 and was also discussed in relation to the {NHSRdatasets} package.

Points to consider: should this be a standalone data set, if it's generalised enough to be used for other example analyses? If so, should it go into the existing {NHSRdatasets} package or into its own package? A data set that's very specific to the examples in this package would be better placed here, rather than somewhere else that would add a dependency between the packages.

A fixed data set doesn't necessarily have to replace the generating function; it could complement the flexibility that generating particular data on demand offers.

Any other views on this are very welcome!

jacgrout commented 5 months ago

Issue #23 added two functions, create_waiting_list() and create_bulk_synthetic_data(), and set up demo data using the data/ and data-raw/ folders. This creates a data frame called demo_df, a dummy data set from which a synthetic set can be generated as follows:

NHSRwaitinglist::create_bulk_synthetic_data(demo_df)

This generates a single synthetic data set of 5 waiting lists, using the contents of the built-in demo data frame demo_df as the input parameters for each waiting list.

This could be extended with another piece of code that creates a fixed synthetic dataset, maybe by calling set.seed() before generating from demo_df, to give a data frame of fixed synthetic patient-level data?
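As a rough sketch of the set.seed() idea, here is a base R example showing that fixing the seed makes a randomly generated data frame reproducible. simulate_referrals() is a hypothetical stand-in, not the package's create_bulk_synthetic_data(); its arguments and column names are made up for illustration. The sketch assumes the real generator also draws from R's random number generator, in which case the same pattern should apply to it.

```r
# Hypothetical stand-in for a synthetic-data generator. It is NOT the
# package's create_bulk_synthetic_data(); it only draws from R's RNG
# so that the effect of set.seed() can be shown.
simulate_referrals <- function(n,
                               start_date = as.Date("2024-01-01"),
                               mean_gap_days = 2) {
  # Exponentially distributed gaps (in days) between successive referrals
  gaps <- rexp(n, rate = 1 / mean_gap_days)
  data.frame(
    patient_id = seq_len(n),
    referral_date = start_date + floor(cumsum(gaps))
  )
}

# Same seed, same call: the two data frames should match exactly
set.seed(2024)
fixed_a <- simulate_referrals(100)

set.seed(2024)
fixed_b <- simulate_referrals(100)

identical(fixed_a, fixed_b)

# In a data-raw/ script the fixed result could then be stored with the
# package (assumed workflow, not confirmed in this thread):
# usethis::use_data(fixed_a, overwrite = TRUE)
```

Saving the generated data frame, rather than regenerating it from the seed each time, would also keep the data fixed if R's default random number generator changes between versions.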

I'm also wondering whether we could in some way use existing data in NHSRdatasets to generate input parameters for a selection of waiting lists and switch the code in demo-data to match?