Create and use de-identified research databases. Preprocess, extract text, anonymise/de-identify, link, apply natural language processing, query for research, manage consent for contact.
Patients now have more realistic names (generated with Faker)
NHS numbers use the test range
Everyone gets their own postcode and phone number
Notes include alcohol consumption in various forms to demonstrate NLP.
Notes are padded with random words from Faker, rather than /usr/share/dict/words, to reach the approximate target word count.
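For reviewers unfamiliar with the NHS number format, here is a minimal sketch of how a syntactically valid number in the test range could be generated. It assumes the test range means numbers beginning with 999 and uses the standard modulus-11 check digit. The function names are hypothetical and this is not necessarily how crate_make_demo_database does it.

```python
import random
from typing import Optional


def nhs_check_digit(first_nine: str) -> Optional[int]:
    """Modulus-11 check digit for the first nine digits.

    Returns None when the remainder yields 10, which means no valid
    NHS number exists for that nine-digit stem.
    """
    # Weights run from 10 (first digit) down to 2 (ninth digit).
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        return 0
    if check == 10:
        return None
    return check


def make_test_nhs_number(rng: random.Random) -> str:
    """Hypothetical helper: a valid-looking number with the 999 prefix
    (assumed here to be the range reserved for testing)."""
    while True:
        stem = "999" + "".join(str(rng.randint(0, 9)) for _ in range(6))
        check = nhs_check_digit(stem)
        if check is not None:  # retry stems that have no valid check digit
            return stem + str(check)


def is_valid_nhs_number(number: str) -> bool:
    """True if the string is ten digits with a correct check digit."""
    if len(number) != 10 or not number.isdigit():
        return False
    return nhs_check_digit(number[:9]) == int(number[9])
```

Passing a seeded random.Random keeps the generated demo data reproducible between runs.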
Ronald McDonald and Bob d'Souza have gone, but some aspects of their notes are distributed among other patients. From what I can see, there are already comprehensive tests covering the various date, postcode and phone number formats that these records were testing.
Also bumps cryptography to fix GHSA-5cpq-8wj7-hf2v.
The demo data changes listed above are to crate_make_demo_database.