Closed MattStammers closed 1 year ago
@MattStammers , thank you for reporting this.
Can you please add some more details to the first post to describe how one may replicate this problem? Based on the WhatsApp messages, I understand that there may be a problem with the ACSC mapping of the diag_01 variable in the admitted care dataset.
If this is correct, then this will affect most of the generated tables.
There may be a problem with how this feature is built. Would you mind having a look at the following sections of code to see what may be broken?
It is also definitely worth checking the Sheffield mapping spec google sheet; the link can be reconstructed from the second code section above. feature_maps.py directly reads this google sheet and uses the data for the mapping.
Happy to look at a PR if there is a bug at this end.
Right, @georgm8 has looked at the code and found the issue: our ICD codes had a full stop in them which was not being picked up. At present the code replaces the full stop with a space. Either we need consensus on the format of the ICD10 codes or a regex validator needs to be added. I think @georgm8 is happy to do this?
Happy to create a validator once we have consensus on how we want the ICD-10 codes to look
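As a starting point for that discussion, here is a minimal sketch of what such a validator could look like. The pattern is an assumption (undotted form: one letter, two digits, then up to two further alphanumeric characters), not an agreed format, and the names `UNDOTTED_ICD10` and `is_valid_icd10` are hypothetical and do not exist in the repo.

```python
import re

# Assumed format, pending consensus: undotted ICD-10 code made of
# one uppercase letter, two digits, then 0-2 further letters/digits.
UNDOTTED_ICD10 = re.compile(r"[A-Z][0-9]{2}[0-9A-Z]{0,2}")


def is_valid_icd10(code: str) -> bool:
    """Return True if the whole string matches the assumed undotted pattern."""
    return UNDOTTED_ICD10.fullmatch(code) is not None


# Codes containing a full stop or a space are rejected under this assumption,
# as are purely numeric identifiers such as SNOMED concept IDs.
for example in ["L031", "L03", "L03.1", "L03 1", "128045006"]:
    print(example, is_valid_icd10(example))
```

If the consensus ends up being the dotted form instead, only the pattern would need to change.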
The Sheffield spec uses a period (.) after the first three characters if there are more than 3 characters. At LTH, we don't have a . in the raw data and found that the simplest thing to do was to remove this from the Sheffield spec google sheet.
For _acuteadmits dataset,
My view is that we get rid of the . to keep things simple.
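A rough sketch of what dropping the . could look like if it were applied to both the spec codes and the raw data before matching. The helper name `normalise_icd10` is hypothetical and this is not the existing code in feature_maps.py; the uppercasing and whitespace trimming are extra assumptions on my part.

```python
def normalise_icd10(code: str) -> str:
    """Strip surrounding whitespace, uppercase, and remove any full stop."""
    return code.strip().upper().replace(".", "")


# Removing the full stop lets a dotted spec code match undotted raw data.
print(normalise_icd10("L03.1"))

# Replacing the full stop with a space instead leaves a code that does not match.
print("L03.1".replace(".", " ") == "L031")
```

Running the same helper over both sides of the mapping would mean it does not matter which format the google sheet or the raw data uses.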
~Just found that we are not doing this in the ECDS dataset.~
Ignore this, as the ECDS mapping is SNOMED to category and should not be affected. However, the ECDS mapping google doc also has ICD codes in it and is worth exploring further.
@MattStammers , feel free to close this issue if #35 by @georgm8 addresses this.
@quindavies , we will have to run the new changes in #35 against our admitted care dataset and make sure there are no surprises.
I have tried overwriting all the first diagnoses with codes like cellulitis '128045006', and the validator is not picking these up: everything is flagged as non-ACSC on the final step.