Open vjernelov opened 1 year ago
For reference, image captured from PU-tjänsten's TKB, version 4.0
A suggestion for a regex for this is:
^(?:19|[2-9]\d)\d{2}(?:0[1-9]|1[012])(?:0[1-9]|[1-2]\d|3[0-1])\d{4}$
Maybe unlikely, but if e.g. health data from a cohort of patients where some are born before 1900 would be put on FHIR, this regexp would stop that. FHIR is one of the standards recognised by e.g. TEHDAS as a means to share data for secondary purposes...
A thought. Would it be better to not use regex, but instead pass the OID that references the formatting? https://confluence.cgiostersund.se/display/PU/Identitetsformat
Then this would also support local replacement identification formats: https://confluence.cgiostersund.se/display/PU/Lokala+Reservidn
The application receiving/sending the information then knows how to parse the value (setting it and reading it).
Maybe unlikely, but if e.g. health data from a cohort of patients where some are born before 1900 would be put on FHIR, this regexp would stop that. FHIR is one of the standards recognised by e.g. TEHDAS as a means to share data for secondary purposes...
This is a valid point. According to https://www4.skatteverket.se/rattsligvagledning/edition/2023.1/330242.html we should also support individuals born in the 1800s. We should update the regex to reflect that if we agree this is a good idea.
A thought. Would it be better to not use regex, but instead pass the OID that references the formatting? https://confluence.cgiostersund.se/display/PU/Identitetsformat
Then this would also support local replacement identification formats: https://confluence.cgiostersund.se/display/PU/Lokala+Reservidn
The application receiving/sending the information then knows how to parse the value (setting it and reading it).
My suggestion actually aligns well with this in that each SLICE would be getting its own regex. The advantage of describing the regex pattern in the actual profile is that validation of the resource automatically can be done using the HAPI library, meaning less work needs to be done by the implementors (if I understand the consequences of your suggestion correctly @johlju ).
The idea is that each type of patient identifier (personnummer, samordningsnummer, nationellt reservnummer) each would get their own regex pattern.
Supporting people born in the 1800s seems like a good idea as long as we have historical data to deal with. Both the earliest EHRs and national registers contain people born in the 1800s. The oldest living person with a verified age is however born as late as 1907-03-04.
^(18|19|[2-9]\d)\d{2}(0[1-9]|1[012])([0-2]\d|3[0-1])\w{4} would satisfy the requirements, no? Ping @RikardLovstrom, @danka74 and @johlju
Why the \w{4} at the end? Is this supposed to match something other than personnummer as well?
You're right, it should be \d{4} for personnummer.
^(18|19|[2-9]\d)\d{2}(0[1-9]|1[012])([0-2]\d|3[0-1])\w{4} would satisfy the requirements, no? Ping @RikardLovstrom, @danka74 and @johlju
Maybe make it non-capturing groups: ^(?:18|19|[2-9]\d)\d{2}(?:0[1-9]|1[012])(?:[0-2]\d|3[0-1])\d{4}
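To make the differences between the proposed patterns concrete, here is a small Python sketch that is not part of the profile. The sample identifiers are made up, the names `ORIGINAL`, `EXTENDED`, `DRAFT_W` and `accepts` are hypothetical, and `re.fullmatch` is used so the comparison does not depend on the trailing `$` that the later variants omit.

```python
import re

# First suggestion in the thread: birth years 1900-9999 only.
ORIGINAL = r"^(?:19|[2-9]\d)\d{2}(?:0[1-9]|1[012])(?:0[1-9]|[1-2]\d|3[0-1])\d{4}$"
# Non-capturing variant that also allows 18xx (no trailing $ in the thread).
EXTENDED = r"^(?:18|19|[2-9]\d)\d{2}(?:0[1-9]|1[012])(?:[0-2]\d|3[0-1])\d{4}"
# Earlier draft that ended in \w{4} instead of \d{4}.
DRAFT_W = r"^(18|19|[2-9]\d)\d{2}(0[1-9]|1[012])([0-2]\d|3[0-1])\w{4}"


def accepts(pattern: str, value: str) -> bool:
    """Return True if the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None


# Made-up sample values: born 1899, born 1912, letters in the suffix, day 00.
samples = ["189912319876", "191212121212", "19121212ABCD", "191212001212"]

for value in samples:
    print(
        value,
        "original:", accepts(ORIGINAL, value),
        "extended:", accepts(EXTENDED, value),
        "draft \\w:", accepts(DRAFT_W, value),
    )
```

Run as written, the 1899 value is rejected by the original pattern and accepted by the extended one, and only the `\w{4}` draft accepts letters in the last four positions. The last sample shows a side effect worth checking before adopting the variant: `[0-2]\d` admits day `00`, which the original day group `0[1-9]|[1-2]\d|3[0-1]` rejected.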
The idea is that each type of patient identifier (personnummer, samordningsnummer, nationellt reservnummer) each would get their own regex pattern.
So if the identification number is a local replacement identity, take for example the format yyyymmdd-XYZW
that identifies a patient (see Skåne here https://confluence.cgiostersund.se/display/PU/Lokala+Reservidn). Would it be possible for the implementor to create a new slice at runtime using the above format? Or does the profile need to be changed and re-published with a new slice for it to work? I'm not familiar with creating profiles, so read my question as such.
@johlju So let's try to break down the answer to that question a bit:
If Region Skåne has a requirement to validate their local temporary identifiers, they should create their own profile, extending the base profile and adding that regex verification as part of their profile, using the same pattern we've used here.
Since in theory all regions should support all of the local temporary identifiers (when moving patients between regions), all systems that handle that patient would need to extend the profile to support different/all local temporary identifiers that Inera Personuppgiftstjänsten handles. This sounds like a potential issue when different vendors might do it differently.
In this case, wouldn't it be better to have a slice where we put the identification number, and another field where we put the OID that says how to format or evaluate the identification number? The downside is of course that each system needs to incorporate the regex for each identification number it should support, and also handle OIDs that are not supported. But at the same time, the system would otherwise need to extend the profile and handle all the logic for that anyway (as mentioned above). By adding the OID to the profile, could the profile be used by everyone out of the box?
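A minimal Python sketch of what the OID-based alternative would ask of a receiving system: it keeps its own table from OID to pattern and has to decide what to do with an OID it does not know. The two OIDs, the samordningsnummer pattern (day of birth plus 60), and the names `PATTERNS_BY_OID`, `validate_identifier` and `UnsupportedIdentifierType` are assumptions for illustration only and should be checked against PU-tjänsten's published list.

```python
import re


class UnsupportedIdentifierType(Exception):
    """Raised when the receiver has no pattern registered for an OID."""


# Assumed OIDs and patterns; verify against PU-tjänsten before relying on them.
PATTERNS_BY_OID = {
    # personnummer (assumed OID)
    "1.2.752.129.2.1.3.1": re.compile(
        r"(?:18|19|[2-9]\d)\d{2}(?:0[1-9]|1[012])(?:0[1-9]|[12]\d|3[01])\d{4}"
    ),
    # samordningsnummer (assumed OID), day of birth + 60, i.e. 61-91
    "1.2.752.129.2.1.3.3": re.compile(
        r"(?:18|19|[2-9]\d)\d{2}(?:0[1-9]|1[012])(?:6[1-9]|[78]\d|9[01])\d{4}"
    ),
}


def validate_identifier(oid: str, value: str) -> bool:
    """Check a value against the pattern registered for its OID."""
    pattern = PATTERNS_BY_OID.get(oid)
    if pattern is None:
        # The receiver must choose a policy here: reject, accept unchecked, or log.
        raise UnsupportedIdentifierType(oid)
    return pattern.fullmatch(value) is not None


print(validate_identifier("1.2.752.129.2.1.3.1", "191212121212"))
print(validate_identifier("1.2.752.129.2.1.3.3", "197010632391"))
```

The trade-off discussed above shows up in the `None` branch: a single slice can carry any identifier type, but every system then has to maintain this table and its own policy for unknown OIDs, instead of getting validation from the published profile.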
Hmm, I wasn't aware that PU-tjänsten had explicit descriptions of the local identifiers they support, including regexes!:
https://confluence.cgiostersund.se/display/PU/Lokala+Reservidn
This opens up the possibility to add even more identifier types to the base Patient profile, thus enabling profiles downstream to limit the amount of "own" work. I think that sounds like a good way forward, what do you say @johlju ? Would that address your concerns?
Well, it would allow everyone to use the existing identifiers, but to use any additional one that is added, the base profile must be re-published. With my suggestion of passing an OID instead of a regex, we only need one slice (?), which would support all types of identifiers, present and future.
I'm not really seeing why having the regexes in the base profile is that beneficial.
I think we have different understandings of a couple of things related to this.
On point 4 - there is value in doing that. For an example have a look at Australian Base profiles which have many profiles on the Identifier datatype for various identifiers used in the country:
Thanks to this list, it is easy to 'mix and match' these off the shelf definitions in your own profiles. The Australian Core Patient profile does this, for example:
I took the liberty of showcasing how this can be done in the branch attached to this issue. Please have a look when you have the time to see if you like the approach.
@larbo4 check with PU-tjänsten that the OID list, the identity types, and the regexes described at https://confluence.cgiostersund.se/display/PU/Lokala+Reservidn are the latest.
https://github.com/HL7Sweden/basprofiler-r4/blob/51-restrict-the-personnummer-slice-of-patientidentifiervalue-to-only-allow-digits-add-regex-for-all-known-identifier-types-used-nationally/input/fsh/patient.fsh leads you to the proposed Patient profile. Lines 90-102 contain OIDs and URIs for the identity types.
The various regexes in use are compiled at https://github.com/HL7Sweden/basprofiler-r4/blob/51-restrict-the-personnummer-slice-of-patientidentifiervalue-to-only-allow-digits-add-regex-for-all-known-identifier-types-used-nationally/input/fsh/invariants/PatientIdentifierConformancePatternSE.fsh. If PU-tjänsten has regexes described for the identity types it supports, you can check these against the ones listed and compile a "gap report" if there are differences.
Niclas - check whether regex declarations via invariants are supported in different technical implementations of HL7 FHIR.
There is an updated list of local reserve IDs, which you can now find here: Lokala Reservid - Öppen info: Personuppgiftstjänsten - Confluence (atlassian.net)
There are 5 reserve IDs in the list that were not included in the previous version: Region Västerbotten, Region Halland, Region Gävleborg, Region Dalarna, and one more for VGR.
The regexes described on the Confluence page for the others match those found here: https://github.com/HL7Sweden/basprofiler-r4/blob/51-restrict-the-personnummer-slice-of-patientidentifiervalue-to-only-allow-digits-add-regex-for-all-known-identifier-types-used-nationally/input/fsh/invariants/PatientIdentifierConformancePatternSE.fsh That is, there are no differences beyond the five added reserve IDs listed above.
There appear to be challenges around the generalisability of regexes when we write them in FSH. Niclas got a tip from Vadim about Regex101, which can be used so that each language/implementation can produce its own representation of the regex. We need to understand in more depth what the consequences of this are. If we find that the regex generated by SUSHI from FSH cannot be interpreted by all relevant implementation technologies (Java, .NET, C#, Rust, PHP, etc.), we will have to adopt a "softer" approach here.
After looking further into this, we have established that the type of expressions we use to ensure the format of the identities should not cause any problems from an implementation perspective.
@vjernelov I have read the whole conversation but am not sure I follow what you landed on. Have you added a regex for personnummer as proposed further up, or not? If so, which one, since several alternatives were listed? Regexes were also brought up for samordningsnummer and nationellt reservnummer; have those been added?
PatientSEVendorLite has a regex for personnummer and samordningsnummer. Perhaps you concluded that the same one should be used in the base profile (see https://commonprofiles.care/fhir/1.0.1/StructureDefinition-PatientSEVendorLite.html)? And is the regex for the national reserve ID specified at https://inera.atlassian.net/wiki/spaces/PU/pages/3353216812/Nationellt+Reservid the one intended to be used?
All the actions we took under the Patient profile update project can be found if you open the Patient project on the Project page, and then press the "Complete" status (yeah I know, a bit weird). That will give you a more detailed description of the reasoning and outcome of all issues.
Specifically regarding the patient ID types, we have the following:
Thank you @vjernelov. I have now found your decision log and can see how you set the invariants in https://github.com/HL7Sweden/basprofiler-r4/blob/51-restrict-the-personnummer-slice-of-patientidentifiervalue-to-only-allow-digits-add-regex-for-all-known-identifier-types-used-nationally/input/fsh/invariants/PatientIdentifierConformancePatternSE.fsh
Currently the personnummer slice for Patient.identifier.value is of type string, which allows arbitrary formats (for example 19121212-1212, 191212121212, 121212-1212 or 1212121212). This could cause problems for applications and FHIR servers alike.
Given that the PU service expects a strict YYYYMMDDXXXX format, I suggest we enforce the same rule/restriction for the personnummer slice as well in the base profile.
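As a rough illustration of the restriction, the Python sketch below checks the four example values against the first regex suggested in this thread. It is only a sketch: `STRICT` and `is_strict` are hypothetical names, and a real deployment would rely on the validator applying the profile rather than on application code like this.

```python
import re

# First regex suggested in this thread for the strict YYYYMMDDXXXX format.
STRICT = re.compile(
    r"(?:19|[2-9]\d)\d{2}(?:0[1-9]|1[012])(?:0[1-9]|[1-2]\d|3[0-1])\d{4}"
)

# The four example values from the issue description.
EXAMPLES = ["19121212-1212", "191212121212", "121212-1212", "1212121212"]


def is_strict(value: str) -> bool:
    """Return True only if the whole value is 12 digits in YYYYMMDDXXXX form."""
    return STRICT.fullmatch(value) is not None


accepted = [value for value in EXAMPLES if is_strict(value)]
print(accepted)
```

Of the four examples, only 191212121212 passes; the hyphenated and 10-digit forms are rejected, which is the behaviour the restriction is meant to enforce.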