medizininformatik-initiative / kerndatensatzmodul-diagnose

Kerndatensatzmodul Diagnose
add AlphaID 2024 #43

Open jpwiedekopf opened 6 months ago

jpwiedekopf commented 6 months ago

This commit adds Alpha-ID-SE 2024, converted from the sources available from the BfArM using a work-in-progress version of my BabelFSH tool.

Please review mainly w.r.t. the metadata and properties available for the concepts.

I'll attach the BabelFSH file here, and comment on the inner workings of BabelFSH below.

alphaid-2024.babelfsh.fsh.gz

jpwiedekopf commented 6 months ago

The main idea of BabelFSH is that defining CodeSystems from external sources often comes down to doing the same thing over and over again: defining the metadata in some way, sometimes hard-coded, sometimes pretty. Then, the conversion process basically takes in data in some format and spits out a list of concepts, sometimes with properties, sometimes without.

I've taken the FSH grammar available in the SUSHI source code and adapted it slightly to recognize a specific comment as a meaningful token. By then parsing and evaluating FSH using this grammar and ANTLR4, we can take metadata defined in valid FSH, and then delegate concept generation to a downstream plugin.
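To make the idea a bit more concrete, here is a rough Python sketch. It is not how BabelFSH works internally (that uses the adapted ANTLR4 grammar); it only shows the general shape of "find a special comment in otherwise valid FSH and hand its contents to a plugin". The `// @plugin:` marker, the option names, and `find_plugin_call` are all made up for this illustration.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical marker syntax, invented for this sketch. The comment token
# that BabelFSH actually recognizes may look entirely different.
PLUGIN_COMMENT = re.compile(
    r"^[ \t]*//[ \t]*@plugin:[ \t]*(?P<name>\S+)[ \t]*(?P<args>.*)$",
    re.MULTILINE,
)


class PluginCall(NamedTuple):
    name: str  # which plugin should generate the concepts
    args: str  # raw argument string, passed on to the plugin unparsed


def find_plugin_call(fsh_text: str) -> Optional[PluginCall]:
    """Return the first plugin directive found in the FSH text, or None."""
    match = PLUGIN_COMMENT.search(fsh_text)
    if match is None:
        return None
    return PluginCall(match.group("name"), match.group("args").strip())


# The metadata lines are ordinary FSH; the last line is the special comment.
EXAMPLE_FSH = """\
CodeSystem: ExampleCS
Id: example-cs
Title: "Example code system"
// @plugin: csv --path concepts.csv --code-column code
"""

call = find_plugin_call(EXAMPLE_FSH)
```

Because the directive lives in a comment, the file stays valid FSH for any other tool; only a parser that knows about the marker treats it as meaningful.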

These plugins (well, I've only written a "CSV" plugin so far, but that's already quite powerful) are invoked after the metadata is generated from the FSH code. Each plugin takes arguments which look like a POSIX command line (so I can use a library for command line parsing to make things easier 😎), and has a very simple API: generate lists of concepts in either R4B or R5. At least for the CSV plugin, the arguments are really generic, so we don't need to hardcode any mappings. I'm also planning to be able to generate ValueSets based on some properties in the master file, but I'm focusing on CS for now.
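As a rough illustration of what such a generic CSV plugin could look like, here is a minimal Python sketch. It is not BabelFSH's actual code or API: the option names (`--code-column`, `--display-column`, `--property`, `--delimiter`) and both function names are assumptions, and the sample rows are invented. The output dictionaries follow the general shape of FHIR `CodeSystem.concept` entries (`code`, `display`, `property`).

```python
import argparse
import csv
import io
import shlex
from typing import Any, Dict, List


def parse_plugin_args(arg_string: str) -> argparse.Namespace:
    """Parse a POSIX-style argument string. All option names are hypothetical."""
    parser = argparse.ArgumentParser(prog="csv-plugin")
    parser.add_argument("--code-column", required=True)
    parser.add_argument("--display-column", required=True)
    # Repeatable: each named column becomes a string-valued concept property.
    parser.add_argument("--property", action="append", default=[])
    parser.add_argument("--delimiter", default=",")
    # shlex.split handles quoting the way a POSIX shell would.
    return parser.parse_args(shlex.split(arg_string))


def generate_concepts(csv_text: str, args: argparse.Namespace) -> List[Dict[str, Any]]:
    """Turn CSV rows into concept dictionaries, driven only by the arguments."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=args.delimiter)
    concepts = []
    for row in reader:
        concept: Dict[str, Any] = {
            "code": row[args.code_column],
            "display": row[args.display_column],
        }
        # Empty cells are skipped, so no empty properties are emitted.
        props = [
            {"code": column, "valueString": row[column]}
            for column in args.property
            if row[column]
        ]
        if props:
            concept["property"] = props
        concepts.append(concept)
    return concepts


# Invented sample data, not taken from any real terminology.
SAMPLE = "id|label|icd10\nA1|First example|X00\nA2|Second example|\n"
args = parse_plugin_args(
    "--code-column id --display-column label --property icd10 --delimiter '|'"
)
concepts = generate_concepts(SAMPLE, args)
```

The point of the sketch is that the column-to-element mapping lives entirely in the argument string, so a different source file needs different arguments rather than different code.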

But of course, having really specific plugins is absolutely possible! For example, I'm planning to use BabelFSH to generate CodeSystems for the EDQM Standard Terms using that API. That's going to be really specific source code, but that really isn't a problem with this approach anymore. The same goes for stuff I've written in the past (e.g. OncoTree) or something like CLaML2FHIR, which could also be integrated (indeed, I'll do just that, since the license for CLaML2FHIR allows that possibility). Next on the roadmap will be ORPHAcodes because the KDS Diagnose urgently requires that, but the absolute earliest I can deliver that is during the week of March 4 (KW10).