Whilst going through our test data to anonymise it, particularly by culling out parts of the data that we don't actually use during imports, I've noticed a couple of things that the importer could do better, given the data that's available. This isn't important for the code (what we have works) but would be if we were to document that source:
We have code to find the 'most recent' address for people, but the data provides this directly in nyesteBeliggenhedsadresse inside deltagerpersonMetadata.
We delve deep into the interests of every relationship listed to find the beneficial ownership relationships (among the directorships and other types of relationship the data contains), but these relationships are actually flagged at a higher level, in virksomhedSummariskRelation/organisationer/organisationsNavn/navn, by the value "Reelle ejere". This would potentially simplify the code, because we could find this relationship first and then process all of its interests (medlemsData) as two discrete steps, rather than the slightly confusing loop-and-break structure we have now.
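To make the two points above concrete, here's a rough sketch in Python of what the simplified lookups might look like. It is not the importer's actual code. The function names are made up, the sample record contains invented placeholder values, and the nesting is an assumption (in particular that organisationsNavn is a list of objects with a navn key, and that medlemsData sits directly under each entry in organisationer), so it would need checking against real records.

```python
# Hypothetical sketch only: the record layout is assumed, not confirmed.
REAL_OWNERS_LABEL = "Reelle ejere"


def latest_address(person):
    """Read the most recent address straight from the person metadata."""
    return person.get("deltagerpersonMetadata", {}).get("nyesteBeliggenhedsadresse")


def real_owner_organisations(person):
    """Step 1: find the organisations flagged with the 'Reelle ejere' name."""
    found = []
    for relation in person.get("virksomhedSummariskRelation", []):
        for organisation in relation.get("organisationer", []):
            names = organisation.get("organisationsNavn", [])
            if any(name.get("navn") == REAL_OWNERS_LABEL for name in names):
                found.append(organisation)
    return found


def real_owner_interests(person):
    """Step 2: collect every interest (medlemsData) from those organisations."""
    return [
        interest
        for organisation in real_owner_organisations(person)
        for interest in organisation.get("medlemsData", [])
    ]


# Invented sample record, only to show the assumed shape.
sample_person = {
    "deltagerpersonMetadata": {
        "nyesteBeliggenhedsadresse": {"vejnavn": "Example Street"},
    },
    "virksomhedSummariskRelation": [
        {
            "organisationer": [
                {
                    "organisationsNavn": [{"navn": "Some other role"}],
                    "medlemsData": [{"id": "ignored"}],
                },
                {
                    "organisationsNavn": [{"navn": "Reelle ejere"}],
                    "medlemsData": [{"id": "interest-1"}, {"id": "interest-2"}],
                },
            ]
        }
    ],
}
```

With this shape, finding the relationship and processing its interests are separate functions, so there's no need to break out of a loop part-way through.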
It's probably also worth noting in any docs that we're ignoring lots of historical data about companies and people (old names, old addresses) and there are a lot of other fields which we don't really understand/use at the moment.