ONSdigital / sdg-SDMX-data-qualifier

MIT License

Get all unique column names #16

Closed jwestw closed 3 years ago

jwestw commented 3 years ago

Time estimate --> 1.5 hours

jwestw commented 3 years ago

@LucyGwilliamAdmin is there any chance that 2 column titles could be the same but mean different things in their respective datasets? I am struggling to think of a good example, but something like "Region" could mean "Region of UK" in one dataset and "Region of the world" in another. This example does not apply to our data well, but it illustrates what could be a problem.

LucyGwilliamAdmin commented 3 years ago

@jwestw apologies, I'm not getting notifications for this repo. Not sure why, because I'm watching it and you tagged me...

That's a good question. I'm not too sure and I can't think of a possible scenario. This might be something worth checking with the data team.

jwestw commented 3 years ago

Spoke with Emma and got some insights into unusual cases in our data.

14.4.1 has column names that ideally would be the same ("sustainability level") but refer to different series.

In 3.2.2, "Age" means "age of mother", i.e. possibly not the age of the baby/subject.

We might have the same concept under slightly different names, e.g. "Industry" and "Sector".

Some disaggregation values are sometimes the same but mean different things, e.g. "UK" could be the headline figure or could mean country of birth. So now, "UK" is "UK born".
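
Given the cases above, it may help to record which dataset each column name comes from while collecting the unique names, so that names shared across datasets can be reviewed by hand. A minimal sketch, assuming the column headers have already been read into a dict keyed by dataset; the function names, dataset IDs and column lists here are made up for illustration and are not taken from the repo or the real data:

```python
from collections import defaultdict


def collect_column_names(datasets):
    """Map each column name to the sorted list of dataset IDs that use it.

    `datasets` is a hypothetical dict of {dataset_id: [column names]}.
    """
    sources = defaultdict(set)
    for dataset_id, columns in datasets.items():
        for name in columns:
            # Strip stray whitespace so "Region " and "Region" are not counted twice
            sources[name.strip()].add(dataset_id)
    return {name: sorted(ids) for name, ids in sources.items()}


def shared_names(sources):
    """Keep only the names that appear in more than one dataset."""
    return {name: ids for name, ids in sources.items() if len(ids) > 1}


# Illustrative headers only, not the real data
example = {
    "dataset_a": ["Year", "Region", "Value"],
    "dataset_b": ["Year", "Region ", "Industry"],
    "dataset_c": ["Year", "Sector"],
}

sources = collect_column_names(example)
unique_names = sorted(sources)      # all unique column names
to_review = shared_names(sources)   # same name in several datasets: check the meaning
```

This only flags identical names used in several datasets (the "Region" kind of problem). It cannot detect one concept under different names, such as "Industry" and "Sector"; that would still need a manual mapping agreed with the data team.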