Open poulson opened 3 years ago
I believe the issue is that the UK assigns the same `id` to all awards, which the unflatten routine ends up merging into a single row. I believe the command outputs a warning about this, but I could be wrong.
I think if you delete the `id` values, then the command will work as expected.
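To illustrate the merging behaviour described above, here is a toy sketch in Python. It is not flatten-tool's actual code: `merge_by_id` is a hypothetical helper and the rows are made-up placeholders. It only shows how merging entries on a shared `id` can leave a single entry holding the last values seen.

```python
def merge_by_id(rows):
    """Toy model: entries sharing an "id" are merged, later values win.

    Hypothetical helper for illustration only, not flatten-tool's implementation.
    """
    merged = {}   # id -> merged entry
    result = []   # entries in first-seen order
    for row in rows:
        key = row.get("id")
        if key is None:
            # No id: nothing to merge on, so the entry is kept separately.
            result.append(dict(row))
        elif key in merged:
            # Same id seen before: overwrite the earlier values.
            merged[key].update(row)
        else:
            merged[key] = dict(row)
            result.append(merged[key])
    return result


# Placeholder data: two suppliers that share one id collapse into one entry.
with_ids = [{"id": "1", "name": "Supplier A"}, {"id": "1", "name": "Supplier B"}]
without_ids = [{"name": "Supplier A"}, {"name": "Supplier B"}]
print(len(merge_by_id(with_ids)), len(merge_by_id(without_ids)))
```

With the shared `id` removed, both placeholder entries survive, which matches the suggestion to delete the `id` values.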
When run from the command line I don't see any warnings, but I will indeed look into preprocessing out the `id` values and appreciate the tip.
FWIW, I have confirmed that dropping the `suppliers/id` columns before unflattening fixes the problem. I agree that one would hope a warning would have been printed about this and appreciate the help.
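For anyone hitting the same thing, a minimal sketch of that preprocessing step using only Python's standard library. The exact header names in the export are an assumption here: the pattern below guesses at paths such as `awards/0/suppliers/0/id` or `suppliers/id` and may need adjusting to the real file. `drop_supplier_ids` is a hypothetical helper, not part of flatten-tool.

```python
import csv
import io
import re

# Assumption: supplier id columns have headers ending in "suppliers/id"
# or "suppliers/<index>/id", e.g. "awards/0/suppliers/0/id".
SUPPLIER_ID = re.compile(r"(^|/)suppliers/(\d+/)?id$")


def drop_supplier_ids(src, dst):
    """Copy CSV from src to dst, omitting supplier id columns."""
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader)
    # Indices of the columns to keep.
    keep = [i for i, name in enumerate(header) if not SUPPLIER_ID.search(name)]
    writer.writerow([header[i] for i in keep])
    for row in reader:
        # Guard against short rows while preserving column order.
        writer.writerow([row[i] for i in keep if i < len(row)])


# Small in-memory demonstration with placeholder data.
src = io.StringIO("ocid,awards/0/suppliers/0/id,awards/0/suppliers/0/name\nx,1,Acme\n")
dst = io.StringIO()
drop_supplier_ids(src, dst)
print(dst.getvalue())
```

For a file on disk, open both files with `newline=""` as the `csv` module documentation recommends, then unflatten the filtered copy.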
2021.04.20.csv
Despite the clear existence of 31 separate suppliers (including Palantir Technologies) in this single-row input CSV (extracted from yesterday's export of UK government procurement contracts), unflatten is only preserving the last supplier (Workday).
I have been unflattening using a command of the form
The relevant -- and incomplete -- portion of the output is:
While I understand that a schema would be of use, I don't understand after reading https://flatten-tool.readthedocs.io/en/latest/unflatten/ why most of the supplier columns are being entirely ignored. I am therefore posting here because this looks like a bug in unflatten.