18F / data-act-pilot

This small DATA Act pilot contains code that translates agency data to a uniform DATA Act format.

What's the purpose of having four input files? #176

Open bsweger opened 8 years ago

bsweger commented 8 years ago

Logging this great architecture question that emerged from an agency conversation last week:

What's the purpose of having four input files that will be ingested and validated/standardized centrally? In our case, we'd prefer to create a single file and do the validations and final conversion before submitting the data. We want to be as close to the source data as possible when running validations/conversions because that shortens the loop for making fixes.

For example, the person who sends the data to Treasury is likely not the person who can fix validation errors. So if we don't know about problems until after the data is submitted, the errors have to get reported back to the submitter, who in turn will have to report them to the people who will fix them. This means a longer lag time and the possibility of problems getting lost in translation.

bsweger commented 8 years ago

Adding my own thoughts to this great question...

One way to strike a balance between agencies that want to run validations/conversions "closer to the metal" and agencies that may not have the resources to do this would be to expose the broker's functionality via API/web service.

This would allow agencies to invoke data validations and/or conversions as needed and get the results back. Other agencies could submit via the website if that works better for them. Either way, everyone is subject to the same set of validations and conversions.
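
To make the idea concrete, here is a rough Python sketch of one set of validation rules shared by two entry points: one that an API/web service handler might call, and one that a website upload might call. Everything named here is an assumption for illustration: the field names (`award_id`, `obligation_amount`), the rules, and the function names are hypothetical and are not the broker's actual schema or code.

```python
import csv
import io
import json

# Hypothetical rules: these field names and checks are illustrative only,
# not the actual DATA Act schema or the broker's validations.
REQUIRED_FIELDS = ("award_id", "obligation_amount")


def _as_text(value):
    """Normalize a cell value (possibly None or a number) to stripped text."""
    return "" if value is None else str(value).strip()


def validate_rows(rows):
    """Apply the shared rules to an iterable of row dicts; return error dicts."""
    errors = []
    for row_number, row in enumerate(rows, start=1):
        for field in REQUIRED_FIELDS:
            if not _as_text(row.get(field)):
                errors.append(
                    {"row": row_number, "field": field, "error": "missing value"}
                )
        amount = _as_text(row.get("obligation_amount"))
        if amount:
            try:
                float(amount)
            except ValueError:
                errors.append(
                    {
                        "row": row_number,
                        "field": "obligation_amount",
                        "error": "not a number",
                    }
                )
    return errors


def validate_via_api(json_body):
    """Entry point an API handler might call with a JSON request body."""
    rows = json.loads(json_body)["rows"]
    return json.dumps({"errors": validate_rows(rows)})


def validate_via_upload(csv_text):
    """Entry point a website file upload might call with CSV contents."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return validate_rows(rows)
```

Because both entry points call the same `validate_rows`, an agency running checks close to its source data and an agency uploading through the website would see the same errors, reported by row number so they can be routed to whoever can fix them.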

Bonus: what about a write API that would handle automated data submission (which could also address user story #151)?