Open docmarionum1 opened 5 years ago
Hi @docmarionum1 @ishiland, I have the test data (for each function and each release, including the UPAD update) used by the Geographic Research Unit (GRU) at DCP, and it can be made public. I would love to discuss how to build test cases. Thanks!
Is the data public now or are you able to share it? It might help to get an idea of how to proceed by taking a look at how the data is structured, etc. I'm also curious about where the data will live and how future releases will be handled in an automated approach. Thanks for your help.
geosupport QA Data.zip
Sorry for getting back so late. Here are the files used for the QA/QC. I think GRU generates new test data every release cycle, so we can discuss how to create a data pipeline for this.
So it looks like we will have to write some tests first, then create a data pipeline as part of a separate issue. I'm open to ideas on how to approach this; the data is a good start.
@SPTKL, is there any metadata or description of the fields in the test data? I'm looking at the Excel data in the root directory, and there don't appear to be any field names.
There isn't, but my guess is that the fields are: geosupport version, borough code, house number, street name, GRC, and some other comments.
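If that guess is right, the headerless rows could be labelled before any tests are written against them. The following is only a minimal sketch: it assumes the Excel sheets have been exported to CSV, the field names in `GUESSED_FIELDS` are the unconfirmed guesses from this comment, and `label_rows` is a hypothetical helper, not part of python-geosupport.

```python
import csv

# Guessed column order for the headerless QA rows (not confirmed by GRU).
GUESSED_FIELDS = [
    "geosupport_version",
    "borough_code",
    "house_number",
    "street_name",
    "grc",
    "comments",
]


def label_rows(lines):
    """Yield one dict per headerless CSV row, keyed by the guessed field names."""
    n_fixed = len(GUESSED_FIELDS) - 1
    for row in csv.reader(lines):
        # Pad short rows so every guessed field is present.
        padded = row + [""] * (len(GUESSED_FIELDS) - len(row))
        # Fold any extra trailing columns into the free-form comments field.
        comments = " ".join(cell for cell in padded[n_fixed:] if cell)
        yield dict(zip(GUESSED_FIELDS, padded[:n_fixed] + [comments]))
```

Anything beyond the fifth column is folded into `comments`, since the trailing columns appear to be free-form notes.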
As came up in #11, if possible we want to make sure that changes to the underlying data don't break our tests. But we also need to make sure that the tests can catch changes to the data format.
The issue is: we want to make sure that the data coming out is being parsed correctly, which means we need to care about specific values. For instance, if the work area were expanded and the location of a field were changed, we would want the tests to fail, which necessarily means testing the values coming out and making sure they're being mapped to the correct place.
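To make that concrete, here is a rough sketch of a value-based test that fails when a field moves. Everything in it is hypothetical: the `LAYOUT` offsets are invented for illustration and are not Geosupport's real work area layout, and `parse_work_area` stands in for whatever parsing the library actually does.

```python
# Hypothetical layout: field name -> (start, end) offsets in the work area.
# These offsets are invented for illustration only.
LAYOUT = {
    "borough_code": (0, 1),
    "house_number": (1, 17),
    "street_name": (17, 49),
}


def parse_work_area(work_area, layout=LAYOUT):
    """Slice a fixed-width work area string into named, stripped fields."""
    return {
        name: work_area[start:end].strip()
        for name, (start, end) in layout.items()
    }


def test_fields_map_to_expected_positions():
    # Work area built to match LAYOUT exactly.
    work_area = "1" + "120".ljust(16) + "BROADWAY".ljust(32)
    assert parse_work_area(work_area) == {
        "borough_code": "1",
        "house_number": "120",
        "street_name": "BROADWAY",
    }


def test_shifted_layout_is_detected():
    # Simulate a release that inserts a new 2-character field before the
    # street name without LAYOUT being updated.
    work_area = "1" + "120".ljust(16) + "XX" + "BROADWAY".ljust(32)
    parsed = parse_work_area(work_area)
    # The stale layout now reads the wrong bytes, so a value check fails.
    assert parsed["street_name"] != "BROADWAY"
```

Note that a shift made up only of blank padding would be hidden by `strip()`, which is one more reason to test against records whose values are known and stable.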
I wonder if the geosupport developers have any dummy data in there that they use to test it and ensure forward compatibility. If there's data we know won't change, then we could use that in the test cases.