Open awolfe76 opened 9 years ago
I was going to take a stab at this one for this sprint, but I think its worth a bit of discussion first as there are several variables at play here. Below are samples of /geocode
requests with with successful and failed address parsings. Thoughts on options for improving this coming soon...
GET /geocode/200 President St Arkansas City AR 71630
{
"status": "OK",
"query": {
"input": "200 President St Arkansas City AR 71630",
"parts": {
"city": "Arkansas City",
"zip": "71630",
"state": "AR",
"streetName": "President St",
"addressNumber": "200"
}
},
"addressPointsService": {
"status": "OK",
"features": [{
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [-91.19978780015629, 33.608091616155995]
},
"properties": {
"address": "200 President St Arkansas City AR 71630",
"alt_address": "",
"match": 1.0
}
}]
},
"censusService": {
"status": "OK",
"features": [{
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [-91.19960153268617, 33.60763673811005]
},
"properties": {
"RFROMHN": "100",
"RTOHN": "498",
"ZIPL": null,
"FULLNAME": "President St",
"LFROMHN": null,
"LTOHN": null,
"ZIPR": "71630",
"STATE": "AR"
}
}]
}
}
GET /geocode/200 President St
{
"status": "OK",
"query": {
"input": "",
"parts": {
"city": "",
"zip": "",
"state": "",
"streetName": "",
"addressNumber": ""
}
},
"addressPointsService": {
"status": "OK",
"features": [{
"type": "Feature",
"geometry": {
"type": "Point",
"coordinates": [-91.19978780015629, 33.608091616155995]
},
"properties": {
"address": "200 President St Arkansas City AR 71630",
"alt_address": "",
"match": 0.41025641025641024
}
}]
},
"censusService": {
"status": "ADDRESS_NOT_FOUND",
"features": []
}
}
I think I get where you're coming from, but my thought was that no matter what, as long as we can return a response at all, we should return the query.input
value.
Meaning that even if all services fail we should see something like:
GET /geocode/200 President St
{
"status": "OK",
"query": {
"input": "200 President St",
"parts": {
"city": "",
"zip": "",
"state": "",
"streetName": "",
"addressNumber": ""
}
},
"addressPointsService": {
"status": "ADDRESS_NOT_FOUND",
"features": []
},
"censusService": {
"status": "ADDRESS_NOT_FOUND",
"features": []
}
}
It might be useful to somehow show the parser status as well, rather than, or in addition to, the parts
returning empty strings.
Maybe break out the parts
as its own object? That is the parser, right?
If I'm right, maybe:
GET /geocode/200 President St
{
"status": "OK",
"input": "200 President St",
"parserService": {
"status": "ADDRESS_NOT_PARSED",
"parts": {}
},
"addressPointsService": {
"status": "ADDRESS_NOT_FOUND",
"features": []
},
"censusService": {
"status": "ADDRESS_NOT_FOUND",
"features": []
}
}
Basically as long as the geocode is "status": "OK"
, we should, at a minimum return the input
.
Maybe that's what you meant by
there are several variables at play here.
:smile:
@awolfe76, I actually wasn't proposing anything yet. :) The examples above are how the service works today. I just put those up as reference for further discussion. Now that I read my comment, that isn't particularly clear.
I tend to agree that we should move input
outside of the parser. It will probably make the code simpler on the geocoder if that's the case as well.
More thoughts on this to come...
Understood. Just wanted to throw out a few ideas too.
I took a few stabs at this a couple weeks ago, but each approach has some snags, and requires a fair amount of refactoring.
input
top-level attributeMy first stab was actually refactor the format as we discussed
{
"status": "OK",
"input": "200 President St",
"parserService": {
"status": "ADDRESS_NOT_PARSED",
"parts": {}
},
"addressPointsService": {
...
}
...but there is already quite a lot of code that is dependent on the input
coming along for the ride with the parser's response. I started to refactor accordingly, but then decided to hold off as it was starting to be quite a lot of change.
input
on failed parseI got this working pretty easily for the standard GET /geocode
request, but I ran into snags with GeocodeFlow
. In the following function...
def parseFlow: Flow[String, ParsedAddress, Unit] = {
Flow[String]
.mapAsync(4)(a => AddressParserClient.standardize(a))
.map { x =>
if (x.isRight) {
x.right.getOrElse(ParsedAddress.empty)
} else {
ParsedAddress.empty
}
}
}
...AddressParserClient.standardize(a)
returns either a ResponseError
(the Left
) or a ParsedAddress
(the Right
). Unfortunately, ResponseError
is very generic, and does not contain the input address string, so you're out of luck.
I'm not going to make any changes at this point. I think the better option is to do the full refactoring necessary for moving input
up to a top-level attribute, and refactoring the Flow
, which is necessary in either scenario.
[GHE]/HMDA-Operations/hmda-devops/issues/79
When running this query:
/geocode/200 President St Arkansas City AR 71630
Everything is great and you see the input
"input": "200 President St Arkansas City AR 71630",
But, If you query:
/geocode/200 President St
You also get a result but the input is:
"input": "",
The input in this case should be:
"input": "200 President St",
I think that the input should be populated even if no results are returned as well.