Closed by jshermancody 6 years ago
The search slowness should be fixed now with a better caching mechanism we have used on the api side.
The search function is purely dependent on Mapbox's location API. There isn't much we can do at this point to improve it unless we decide to either switch providers or implement a more complicated search algorithm. I'll mark this as won't fix for now.
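I took a quick look at the search function. It's hard to dig too deep because I can't get local logs working, but I poked at the code and found a few issues:
1. The code for state matching is commented out: https://github.com/developmentseed/api.work.vote/blob/master/apps/api/views.py#L209 If I uncomment it and append those results to the response first, then when the user types, say, "Ala", "Alabama" and "Alaska" are returned as the first two results. I'm not sure why this is commented out.
2. The results for jurisdictions are always ordered alphabetically, and we set the limit to 20 results from Mapbox. This ignores any ranking that Mapbox does, and it does look like Mapbox has some kind of ranking. For example, if I allow 20 results, my search for "Columbus, Ohio" comes back with Bartholomew County IN and Franklin County OH, but if I only allow 1 result it picks Franklin County, which shows it correctly ranked the area there highest.
3. We also do exact name matching on jurisdictions, but then sort the results alphabetically.
4. We drop the results for states that aren't active. This was really confusing for me coming from Michigan; I had no idea why typing my address didn't work.
Probably changing the UI to focus on the map would be the best thing to do.
But we could also make some changes to the searchbox:
- Encourage the user to type a county instead of an address.
- Only return the top geocoded result, plus any state or jurisdiction matching the user's query exactly (for example, "Minnesota" and "Minnehaha County" for the query "Minn"). This would allow us to still get a result when the user types an address, but not add a ton of confusing low-ranking results.
- Put results that exactly match the query at the top of the list. So if I search for "Franklin", I'll see the various Franklin counties listed before the top geocoded result, Williamson County.
- Include results from states that are inactive, with some kind of note ("Washtenaw county not available yet")
@martymoo @nayelipelayo what do you think?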
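To make the searchbox suggestions concrete, here is a minimal sketch of the proposed ordering: names that start with the query come first, followed by only the top geocoded result. The function name `merge_results`, its parameters, and the sample data are hypothetical; this is not the code in `views.py`, and it treats results as plain name strings.

```python
def merge_results(query, jurisdiction_names, geocoded_names, max_geocoded=1):
    """Return name matches first, then the best geocoded results, without duplicates.

    Hypothetical helper: jurisdiction_names is every known state/jurisdiction name,
    geocoded_names is the geocoder's result list in the order it was returned.
    """
    q = query.strip().lower()
    if not q:
        return []
    # Names that begin with what the user typed, e.g. "Minn" -> "Minnesota".
    name_matches = sorted(n for n in jurisdiction_names if n.lower().startswith(q))
    merged = list(name_matches)
    # Keep the geocoder's own order and take only its best few results.
    for name in geocoded_names[:max_geocoded]:
        if name not in merged:
            merged.append(name)
    return merged


jurisdiction_names = ["Franklin County, OH", "Franklin County, PA", "Williamson County, TN"]
geocoded_names = ["Williamson County, TN", "Franklin County, OH"]
print(merge_results("Franklin", jurisdiction_names, geocoded_names))
```

With this made-up data the two Franklin counties are listed ahead of Williamson County, which is the behavior described for the "Franklin" query.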
Looks like states are commented out because of lack of searchbox support, fixing that in https://github.com/developmentseed/work.vote/pull/111.
Hi Annie,
My responses below.
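From what you say, it sounds like uncommenting will generate better search results? If that's the case, I think we can do that. I would want to ask Alireza why it was commented out in the first place.
Is there a way to generate fewer than 20 results and have the results be ranked more relevantly? My concern is that there might be some search entries that have names that can belong to different counties, like Las Vegas, NM and Las Vegas, NV.
It might be better for us to include the jurisdictions that are inactive and have the landing page show a message that says the state will be available soon. Information for the inactive states is slowly trickling in as election officials fill out the form.
- If the address entry is an issue, maybe we can just have that part removed and still allow them to enter city and zip?
- The geocoded results could work. I think for this part we would want to check with Alireza to ensure he doesn't have it set as it currently is for a specific reason.
- Putting results that exactly match the query at the top of the list sounds like a good idea.
- Including results from states that are inactive, with some kind of note ("Washtenaw county not available yet"), seems like a good idea.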
-- Nayeli Pelayo Outreach Manager Fair Elections Center 1825 K Street NW, Suite 450 Washington, D.C. 20006 Phone: (202) 248-5351 npelayo@fairelectionscenter.org www.fairelectionscenter.org
I don't follow this thread - how does it become "commented out"? But we should speak before any change is made to the search function - I need to approve that.
-- Jon Sherman Senior Counsel Fair Elections Center 1825 K Street NW, Suite 450 Washington, D.C. 20006 Phone: (202) 248-5346 jsherman@fairelectionscenter.org www.fairelectionscenter.org
I don't follow this thread - how does it become "commented out"? But we should speak before any change is made to the search function - I need to approve that.
Source code can contain lines called "comments" that don't run; they are there only for documentation. Putting a comment marker in front of a line of code stops that line from running. That is what happened here: the lines that include states in the searchbox had comment markers put at the start: https://github.com/developmentseed/api.work.vote/commit/8fc7b8faa16d28bba7e724224e50bc9e2c160d6c
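As a made-up illustration of "commenting out" (this is not the project's actual code; `STATES` and `search` are invented for the example), the two lines that start with `#` inside the function below are skipped by Python, so state names never reach the results:

```python
STATES = ["Alabama", "Alaska", "Arizona"]


def search(query):
    results = []
    # The next two lines are "commented out": the leading "#" makes Python
    # skip them, so matching state names are never added to the results.
    # matches = [s for s in STATES if s.lower().startswith(query.lower())]
    # results.extend(matches)
    results.append("geocoded result for " + query)
    return results


print(search("Ala"))
```

Removing the `#` from those two lines ("uncommenting") would make "Alabama" and "Alaska" appear ahead of the geocoded result for the query "Ala".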
Should you, me and @nayelipelayo meet to sort out the best path forward?
Annie is connecting with Alireza today and afterwards she and I will connect. I'm happy to include you in the conversation or can fill you in. No changes have been made to the search function.
You guys can just update me. Thanks.
Sorry I'm late to the game here. Thanks so much @anniesullie for compiling the issues related to the search. Here are a few things that might be helpful:
The code for state matching is commented out: https://github.com/developmentseed/api.work.vote/blob/master/apps/api/views.py#L209
I think this was commented out because Jon Sherman from FEC had asked not to include the states in the results. If it is required again we should include it.
The results for jurisdictions are always ordered alphabetically, and we set the limit to 20 results from Mapbox. This ignores any ranking that Mapbox does, and it does look like Mapbox has some kind of ranking. For example, if I allow 20 results, my search for "Columbus, Ohio" comes back with Bartholomew County IN and Franklin County OH, but if I only allow 1 result it picks Franklin County, which shows it correctly ranked the area there highest.
I think the Mapbox default limit is 1, and it only returns a maximum of 20 results with free requests. Were you able to get more than 20 results from the Mapbox API?
We drop the results for states that aren't active. This was really confusing for me coming from Michigan, I had no idea why typing my address didn't work.
This is a good point, but I think it was done this way because FEC didn't want to show results from states that are not active. We can either show better feedback to the user (e.g. "state is not covered") or just activate them.
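A rough sketch of the "better feedback" option, assuming each result is a dictionary with `name` and `state` keys and that the active states are available as a set. Both are assumptions for illustration, not how the API is known to represent them. Instead of dropping results from inactive states, each result is kept and labeled:

```python
def label_results(results, active_states):
    """Keep every result, but flag the ones whose state is not active yet."""
    labeled = []
    for result in results:
        entry = dict(result)  # copy so the caller's data is not modified
        if result["state"] in active_states:
            entry["available"] = True
            entry["note"] = ""
        else:
            entry["available"] = False
            entry["note"] = f"{result['name']} not available yet"
        labeled.append(entry)
    return labeled


results = [
    {"name": "Washtenaw County", "state": "MI"},
    {"name": "Franklin County", "state": "OH"},
]
for entry in label_results(results, active_states={"OH"}):
    print(entry["name"], entry["available"], entry["note"])
```

The UI could then show the note next to the unavailable entry instead of returning nothing for an address in an inactive state.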
Zip codes are not 1:1 with counties, and so returning just one result will result in cases where a user searches their zip code and does not receive the correct jurisdiction. Also, we talked in August/September about the use-case of searching a common city or county name, and it was decided that the website should return at least the matching jurisdictions for which we have data.
I think we could likely address this problem by making an effort to preserve the Mapbox ranking; right now the code simply pulls the jurisdictions from the database, orders them by name, and then retains any that intersect with the user's search query. See https://github.com/developmentseed/api.work.vote/blob/9d60ba756c3a00ecb1703b3377554087829da219/apps/api/views.py#L69
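One way to preserve the ranking could look like the sketch below. It assumes the geocoder's results arrive as a list of names in ranked order and that jurisdictions can be matched to them by name; the real code intersects geometries, so `order_by_geocoder_rank` and the sample data are hypothetical stand-ins.

```python
def order_by_geocoder_rank(jurisdictions, ranked_names):
    """Keep jurisdictions that the geocoder returned, in the geocoder's order."""
    rank = {}
    for position, name in enumerate(ranked_names):
        # setdefault keeps the first (best) position if a name repeats.
        rank.setdefault(name, position)
    matched = [j for j in jurisdictions if j in rank]
    return sorted(matched, key=lambda j: rank[j])


# Jurisdictions as pulled from the database, ordered by name.
jurisdictions = ["Bartholomew County, IN", "Cook County, IL", "Franklin County, OH"]
# Assumed geocoder order for "Columbus, Ohio": Franklin County ranked first.
ranked_names = ["Franklin County, OH", "Bartholomew County, IN"]
print(order_by_geocoder_rank(jurisdictions, ranked_names))
```

Here Franklin County comes out ahead of Bartholomew County even though it sorts later alphabetically, and Cook County is dropped because the geocoder did not return it.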
And @nayelipelayo @anniesullie to answer your questions re: the disabling of state search, this was requested in https://github.com/developmentseed/work.vote/issues/82
Thanks, Alyssa. This makes sense. I didn't realize Jon had requested #82.
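I talked to @nayelipelayo offline, and I think we want to make the following changes:
- Sort so that jurisdictions that are an exact match to what's been typed show up first
- Keep multiple geocoded results, and rank them the same way Mapbox ranked them
- Include jurisdictions that have no data in the results, but improve messaging in the UI (#114 https://github.com/developmentseed/work.vote/pull/114)
I did some digging into how Mapbox ranks its results, and it's not clear to me whether the API returns anything we could sort on. There is a "confidence" field. The documentation (https://www.mapbox.cn/api-documentation/#match-object) says it should be a decimal number from 0 to 1, but I'm seeing integers. Example of output when searching for "Boston":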
[Boston, Massachusetts, United States]: confidence 4
[Boston Logan International Airport, 1 Harborside Dr, Boston, Massachusetts 02128, United States]: confidence 0
[Bostonia, El Cajon, California 92021, United States]: confidence 6
[Boston Road, Springfield, Massachusetts 01129, United States]: confidence 7
[Boston Common, Tremont St, Boston, Massachusetts 02108, United States]: confidence 0
I would expect a higher confidence number for the best results and a lower number for the worst. But Boston Road and Bostonia in El Cajon have the highest numbers, and they look like the worst results to me. "Boston, Massachusetts" is what I would call the best result, and its confidence number, 4, is right in the middle. If I restrict to just one result, "Boston, Massachusetts" is the answer it gives, which makes me think it's the highest rank, but I can't find any field in the results that would sort it that way. There is an "accuracy" field which always seems to be null, and a "quality" field which always seems to be 1.
So I am pretty stumped on how to sort the geocoded results.
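For reference, this small sketch reproduces the problem using only the five "Boston" results and confidence values listed above; the tuple layout and the `sort_by_confidence` helper are just for illustration.

```python
results = [
    ("Boston, Massachusetts, United States", 4),
    ("Boston Logan International Airport, 1 Harborside Dr, Boston, Massachusetts 02128, United States", 0),
    ("Bostonia, El Cajon, California 92021, United States", 6),
    ("Boston Road, Springfield, Massachusetts 01129, United States", 7),
    ("Boston Common, Tremont St, Boston, Massachusetts 02108, United States", 0),
]


def sort_by_confidence(rows):
    """Sort (name, confidence) pairs from highest to lowest confidence."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


names = [name for name, _ in sort_by_confidence(results)]
print(names[0])
print(names.index("Boston, Massachusetts, United States"))
```

Sorting this way puts Boston Road first and leaves "Boston, Massachusetts" in third place (index 2), which is why the confidence value doesn't look usable as a sort key here.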
Hi Annie, Just wanted to confirm that I saw this comment. I'm not sure either. Maybe @alyssadelaine or @scisco may know?
Those objectives all sound good to me.
Would Google Maps perform better than Mapbox on this ranking?
And thank you so much for helping us improve the site!
On Wed, Oct 31, 2018 at 9:51 AM Jon Sherman < jsherman@fairelectionscenter.org> wrote:
Those objectives all sound good to me.
Would Google Maps perform better than Mapbbox on this ranking?
On Tue, Oct 30, 2018 at 10:53 PM anniesullie notifications@github.com wrote:
I talked to @nayelipelayo https://github.com/nayelipelayo offline, and I think we want to make the following changes:
- Sort so that jurisdictions that are an exact match to what's been typed show up first
- Keep multiple geocoded results, and rank them the same way mapbox ranked them
- Include jurisdictions that have no data in the results, but improve messaging in the UI (#114 https://github.com/developmentseed/work.vote/pull/114)
I did some digging into how mapbox ranks its results, and it's not clear to me if the API returns anything we could sort on. There is a "confidence" field. The documentation https://www.mapbox.cn/api-documentation/#match-object says it should be a decimal number from 0 to 1, but I'm seeing integers. Example of output when searching for "Boston":
[Boston, Massachusetts, United States]: confidence 4
[Boston Logan International Airport, 1 Harborside Dr, Boston, Massachusetts 02128, United States]: confidence 0
[Bostonia, El Cajon, California 92021, United States]: confidence 6
[Boston Road, Springfield, Massachusetts 01129, United States]: confidence 7
[Boston Common, Tremont St, Boston, Massachusetts 02108, United States]: confidence 0
I would expect a higher confidence number to be for the best results, and a lower number for the worst. But Boston Road and Bostonia in El Cajon have the highest numbers and they look like the worst results to me. "Boston, Massachusetts" is what I would call the best result, and its confidence number 4 is right in the middle. If I restrict to just one result, "Boston, Massachusetts" is the answer it gives, which makes me think it's the highest rank, but I can't find any field in the results that would sort it that way. There is an "accuracy" field which always seems to be null, and a "quality" field which always seems to be 1.
So I am pretty stumped on how to sort the geocoded results.
I think @scisco told me that they're not using Google Maps result because the free version only provides one result, and there was a request for multiple results.
@anniesullie This is correct. But we can switch to the paid version of the API and hope the requests stay within the monthly $250 credit that Google provides or figure out a way to get a higher credit from Google.
@anniesullie @scisco It's not clear to me that we need to sort the Mapbox results; have we checked to see whether they come in the "correct" order straight from the API? It seems to me that they might if restricting to fewer results returns the "best" results as far as we're concerned
To ask a stupid question: a higher confidence number means higher confidence, yes? It's not ranking them 1, 2, 3, ..., right?
I'm not sure what you mean @jshermancody. I think @anniesullie is saying that the confidence numbers seem to indicate something else besides "confidence in this result matching the query". But it seems that Mapbox has some sort of ranking system which ranks the results in terms of how close they are to the query.
So @anniesullie in the example you gave above, Boston, Massachusetts came first, which to us is the best result, right? So we should just keep the results in the same order as they are in when Mapbox sends them, which for Boston would be
[Boston, Massachusetts, United States]: confidence 4
[Boston Logan International Airport, 1 Harborside Dr, Boston, Massachusetts 02128, United States]: confidence 0
[Bostonia, El Cajon, California 92021, United States]: confidence 6
[Boston Road, Springfield, Massachusetts 01129, United States]: confidence 7
[Boston Common, Tremont St, Boston, Massachusetts 02108, United States]: confidence 0
So rather than try to sort them, I'm saying I think we can just keep the order as is. Does that make sense, or am I missing something?
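To make the point concrete, here is a small sketch using the five "Boston" results quoted above. The labels and confidence values are copied from this thread; the variable names are made up for illustration. It shows that sorting on the confidence field would push "Boston, Massachusetts" out of first place, while leaving the list untouched keeps the order the results arrived in.

```python
# Results in the order they were listed in this thread: (label, confidence).
results = [
    ("Boston, Massachusetts, United States", 4),
    ("Boston Logan International Airport, 1 Harborside Dr, Boston, Massachusetts 02128, United States", 0),
    ("Bostonia, El Cajon, California 92021, United States", 6),
    ("Boston Road, Springfield, Massachusetts 01129, United States", 7),
    ("Boston Common, Tremont St, Boston, Massachusetts 02108, United States", 0),
]

# Sorting by confidence (highest first) puts "Boston Road" on top.
by_confidence = sorted(results, key=lambda r: r[1], reverse=True)

# Keeping the response order leaves "Boston, Massachusetts" first.
as_returned = list(results)

print(by_confidence[0][0])
print(as_returned[0][0])
```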
@alyssadelaine yeah, I think this is just what happens when I try to work too much after the kids go to bed :)
I'll double check a few more queries, but you're right that I listed the order the results came in, and it seems to be ranked pretty well.
So going back to the code: I'm not too familiar with Python 3 or Django models. I think what I would do here is return a list of jurisdictions instead of a query result, and just concatenate it onto the list of jurisdictions that exactly match. Sound reasonable? (Right now things use query ordering to sort, but I think I'd be adding too much complexity to keep doing that.)
So the plan is to:
Duplicate jurisdictions will only be listed the first time they appear.
Is that okay?
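A rough sketch of the "concatenate, then list duplicates only the first time" idea, in plain Python. The function name `merge_ranked` and the sample jurisdiction names are hypothetical and only stand in for whatever the real view returns.

```python
def merge_ranked(exact_matches, geocoded_matches):
    """Concatenate two ranked lists, keeping only the first copy of each item."""
    seen = set()
    merged = []
    for name in list(exact_matches) + list(geocoded_matches):
        if name not in seen:
            seen.add(name)
            merged.append(name)
    return merged


# Made-up inputs: one exact match, then geocoded matches in ranked order.
exact = ["Franklin County, OH"]
geocoded = ["Franklin County, OH", "Bartholomew County, IN"]

print(merge_ranked(exact, geocoded))
```

Because the exact matches are walked first, they always stay ahead of the geocoded ones, and a jurisdiction that shows up in both lists keeps its earlier position.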
@anniesullie That sounds right to me! As far as the code goes, I think the key is that jurisdictions are ordered by name here and then filtered based on whether they intersect any of the regions here which is why the correct jurisdictions are returned, just ordered alphabetically.
I think the way to fix this is to separate the different geometries as opposed to checking them all at once. So instead of
multipoints = MultiPoint([GEOSGeometry(item.wkt) for item in geocoded])
return jurisdictions.filter(geometry__intersects=multipoints)
You would do something like:
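jurisdictions_to_return = []
for item in geocoded:
    # Filter one geocoded result at a time so matches stay in Mapbox's order;
    # extend() keeps the result a flat list of jurisdictions.
    jurisdictions_to_return.extend(jurisdictions.filter(geometry__intersects=GEOSGeometry(item.wkt)))
return jurisdictions_to_return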
Which I think will preserve the rank
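Here is a self-contained sketch of why the per-result loop preserves rank. It does not use Django: the `Jurisdiction` class, the bounding boxes, and the coordinates are all invented stand-ins for the real models and geometries, so treat it as an illustration of the ordering behavior only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Jurisdiction:
    name: str
    # Bounding box (min_x, min_y, max_x, max_y) standing in for a real geometry.
    bbox: tuple

    def contains(self, point):
        x, y = point
        min_x, min_y, max_x, max_y = self.bbox
        return min_x <= x <= max_x and min_y <= y <= max_y


# Stored alphabetically, like a queryset ordered by name.
jurisdictions = [
    Jurisdiction("Bartholomew County", (0, 0, 1, 1)),
    Jurisdiction("Franklin County", (2, 0, 3, 1)),
]

# Geocoded points in the order the geocoder ranked them (best first).
geocoded = [(2.5, 0.5), (0.5, 0.5)]


def all_at_once(jurisdictions, points):
    # One filter over every point: output follows the storage order.
    return [j for j in jurisdictions if any(j.contains(p) for p in points)]


def one_point_at_a_time(jurisdictions, points):
    # One filter per point: output follows the geocoder's order.
    ranked = []
    for p in points:
        for j in jurisdictions:
            if j.contains(p) and j not in ranked:
                ranked.append(j)
    return ranked


print([j.name for j in all_at_once(jurisdictions, geocoded)])
print([j.name for j in one_point_at_a_time(jurisdictions, geocoded)])
```

One trade-off to keep in mind: with a real queryset, filtering once per geocoded result means one database query per result instead of a single query, so the cost grows with the number of results kept.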
The changes here are now live on the site. @nayelipelayo @jshermancody take a look!
This looks great @anniesullie 👏 💯
I haven't had a chance to look much since I'm working the election protection hotline but from the little I did see, it looks wonderful. Thank you so much, Annie!
Yes, looks very, very nice.
-- Nayeli Pelayo Outreach Manager Fair Elections Center 1825 K Street NW, Suite 450 Washington, D.C. 20006 Phone: (202) 248-5351 npelayo@fairelectionscenter.org www.fairelectionscenter.org
Thanks, everyone! Closing this issue as resolved.
The search function still seems slow, and it gets slower the more results it has to generate. One thing that seems to make it slower is writing out Ohio as opposed to OH. The search needs to understand that the comma in <Columbus, Ohio> or <Columbus, OH> means the person is trying to narrow the results to Columbuses in one state, Ohio, so that it doesn't start independently searching for Ohio everywhere. It should be an AND, not an OR. Thanks very much!
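One possible way to get the AND behavior described here, sketched in plain Python. Everything in it is an assumption for illustration: the function names, the tiny `STATE_ABBREVIATIONS` table, and the shape of the result dictionaries are not taken from the project's code.

```python
# A tiny, incomplete lookup table used only for this example.
STATE_ABBREVIATIONS = {"ohio": "OH", "indiana": "IN", "michigan": "MI"}


def split_query(query):
    """Split 'Columbus, Ohio' into ('Columbus', 'OH'); state is None if absent."""
    place, sep, tail = query.rpartition(",")
    if not sep:
        return query.strip(), None
    tail = tail.strip()
    state = STATE_ABBREVIATIONS.get(tail.lower())
    if state is None and tail.upper() in STATE_ABBREVIATIONS.values():
        state = tail.upper()
    if state is None:
        # Text after the comma is not a known state; keep the whole query.
        return query.strip(), None
    return place.strip(), state


def filter_by_state(results, state):
    """Keep only results in `state` (AND semantics); no state means no filter."""
    if state is None:
        return list(results)
    return [r for r in results if r["state"] == state]


# Made-up results for a "Columbus" search.
results = [
    {"name": "Bartholomew County", "state": "IN"},
    {"name": "Franklin County", "state": "OH"},
]

place, state = split_query("Columbus, Ohio")
print(place, state, filter_by_state(results, state))
```

Normalizing "Ohio" and "OH" to the same code before searching means both spellings take the same path, and the state is applied as a filter on the place results rather than as a second, independent search.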