bcanseco / warbler

🐤 Auto-generated, geolocation based chatrooms for universities. [no longer maintained]
https://borja.io/warbler

Improve university filtering #26

Open bcanseco opened 7 years ago

bcanseco commented 7 years ago

Sadly, Google does not always tag universities properly. The Google Maps API will sometimes return a university tagged as "school" but not "university", and will sometimes return multiple results for a single university (e.g. buildings, departments, etc.).

My naive filter is leaving out actual universities in the area. Maybe we should have a whitelist with a corresponding reporting system? Or maybe there are more tweaks we can make to the filtering so that it yields more actual universities and fewer high schools, department buildings, etc.

https://github.com/bcanseco/warbler/blob/2e08aa22428e4f060d998f14febc7752a4f7ced1/src/Warbler/Services/ProximityService.cs#L123-L129

AC:

XiaohanYang commented 6 years ago

For current development, I don't think we need to build a whitelist; that is probably the job of a certification body. But we can improve search accuracy with a secondary screening step. The steps might look like this:

  1. Call Google Geocoding
  2. Secondary screening: set up some exclusion filters so that no result containing a keyword like "high school", "library", etc. is shown.
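
The secondary screening step could be sketched roughly as below. This is only an illustration in Python, not the project's C# code: the function name `secondary_screen` and the keyword list are assumptions, and the only keywords taken from this thread are "high school" and "library".

```python
import re

# Hypothetical exclusion list; the thread only names "high school" and "library".
EXCLUDED_KEYWORDS = ("high school", "library")

# One case-insensitive pattern with word boundaries, so a keyword only
# matches whole words and not fragments of longer words.
_EXCLUDED = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in EXCLUDED_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def secondary_screen(names):
    """Return only the place names that contain no excluded keyword."""
    return [name for name in names if not _EXCLUDED.search(name)]
```

Keeping the keywords in a single list means a new exclusion is a one-line change rather than another hand-written query.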

bcanseco commented 6 years ago

Secondary screening is probably a good idea. It's already working pretty well for departments. I think the biggest problem is the u.Is(PlaceLocationType.<something>) filtering. We just can't rely on those tags to be accurate. It's probably best to make that step as lenient as possible (accept results that Google tags as schools, libraries, or universities), and then filter out actual libraries and schools via keywords, like we do with departments.
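
A rough sketch of that two-stage idea, in Python rather than the project's C#. Everything here is an assumption for illustration: the dict shape of a place, the tag strings (modelled on Google's "university", "school" and "library" place types), the keyword list, and the name `filter_universities`.

```python
# Assumed tag names; the project's PlaceLocationType enum may differ.
LENIENT_TYPES = {"university", "school", "library"}

# Hypothetical keywords marking results that are not universities.
REJECT_KEYWORDS = ("high school", "library")


def filter_universities(places):
    """Filter places, each a dict with 'name' (str) and 'types' (list of str)."""
    kept = []
    for place in places:
        # Stage 1 (lenient): accept any of the tags, since tagging is unreliable.
        if not LENIENT_TYPES.intersection(place["types"]):
            continue
        # Stage 2 (strict): drop actual schools and libraries by name.
        name = place["name"].lower()
        if any(keyword in name for keyword in REJECT_KEYWORDS):
            continue
        kept.append(place)
    return kept
```

With this ordering, a university that Google tags only as "school" survives stage 1, while a high school carrying the same tag is still removed in stage 2.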

bcanseco commented 6 years ago

Still needs work. The 50 hard-coded select queries are not a good way of solving this. Leaving it open until I can get around to figuring out a better solution.