untoldone / bloomapi

Create APIs out of public datasources
https://www.bloomapi.com/documentation/public-data
MIT License

Searching for providers by geocoded location #48

Closed etagwerker closed 7 years ago

etagwerker commented 10 years ago

Hello,

I am interested in contributing code for this feature. I am working on a healthcare project, which needs to run queries like "give me all the doctors in a 10 mile radius of this address".

I see that the information to geocode the location is in the database. I would need to process the entries in the database to store latitude & longitude per address.
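For illustration, once each address has a stored latitude and longitude, the radius query described above could be sketched roughly as follows. This is a minimal sketch, not BloomAPI code: the record layout (`npi`, `latitude`, `longitude`), the function names, and the sample data are all assumptions made for the example.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def providers_within(providers, lat, lon, radius_miles):
    """Return providers whose stored coordinates fall inside the radius, nearest first."""
    hits = []
    for p in providers:
        d = haversine_miles(lat, lon, p["latitude"], p["longitude"])
        if d <= radius_miles:
            hits.append((d, p))
    hits.sort(key=lambda pair: pair[0])  # sort on distance only
    return [p for _, p in hits]


# Hypothetical records; the NPI values and coordinates are made up for illustration.
providers = [
    {"npi": "0000000001", "latitude": 40.7580, "longitude": -73.9855},  # midtown Manhattan
    {"npi": "0000000002", "latitude": 40.6782, "longitude": -73.9442},  # Brooklyn
    {"npi": "0000000003", "latitude": 42.3601, "longitude": -71.0589},  # Boston
]

# Providers within 10 miles of lower Manhattan.
nearby = providers_within(providers, 40.7128, -74.0060, 10)
```

A linear scan like this is only practical for small sets; with millions of records, a bounding-box prefilter or a spatial index in the database would likely be needed.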

I was wondering a few things:

  1. Should I create a new project (e.g. bloomapi-geolocation) and add my new code there?
  2. If not, should I just add a few tasks that add the necessary columns and populate them using a geocoding tool like https://npmjs.org/package/node-geocoder?

Let me know what you had in mind.

Thanks!

marks commented 10 years ago

@etagwerker - sounds like a great idea. I will let @untoldone chime in about whether it should go in this repo or not; my vote would be for it to be a PR that gets merged into this repo.

As someone who has worked with BloomAPI and NPPES data on projects, I should warn you that the data is known to have quality problems (OIG report at http://oig.hhs.gov/oei/reports/oei-07-09-00440.asp).

I'm also curious which geocoding API or system you plan to use that will let you hit it millions of times in a short period of time. You might look into setting up your own instance of DSTK (http://www.datasciencetoolkit.org/) though I've been using http://developer.decarta.com/ recently as I find the accuracy to be better than DSTK.

untoldone commented 10 years ago

This would be awesome + has very clear value. I think whether it should be its own project or included here depends on its complexity and its impact on the deployment of a BloomAPI instance.

A few thoughts on how I had thought to approach this as an inclusion in the existing project (though this is by no means the only way):

  1. Add a separate table of geocoded locations
  2. Add a second collection source to be run on a regular interval
  3. Either augment the code that projects queries to handle a join (currently really messy code), update the existing endpoint to handle the join as a special case, or write a second endpoint specifically for location-based lookup
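As a rough sketch of the first and third ideas above, a separate table of geocoded locations joined to provider records might look something like this. It uses an in-memory SQLite database purely for illustration; the table names, column names, sample rows, and the `providers_in_box` helper are hypothetical and do not reflect BloomAPI's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE providers (
    npi TEXT PRIMARY KEY,
    last_name TEXT
);
-- Separate table, so the geocoding source can be refreshed on its own schedule.
CREATE TABLE geocoded_locations (
    npi TEXT PRIMARY KEY REFERENCES providers(npi),
    latitude REAL NOT NULL,
    longitude REAL NOT NULL
);
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO providers VALUES (?, ?)",
    [("0000000001", "SMITH"), ("0000000002", "JONES"), ("0000000003", "LEE")],
)
conn.executemany(
    "INSERT INTO geocoded_locations VALUES (?, ?, ?)",
    [("0000000001", 40.7580, -73.9855), ("0000000002", 42.3601, -71.0589)],
)


def providers_in_box(conn, min_lat, max_lat, min_lon, max_lon):
    """Join providers to their coordinates and keep those inside a bounding box."""
    cursor = conn.execute(
        """
        SELECT p.npi, p.last_name, g.latitude, g.longitude
        FROM providers p
        JOIN geocoded_locations g ON g.npi = p.npi
        WHERE g.latitude BETWEEN ? AND ?
          AND g.longitude BETWEEN ? AND ?
        ORDER BY p.npi
        """,
        (min_lat, max_lat, min_lon, max_lon),
    )
    return cursor.fetchall()


# Box roughly covering New York City.
rows = providers_in_box(conn, 40.4, 41.0, -74.3, -73.6)
```

Providers that have not been geocoded yet (the third sample row here) simply drop out of the inner join, which may or may not be the desired behavior for a location endpoint.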

I'd also be up for chatting about this offline if you're interested

etagwerker commented 10 years ago

@marks Thanks! That's definitely helpful. I'll take that into account when I write the geocoding logic. I'm going to add a default for the geocoding service and I will allow users to use other services (supported by https://npmjs.org/package/node-geocoder)

@untoldone Thanks for your input. It's definitely useful. I was wondering if you could help me write the specs for the geocoding tool. I still haven't been able to bootstrap the API in my local environment. (I am running low on disk space.)

Is there a way to just bootstrap the initial file? (without the updates)

Another alternative would be if you added a mock NPI record to the spec folder, that I could use for testing.

Could you help me with that?

untoldone commented 10 years ago

Sure -- mock NPI records make sense to make dev easier without needing to run the full import.

I've also got an idea on how to improve the import performance. I should be able to get the import time down (however, this won't really help in terms of disk space).

aegixx commented 9 years ago

This would be huge for my project, but I noticed the PR got shut down. Any updates? I'm not a Go developer, but I'm willing to give it a go (pun intended) if push (another pun) comes to shove... ;)

untoldone commented 9 years ago

The pull no longer made sense since the language changed. That said, BloomAPI has been built to pull in many different arbitrary datasources and put them all together for the API.

If someone made a new datasource that was just something like a CSV of geocoded NPI addresses, that could very easily be re-included. This wouldn't need to be written in Go. I'd be happy to help make it happen on the API side quickly once the data is there.

etagwerker commented 9 years ago

I would be interested in porting this code to Go, but I don't have much time these days. :(

untoldone commented 9 years ago

@etagwerker @aegixx Any chance you think it could just be a script that uploaded a CSV to somewhere public based on the NPI input file with the practice address + business address fields? I'd take care of the rest sometime soon if you found time for it!

afs2015 commented 9 years ago

@untoldone, are you just looking to get all of the addresses in the NPI database, along with their longitude/latitude, into a CSV file?

untoldone commented 9 years ago

@afs2015 Yep -- if we just had a source of coded locations (that updates regularly with the latest NPI locations), I'd be able to easily pull that into the NPI + make it searchable.
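A script along the lines discussed here might be sketched as below. This is only an illustration under stated assumptions: the input column names (`npi`, `address`, `city`, `state`, `postal_code`) are simplified stand-ins rather than the real NPI file headers, the output layout is a guess at what the API side would want, and `geocode` is a placeholder for whatever geocoding service is plugged in. Each distinct address is geocoded only once, which matters when the service is rate limited.

```python
import csv
import io

# Hypothetical, simplified column names; the real NPI file uses different headers.
ADDRESS_COLUMNS = ["address", "city", "state", "postal_code"]


def format_address(row):
    """Build one normalized address string from the non-empty address columns."""
    parts = [(row.get(c) or "").strip() for c in ADDRESS_COLUMNS]
    return ", ".join(p for p in parts if p).upper()


def write_geocoded_csv(npi_rows, geocode, out):
    """Write npi,latitude,longitude rows and return the number of distinct addresses.

    `geocode` is any callable taking an address string and returning
    (latitude, longitude), or None when the address cannot be resolved.
    """
    cache = {}  # address -> coordinates, so repeated addresses cost one lookup
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["npi", "latitude", "longitude"])
    for row in npi_rows:
        address = format_address(row)
        if address not in cache:
            cache[address] = geocode(address)
        coords = cache[address]
        if coords is not None:  # skip addresses the geocoder could not resolve
            writer.writerow([row["npi"], coords[0], coords[1]])
    return len(cache)


# Usage with a fake geocoder and made-up records.
calls = []


def fake_geocode(address):
    calls.append(address)
    return (40.0, -75.0) if "PHILADELPHIA" in address else None


source = io.StringIO(
    "npi,address,city,state,postal_code\n"
    "0000000001,1 Main St,Philadelphia,PA,19104\n"
    "0000000002,1 main st,Philadelphia,PA,19104\n"
    "0000000003,9 Unknown Rd,Nowhere,ZZ,00000\n"
)
out = io.StringIO()
distinct = write_geocoded_csv(csv.DictReader(source), fake_geocode, out)
```

In this sketch the first two records share an address after normalization, so the geocoder is called twice for three rows, and the unresolved third row is left out of the output.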

untoldone commented 7 years ago

Likely won't get fixed anytime soon. At BloomAPI, we're now internally geocoding using Nominatim but don't currently publish that source anywhere ... but could if needed. Feel free to reopen if anyone feels particularly strongly about this getting added.