Closed — jdavisp3 closed this issue 7 years ago
I'd actually suggest making the reverse change in bloomnpi, as that's the way it works in our hosted/production version. Let me take a closer look at the whole project today to make sure everything's still working OK.
Just curious btw, what are you going to be using Bloom with?
Thanks, I'll give that a try! We (Counsyl) have been using an older version internally for some time, just looking to bring things up to date.
Looks like it works OK to import data with this change (I'll double-check), but the datasource itself doesn't update the /api/sources list. This means that, for now, you'll have to search with /api/search and find specific listings with /api/npis/:npi, rather than using /api/search/usgov.hhs.npi and /api/sources/usgov.hhs.npi/:npi. I pushed this update just now with https://github.com/gocodo/bloomnpi/commit/e743a5d78b3f964822df872b718e03a1eb9ddf4b.
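For example, looking providers up through the generic endpoints might look like this (a sketch only; the base URL and port are assumptions for a local instance, not values from the project):

```python
# Build URLs for the generic bloomapi endpoints mentioned above.
# BASE is an assumed local address; adjust it to your deployment.
BASE = "http://localhost:3003"

def search_url():
    """Generic search endpoint (usable while /api/search/usgov.hhs.npi is unavailable)."""
    return f"{BASE}/api/search"

def npi_url(npi):
    """Listing for a specific NPI via the generic /api/npis/:npi endpoint."""
    return f"{BASE}/api/npis/{npi}"

# The resulting URLs can be fetched with any HTTP client (curl, urllib, etc.).
```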
As an aside, I decided to open source our production versions of this code over the weekend which includes many other datasources in addition to the NPI. This code does maintain the sources list correctly. I'm going to be putting a little more time into this this week to write some better documentation for it. You can find this at https://github.com/bloomapi/datasources.
Thanks very much!
Just pushed and tested some versions that make it significantly easier to get a running copy, with up-to-date NPI and other datasources, via Docker. E.g.
docker-compose up -d
on a machine with Docker (where Docker gets at least 4GB of memory). I'm going to go ahead and close this issue, but let me know if this is enough for you to get going or if you have any other questions.
Thank you very much!!
Regarding that 4GB for Docker -- how should that be divided up amongst the different containers? I assume the elasticsearch container needs the most?
In a prod environment, the more memory you feed ES the better (at least 2GB); PG needs the second most (at least 512MB-1GB, but it would likely want more). The other containers should be relatively lean. Just as a warning: while it works with 4GB, I haven't tested it rigorously at this point, and it's possible 4GB isn't enough for a prod environment. For context, we used to run ES on a cluster of 3 machines, each with 4GB of memory reserved for ES (12GB total); we recently moved to a single host with 16GB. If you are in a cloud environment, more memory for the DB will also dramatically reduce the load times for the datasets, as PG can keep everything in memory/cache rather than having to load everything from disk, which is otherwise super slow (we're talking 15 minutes vs 6+ hours).
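As a rough sketch of that split (the numbers below just restate the guidance above; they aren't tuned values from the project):

```python
def split_memory_mb(total_mb):
    """Divide host memory roughly as described above: PG gets ~1GB,
    the lean containers share ~512MB, and ES takes the rest (>= 2GB)."""
    pg = 1024
    other = 512
    es = total_mb - pg - other
    if es < 2048:
        raise ValueError("under 2GB left for Elasticsearch; give the host more memory")
    return {"elasticsearch": es, "postgres": pg, "other": other}

# On a 4GB host this leaves 2.5GB (2560MB) for Elasticsearch.
```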
That makes sense, thanks very much!
I've been trying to get a local instance of bloomapi running by following the instructions here. But the elasticsearch index created by bloomnpi seems to be different than the one expected by bloomapi. I think I've gotten it to work by applying this patch:
The change to bloomapi seems to have happened after the current bloomnpi. Is there a missing bloomnpi version?
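One quick way to see where two index mappings diverge is to diff their field lists (a sketch only: you'd first fetch each index's _mapping from Elasticsearch, and the field names in the example are invented):

```python
def mapping_diff(expected, actual):
    """Compare the `properties` of two Elasticsearch mappings and report
    fields the consumer expects but the index lacks, and vice versa."""
    exp = set(expected.get("properties", {}))
    act = set(actual.get("properties", {}))
    return {"missing": sorted(exp - act), "extra": sorted(act - exp)}

# Example with made-up fields:
# mapping_diff({"properties": {"npi": {}, "name": {}}},
#              {"properties": {"npi": {}}})
# -> {"missing": ["name"], "extra": []}
```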