Closed stsievert closed 4 years ago
I debated using Dask or Ray actors for this. I chose not to, because with an actor the search would always use the current model, even while it's being updated concurrently. That choice means we'll have to serialize the model at least once. Luckily this doesn't take long: 84 μs for 100k answers.
With 20k answers, the serialization time is 50 μs. The object is 615 KB, so it's not huge.
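A rough way to check this kind of cost is to time `pickle.dumps` directly. The sketch below is only an illustration: `model_state` is a hypothetical stand-in, not Salmon's model class, and the numbers it produces will differ from the ones quoted above.

```python
import pickle
import random
import time


def serialization_cost(obj, repeats=5):
    """Return (payload size in bytes, fastest pickling time in seconds)."""
    best = float("inf")
    payload = b""
    for _ in range(repeats):
        start = time.perf_counter()
        payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        best = min(best, time.perf_counter() - start)
    return len(payload), best


# Hypothetical stand-in for a model: an embedding plus an answer count.
rng = random.Random(0)
model_state = {
    "embedding": [[rng.gauss(0, 1) for _ in range(2)] for _ in range(100)],
    "n_answers": 20_000,
}

size, seconds = serialization_cost(model_state)
print(f"{size / 1024:.1f} KB serialized in {seconds * 1e6:.0f} μs")
```

Taking the fastest of several runs reduces noise from other processes on the machine.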
From https://github.com/stsievert/salmon/issues/35#issuecomment-621483808:
`Client(asynchronous=True)`.
`expose` not `ports`).
Port `8787` is exposed to the local machine with Docker. The documentation is not modified, so EC2 machines will still have port 8787 closed.

This still provides all the features in https://github.com/stsievert/salmon/issues/35#issuecomment-668657038.
What does this PR implement?
It runs the query search and model updates in parallel. Functionally, it implements this code:
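As a rough illustration of that pattern, here is a minimal sketch that uses the standard library's thread pool rather than Dask. `update_model`, `search_queries`, `step`, and the dict-based model are hypothetical stand-ins, not Salmon's actual code. The search runs against a copied snapshot, so it never sees a half-updated model.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


def update_model(model, answers):
    """Hypothetical update: return a new model that has absorbed the answers."""
    updated = copy.deepcopy(model)
    updated["n_answers"] += len(answers)
    return updated


def search_queries(model, candidates):
    """Hypothetical search: pick the best-scoring candidate for a fixed model."""
    return max(candidates, key=lambda query: query * model["weight"])


def step(model, answers, candidates, pool):
    """Run one model update and one query search side by side."""
    # The copy stands in for serializing the model once before the search.
    snapshot = copy.deepcopy(model)
    update_future = pool.submit(update_model, model, answers)
    search_future = pool.submit(search_queries, snapshot, candidates)
    return update_future.result(), search_future.result()


model = {"weight": 1.0, "n_answers": 0}
with ThreadPoolExecutor(max_workers=2) as pool:
    new_model, best_query = step(
        model, answers=[1, 0, 1], candidates=[3, 7, 5], pool=pool
    )
```

Because the search only touches the snapshot, its result does not depend on how far the update has progressed. That is the property an always-current actor would not give.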
This PR also does the following:
Reference issues/PRs
This PR will close #61 and will close #35.