Closed by geohacker 1 year ago
@willemarcel and I looked at this together today. As @sunu found the other day, this happens when the Milvus querynodes hit OOM.
We will reduce the volume of requests coming from the frontend, but in the meantime, @sunu, let's figure out how to optimise or increase resources on the cluster.
Last week I bumped the Milvus querynode memory limit from 12GB to 15GB to avoid OOMs, but it looks like we need even more memory.
@geohacker @willemarcel Do we have an estimate of how much memory it should need? If not, we can set the querynode memory limit very high and then run some memory-intensive queries to observe how much memory it actually consumes. That should give us a good idea of what the ideal limit should be.
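To make the "observe, then pick a limit" idea concrete, here is a minimal sketch of turning observed readings into a suggested limit. It is not from the project: the function name `suggest_memory_limit_gb`, the 25% headroom, and the sample readings are all assumptions for illustration. Real readings would come from whatever monitoring the cluster already has.

```python
import math


def suggest_memory_limit_gb(samples_gb, headroom=0.25):
    """Suggest a memory limit (GB) from observed usage plus a safety margin.

    samples_gb: memory readings in GB taken while running heavy queries.
    headroom:   fraction added on top of the peak (0.25 means 25%); this
                default is an assumption, not a Milvus recommendation.
    """
    if not samples_gb:
        raise ValueError("need at least one memory sample")
    if headroom < 0:
        raise ValueError("headroom must be non-negative")
    peak = max(samples_gb)
    # Round up to a whole GB so the limit never lands below peak + margin.
    return math.ceil(peak * (1 + headroom))


# Hypothetical readings (GB) collected during a load test.
observed = [9.5, 13.2, 16.0, 14.1]
print(suggest_memory_limit_gb(observed))
```

Sizing from the peak rather than the average seems the safer choice here, since an OOM is triggered by the worst moment, not by typical usage.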
This is now resolved through better scaling and resource management for Milvus.
When there are many requests from the frontend at the same time, I'm seeing this in the API logs:
That looks like it's coming from https://github.com/developmentseed/bioacoustics-api/blob/main/bioacoustics/milvus/views.py#L77
cc @willemarcel