Open falk-stefan opened 2 years ago
You could try using the dockerized version if GCP allows.
Also, I guess GCP should have access to some temporary file systems, so if you pass those paths it should work.
Hi and thanks for the quick response!
The problem with Docker here is cost. I want to keep cost down if possible. I think the cheapest solution would be using Cloud Run plus Google Storage.
The temporary file system is limited to 8GB, so unfortunately that's not an option either.
So, maybe that's a feature request? Make AnnLite flexible enough to run with GCP or AWS buckets?
We are trying to make some optimizations in terms of space, but I'm not sure they will be enough. How many documents do you expect to index? How much data do you use? Maybe you can use another type of Indexer that keeps them in memory?
I don't know for sure yet, but it's going to be in the tens of millions. Keeping it in memory is probably not feasible in this case. However, I figure I'll probably have to go with a dockerized + volume mount approach. Cloud Run is stateless, so it's probably not what I want after all.
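A rough back-of-envelope estimate of why memory is a concern at this scale. This is only a sketch: the embedding dimension (768) and float32 storage are assumptions, since neither is stated in this thread, and the figure covers raw vectors only, not index overhead or metadata.

```python
# Back-of-envelope size of the raw embedding vectors alone.
# ASSUMPTIONS (not from this thread): 768-dimensional embeddings stored as float32.

GIB = 1024 ** 3


def raw_vector_bytes(num_docs: int, dim: int, bytes_per_value: int = 4) -> int:
    """Bytes needed to hold num_docs dense vectors of the given dimension."""
    return num_docs * dim * bytes_per_value


if __name__ == "__main__":
    for num_docs in (10_000_000, 50_000_000):
        size = raw_vector_bytes(num_docs, dim=768)
        print(f"{num_docs:,} docs: {size / GIB:.1f} GiB of raw vectors")
```

Under these assumptions, even the low end of "tens of millions" is well above the 8GB temporary file system limit mentioned earlier.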
Speaking of Indexer... would you say that PQLiteIndexer is the weapon of choice here? It looks neat to me because I am going to have metadata, which should allow me to filter before running the vector-based search.
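To make "filter before running the vector-based search" concrete, here is a minimal brute-force sketch in plain Python. It is not AnnLite's or PQLiteIndexer's API; the document layout (`id`, `meta`, `vec`) and the `search` helper are hypothetical and only illustrate the idea of narrowing candidates by metadata first and ranking the survivors by cosine similarity.

```python
import math
from typing import Any

# Hypothetical document layout for illustration: {"id": ..., "meta": {...}, "vec": [...]}
Doc = dict[str, Any]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(docs: list[Doc], query: list[float], where: dict[str, Any], limit: int = 3) -> list[Doc]:
    """Keep only docs whose metadata matches `where`, then rank them by similarity."""
    candidates = [
        d for d in docs
        if all(d["meta"].get(key) == value for key, value in where.items())
    ]
    candidates.sort(key=lambda d: cosine(d["vec"], query), reverse=True)
    return candidates[:limit]


docs = [
    {"id": "a", "meta": {"lang": "en"}, "vec": [1.0, 0.0]},
    {"id": "b", "meta": {"lang": "de"}, "vec": [0.9, 0.1]},
    {"id": "c", "meta": {"lang": "en"}, "vec": [0.0, 1.0]},
]

hits = search(docs, query=[1.0, 0.0], where={"lang": "en"})
print([d["id"] for d in hits])
```

The point of filtering first is that the similarity computation only touches documents that can appear in the result at all; a real indexer would do this with an index rather than a linear scan.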
Yes, AnnLiteIndexer is a good weapon of choice. (Please note that PQLiteIndexer was renamed to ANNLiteIndexer, and the proper Executor being updated is AnnLiteIndexer.) The good thing is that many of these indexers can be swapped easily in a plug-and-play fashion.
@falk-stefan Hi, Nicholas from Jina AI here. I'd love to set up a chat with you to learn more about your use case and how we can help. Are you in our community Slack channel? Or is there a more convenient way I can get in touch with you?
Hi!
I am currently taking a look at jina-ai. The plan is to get a simple text-based document search going, and so far I've managed to build a simple demo locally which uses the PQLiteIndexer (based on AnnLite). The next step would be for me to see how I can deploy a prototype to Google Cloud Platform (GCP) and, if possible, use Cloud Run in order to keep costs to a minimum.
However, since AnnLite requires access to a local file-system I am not sure if that's possible. I intended to use Cloud Storage but it seems AnnLite would not support this.
What options do I have here?