mosuka / phalanx

Phalanx is a cloud-native distributed search engine that provides endpoints through gRPC and traditional RESTful API.
Apache License 2.0

Support Google Cloud Storage as Index Store #18

Open mosuka opened 2 years ago

petergeorgas commented 2 years ago

I am fairly new to Go, but a task like this looks interesting. @mosuka, in your opinion, how big an undertaking is this ticket? I have little experience with GCP, but this feels like a great opportunity to get more comfortable with both Go and GCP.

mosuka commented 2 years ago

@petergeorgas Thank you for your interest. I don't think it is very difficult. You can use the S3 implementation as a reference.

https://github.com/mosuka/phalanx/blob/main/clients/s3.go
https://github.com/mosuka/phalanx/blob/main/directory/directory_s3.go

I'm planning to implement it and run the integration tests against fake-gcs-server first.
https://github.com/fsouza/fake-gcs-server

I am looking forward to your pull request.

petergeorgas commented 2 years ago

How do you want to deal with credentials?

https://pkg.go.dev/cloud.google.com/go?utm_source=godoc#hdr-Authentication_and_Authorization

mosuka commented 2 years ago

To start with, I think we can use Google Application Default Credentials (ADC). If the environment variable GOOGLE_APPLICATION_CREDENTIALS is set, I want to read the credential file it points to.

In Phalanx, each resource is expressed by a URI. For GCS, I want the URI to have the following format.

gs://<BUCKET_NAME>/<PATH_TO_INDEX_DIR>/<INDEX_NAME>

I also want to be able to specify credentials as a URI parameter, as with the other storage backends.

gs://<BUCKET_NAME>/<PATH_TO_INDEX_DIR>/<INDEX_NAME>?application_credentials=<PATH_TO_APPLICATION_CREDENTIALS>
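To make the proposed URI format concrete, here is a minimal sketch of how such a URI could be split into its parts with Go's standard `net/url` package. The type `gcsLocation`, the function `parseGCSURI`, and the example bucket and paths are all hypothetical; only the `gs` scheme and the `application_credentials` parameter come from the format above. This is not how Phalanx actually parses its URIs.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// gcsLocation is a hypothetical holder for the parts of a gs:// index URI.
type gcsLocation struct {
	Bucket          string // <BUCKET_NAME>
	Path            string // <PATH_TO_INDEX_DIR>/<INDEX_NAME>, without the leading slash
	CredentialsFile string // value of application_credentials; empty if not given
}

// parseGCSURI is a hypothetical helper that validates the scheme and
// extracts the bucket, object path, and optional credentials parameter.
func parseGCSURI(raw string) (gcsLocation, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return gcsLocation{}, err
	}
	if u.Scheme != "gs" {
		return gcsLocation{}, fmt.Errorf("unsupported scheme %q, want \"gs\"", u.Scheme)
	}
	if u.Host == "" {
		return gcsLocation{}, fmt.Errorf("missing bucket name in %q", raw)
	}
	return gcsLocation{
		Bucket:          u.Host,
		Path:            strings.TrimPrefix(u.Path, "/"),
		CredentialsFile: u.Query().Get("application_credentials"),
	}, nil
}

func main() {
	// Example values are made up for illustration.
	loc, err := parseGCSURI("gs://my-bucket/indexes/wikipedia?application_credentials=/etc/phalanx/gcp.json")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("bucket=%s path=%s credentials=%s\n", loc.Bucket, loc.Path, loc.CredentialsFile)
}
```

When `CredentialsFile` comes back empty, a caller could fall back to ADC, which matches the order of preference described in this comment.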

You can refer to the S3 documentation here.

https://github.com/mosuka/phalanx/blob/main/docs_md/index_store.md#amazon-s3