Closed TJMaria closed 4 months ago
🤖
Hello,
To configure the Google Vertex AI Matching Engine in your Node.js app deployed in project A to locate the indexEndpoint in a different project, project B, you need to ensure that the service account used for authentication in project A has the necessary permissions to access the resources in project B.
This can be achieved by adding the service account of project A to project B with the appropriate roles. Once this is done, you can specify the index and indexEndpoint of the Matching Engine in project B when initializing the MatchingEngine object in your Node.js app.
Here's how you can do it:
1. Add the service account of project A to project B with the necessary roles. This can be done in the Google Cloud Console.
2. In your Node.js app, when initializing the MatchingEngine object, specify the index and indexEndpoint of the Matching Engine in project B.
const config: MatchingEngineArgs = {
  index: process.env.GOOGLE_VERTEXAI_MATCHINGENGINE_INDEX!, // Index in Project B
  indexEndpoint: process.env.GOOGLE_VERTEXAI_MATCHINGENGINE_INDEXENDPOINT!, // Index Endpoint in Project B
  apiVersion: "v1",
  docstore: store,
  authOptions: {
    keyFilename: '/path/to/keyfile.json', // Path to the service account key file
    scopes: ['https://www.googleapis.com/auth/cloud-platform'] // Scopes for the service account
  }
};
const engine = new MatchingEngine(embeddings, config);
In the above code, GOOGLE_VERTEXAI_MATCHINGENGINE_INDEX and GOOGLE_VERTEXAI_MATCHINGENGINE_INDEXENDPOINT should be set to the index and indexEndpoint of the Matching Engine in project B. The authOptions should be set with the path to the service account key file and the necessary scopes.
Please note that the service account used for authentication should have the necessary permissions to access the resources in both project A and project B.
You can find more information about this in the Google Vertex AI integration documentation and the Google Vertex AI Matching Engine code in the langchainjs repository.
I hope this helps! If you have any other questions, feel free to ask.
Best, Dosu
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
🤖
Hello,
Based on your code snippet and the information you provided, here are the answers to your questions:
1. Yes, you do need to pass the service account key file in the authOptions when initializing the MatchingEngine object. This is because the MatchingEngineArgs interface extends GoogleVertexAIConnectionParams<GoogleAuthOptions>, and the GoogleAuthOptions is used in the GoogleAuth constructor in each of the connection classes (IndexEndpointConnection, RemoveDatapointConnection, UpsertDatapointConnection, FindNeighborsConnection). This is how the Google Cloud SDK authenticates your requests.
2. The index and indexEndpoint are only the IDs, not the entire path including the projectID. This is evident from the buildUrl methods in the connection classes, where the projectId is fetched separately and then combined with the index or indexEndpoint to form the full URL.
3. The service account will not by default use the correct project. The projectId is fetched using the getProjectId method of the GoogleAuth client, which will return the project ID that the service account is associated with. If you want to use a different project, you would need to use a service account that is associated with that project, or manually specify the project ID when forming the URL.
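To make points 2 and 3 concrete, here is a minimal sketch of how a request URL ends up pointing at a particular project when the project ID comes from the auth client and the endpoint is given as a bare ID. The names `EndpointLocation` and `buildIndexEndpointUrl` are hypothetical and this is not the langchainjs `buildUrl` code; only the general Vertex AI REST URL layout is assumed.

```typescript
// Hypothetical illustration; not the langchainjs implementation.
interface EndpointLocation {
  projectId: string; // as resolved by the auth client (e.g. via getProjectId)
  location: string; // region, e.g. "us-central1"
  apiVersion: string; // "v1" or "v1beta1"
  indexEndpoint: string; // bare endpoint ID, without a "projects/..." prefix
}

// Combine the separately resolved project ID with the bare endpoint ID.
function buildIndexEndpointUrl(loc: EndpointLocation): string {
  const host = `https://${loc.location}-aiplatform.googleapis.com`;
  const path =
    `projects/${loc.projectId}/locations/${loc.location}` +
    `/indexEndpoints/${loc.indexEndpoint}`;
  return `${host}/${loc.apiVersion}/${path}`;
}

// If the auth client resolves project A, the URL targets project A,
// even when the endpoint ID belongs to project B.
const exampleUrl = buildIndexEndpointUrl({
  projectId: "project-a",
  location: "us-central1",
  apiVersion: "v1",
  indexEndpoint: "1234567890",
});
```

Under these assumptions, the only way to reach an endpoint in project B is for the resolved project ID itself to be project B.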
Please note that the above information is based on the current implementation of the MatchingEngine and related classes in the LangChain JavaScript framework. If you encounter any issues or have further questions, feel free to ask.
Best regards.
@dosu-bot Ok, so I HAVE TO pass the service account key file? Linking the service account in the GCP project is not enough?
And the index and indexEndpoint are only the IDs, correct? Not the entire path including the projectID? The service account should by default use the correct project?
🤖
Hello,
To configure the Google Vertex AI Matching Engine to locate the indexEndpoint in a different project, Project B, you need to ensure that the service account linked to your Node.js application has the necessary permissions to access resources in Project B.
You can pass the service account key file in the authOptions when creating an instance of the MatchingEngine class. The authOptions should include the path to the service account key file.
The index and indexEndpoint should be specified in the format projects/{project}/locations/{location}/indexes/{index} and projects/{project}/locations/{location}/indexEndpoints/{indexEndpoint} respectively.
However, the MatchingEngine class in the langchainjs codebase does not accept a projectId in the authOptions to override the default project used by the service account. The projectId is retrieved from the credentials object within the getProjectId method of the WebGoogleAuth class. There is no provision in the provided code to override the projectId using authOptions.
Here is an example of how you can do this:
import { MatchingEngine } from "langchain/vectorstores/googlevertexai";
import { Embeddings } from "langchain/embeddings/base";
import { Docstore } from "langchain/schema";

// Placeholders: Embeddings and Docstore are abstract base classes, so supply
// the concrete embeddings and docstore instances your app already uses.
declare const embeddings: Embeddings;
declare const docstore: Docstore;

// Specify the authOptions with the service account key file
const authOptions = {
  keyFilename: '/path/to/service-account-key.json', // Path to the service account key file
};

// Specify the index and indexEndpoint
const index = 'projects/project-b/locations/location/indexes/index';
const indexEndpoint = 'projects/project-b/locations/location/indexEndpoints/indexEndpoint';

// Initialize the MatchingEngine
const matchingEngine = new MatchingEngine(embeddings, {
  docstore,
  authOptions,
  index,
  indexEndpoint,
});
In this example, replace /path/to/service-account-key.json with the path to your service account key file, project-b with the ID of Project B, location with the location of your resources, index with the ID of your index, and indexEndpoint with the ID of your index endpoint.
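Replies in this thread disagree on whether MatchingEngine expects a bare ID or a full resource name, so it can help to split a full name into its parts and pass whichever form turns out to be required. The sketch below does only that; `parseIndexEndpointName` and its return type are hypothetical helpers, not part of langchainjs.

```typescript
// Hypothetical helper; not part of langchainjs.
interface IndexEndpointName {
  project: string;
  location: string;
  indexEndpoint: string;
}

// Split "projects/{project}/locations/{location}/indexEndpoints/{id}"
// into its components. Returns null for anything else, e.g. a bare ID.
function parseIndexEndpointName(name: string): IndexEndpointName | null {
  const pattern =
    /^projects\/([^/]+)\/locations\/([^/]+)\/indexEndpoints\/([^/]+)$/;
  const match = pattern.exec(name);
  if (match === null) {
    return null;
  }
  return {
    project: match[1],
    location: match[2],
    indexEndpoint: match[3],
  };
}

const parsedName = parseIndexEndpointName(
  "projects/project-b/locations/us-central1/indexEndpoints/1234567890"
);
```

A null result signals that the value is already a bare ID (or malformed), which makes it easy to log which form the app was actually configured with.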
In the context shared, the GoogleVertexAI class in the langchain/src/llms/googlevertexai/index.ts file also uses the GoogleAuth class from the google-auth-library package to authenticate requests to the Vertex AI API. This class accepts an options object that can include the keyFilename and projectId properties, similar to the MatchingEngine class. This means that you can use the same authOptions object to authenticate requests to the Vertex AI API when using the GoogleVertexAI class.
Please let me know if you have any other questions.
AFAIK, Vertex AI store in langchainjs is a pure Node.js implementation, which means that it depends on the google-auth-library package. By default, the google-auth-library package uses the service account for the resource that the Node.js application is deployed on.
There are two ways to override the default service account:
@kevsjh Thanks for your input, I also spotted your issue.
When working locally I was using the gcloud CLI to authenticate (using my own account):
gcloud auth application-default login
But now that I'm struggling to get it working in the deployed app, I decided to try it locally with a keyfile from the service account of project B (where the indexEndpoint is located). But I also can't seem to get that to work.
I'm using a JSON keyfile path in GOOGLE_APPLICATION_CREDENTIALS.
I get a 401 error, even if the service account has the role of owner:
{
  code: 401,
  message: 'Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.',
  errors: [
    {
      message: 'Invalid Credentials',
      domain: 'global',
      reason: 'authError',
      location: 'Authorization',
      locationType: 'header'
    }
  ],
  status: 'UNAUTHENTICATED'
}
Only when passing a keyfile from my personal account, generated by gcloud auth application-default login, does GOOGLE_APPLICATION_CREDENTIALS work for me.
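One thing worth checking in this situation is which kind of credential file GOOGLE_APPLICATION_CREDENTIALS actually points to. A downloaded service account key has "type": "service_account", while the file written by gcloud auth application-default login has "type": "authorized_user". The sketch below only classifies the file contents; `credentialKind` is a hypothetical helper, and it does not by itself explain the 401.

```typescript
// Hypothetical helper for inspecting a credentials JSON file's contents.
type CredentialKind = "service_account" | "authorized_user" | "unknown";

function credentialKind(json: string): CredentialKind {
  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    return "unknown"; // not valid JSON at all
  }
  if (typeof parsed !== "object" || parsed === null) {
    return "unknown";
  }
  const type = (parsed as { type?: unknown }).type;
  if (type === "service_account") {
    return "service_account";
  }
  if (type === "authorized_user") {
    return "authorized_user";
  }
  return "unknown";
}
```

In a Node.js app the input would typically be the file contents read from the path in GOOGLE_APPLICATION_CREDENTIALS; logging the result at startup shows whether the expected service account key is in use.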
@kevsjh
When I try using the cross-account approach, the request is made against the wrong project.
So then I get a 403 because the resource doesn't exist.
As far as I can see, the authOptions are not being passed to the indexEndpointClient from the MatchingEngine constructor: https://github.com/langchain-ai/langchainjs/blob/4f56cf7ceb7f1abfc3f903c9d8a9bdb0c3d5b176/langchain/src/vectorstores/googlevertexai.ts#L389
That creates a client with new (empty) authOptions: https://github.com/langchain-ai/langchainjs/blob/4f56cf7ceb7f1abfc3f903c9d8a9bdb0c3d5b176/langchain/src/vectorstores/googlevertexai.ts#L65
It seems that in the updated version the Google auth options are passed into the class correctly, but both the Google auth docs and the langchainjs docs on Vertex AI authentication seem a little lacking.
The below should be the correct parameters to pass in.
const engine = new MatchingEngine(embeddings, {
  index: process.env.GOOGLE_VERTEXAI_MATCHINGENGINE_INDEX!,
  indexEndpoint: process.env.GOOGLE_VERTEXAI_MATCHINGENGINE_INDEXENDPOINT!,
  apiVersion: "v1beta1",
  docstore: store,
  location: 'us-central1',
  authOptions: {
    scopes: ['https://www.googleapis.com/auth/cloud-platform'],
    projectId: process.env.GOOGLE_APPLICATION_PROJECT_ID,
    credentials: {
      type: 'service_account',
      client_email: process.env.GOOGLE_APPLICATION_CLIENT_EMAIL,
      private_key: process.env.GOOGLE_APPLICATION_PRIVATE_KEY
    }
  },
});
I have tested the above in a Next.js 14 App Router API route on my local machine, both after gcloud auth logout and while logged in with gcloud auth but set to a different project ID, to validate that the service account impersonation is invoked. So this should work in production as well, as long as the correct env vars are set.
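Since a missing or empty env var would otherwise surface only as an auth failure at request time, a small validation step before constructing the engine can help. The following is a sketch, not part of langchainjs: `authOptionsFromEnv` and `ServiceAccountAuthOptions` are hypothetical names, and the env var names are the ones used in the snippet above.

```typescript
// Hypothetical helper; the shape mirrors the authOptions object above.
interface ServiceAccountAuthOptions {
  scopes: string[];
  projectId: string;
  credentials: {
    type: "service_account";
    client_email: string;
    private_key: string;
  };
}

// Takes the environment as a plain record (e.g. process.env in Node.js)
// and fails fast with a readable message if anything is missing.
function authOptionsFromEnv(
  env: Record<string, string | undefined>
): ServiceAccountAuthOptions {
  const required = [
    "GOOGLE_APPLICATION_PROJECT_ID",
    "GOOGLE_APPLICATION_CLIENT_EMAIL",
    "GOOGLE_APPLICATION_PRIVATE_KEY",
  ];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    scopes: ["https://www.googleapis.com/auth/cloud-platform"],
    projectId: env.GOOGLE_APPLICATION_PROJECT_ID as string,
    credentials: {
      type: "service_account",
      client_email: env.GOOGLE_APPLICATION_CLIENT_EMAIL as string,
      private_key: env.GOOGLE_APPLICATION_PRIVATE_KEY as string,
    },
  };
}
```

The returned object could then be passed as authOptions when constructing the MatchingEngine, in place of the inline object.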
Another important note: the value of the private_key environment variable should be wrapped in double quotes "" instead of single quotes ''. You will get the error Error: error:1E08010C:DECODER routines::unsupported if the value is single-quoted.
So the correct env entry is
GOOGLE_APPLICATION_PRIVATE_KEY="-----BEGIN PRIVATE KEY .... -----END PRIVATE KEY-----\n"
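A common defensive step, assuming the DECODER error comes from the key still containing literal backslash-n sequences rather than real line breaks, is to normalize the value in code so it no longer depends on how the env loader treats quotes. `normalizePrivateKey` is a hypothetical helper name.

```typescript
// Hypothetical helper: turn literal "\n" sequences (backslash followed by n)
// into real newline characters. Keys that already contain real newlines
// pass through unchanged.
function normalizePrivateKey(raw: string): string {
  return raw.replace(/\\n/g, "\n");
}

// A shortened, fake key as it might arrive from an unexpanded env value.
const rawKey = "-----BEGIN PRIVATE KEY-----\\nabc\\n-----END PRIVATE KEY-----\\n";
const normalizedKey = normalizePrivateKey(rawKey);
```

The normalized string would then be used as the private_key value in the credentials object.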
I ended up stepping away from LangchainJs and implementing it myself. @kevsjh Thanks for taking a look, but I can't confirm whether this would solve the issues I had.
Hi, @TJMaria,
I'm helping the langchainjs team manage their backlog and am marking this issue as stale. It seems like you were experiencing difficulty configuring the Google Vertex AI Matching Engine to look for the indexEndpoint in a different project than the NodeJs app. Despite several responses providing guidance on how to configure the Matching Engine, it appears that you ultimately decided to implement the solution independently.
Could you please confirm if this issue is still relevant to the latest version of the langchainjs repository? If it is, please let the langchainjs team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!
I'm running into an issue using:
I am deploying a Node.js app in Google Cloud, which should have the right service account and permissions. Let's say my Node.js app is deployed in project A.
My indexEndpoint is deployed in a different project, project B.
I can't seem to find a way to force the MatchingEngine to look for the indexEndpoint in project B. The only place I can pass the projectId is through the authOptions, but that doesn't seem to work.