langchain-ai / langchain-google

Vertex AI [Feature] support Part.from_uri for non-image files #159

Closed stevenaldinger closed 1 month ago

stevenaldinger commented 5 months ago

I'm happy to contribute to the repo btw after getting validation that this makes sense as a feature.

I want to be able to upload text files once and be able to reference them.

Part.from_uri is currently only referenced in the image utils here: libs/vertexai/langchain_google_vertexai/_image_utils.py#L80-L97

The vertexai module itself, however, supports arbitrary MIME types, as in the following example.

import vertexai
from vertexai.generative_models import GenerativeModel, Part

gemini_model = GenerativeModel(model_name="gemini-1.5-pro-preview-0409")

model_response = gemini_model.generate_content([
    "summarize the readme",
    Part.from_uri("gs://example-bucket/some-directory/README.md", "text/markdown"),
])

print(model_response)

lkuligin commented 5 months ago

Sure, please feel free to contribute. That would be a great addition!

tdigangi commented 3 months ago

It looks like this commit fixed the issue opened here.

You can only reference non-image media types via a GCS bucket (a gs:// URI), but it does work.

from langchain_google_vertexai import ChatVertexAI
import pprint

llm = ChatVertexAI(
    model_name="gemini-1.5-flash-001",
    temperature=0,
    max_tokens=None,
    max_retries=6,
    stop=None,
)

p = llm.invoke(
    [
        {
            "role": "user",
            "content": "What does the md file do?",
        },
        {
            "role": "ai",
            "content": "Upload the file",
        },
        {
            "role": "user",
            # A "media" part points at a file in GCS instead of inlining it.
            "content": [
                {
                    "type": "media",
                    "file_uri": "gs://[BUCKET]/README.md",
                    "mime_type": "text/markdown",
                },
            ],
        },
    ]
)

pprint.pprint(p)
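
If you reference several files, it may help to build the "media" parts with a small helper. The sketch below is only an illustration: gcs_media_part and user_message are hypothetical names, not part of langchain-google-vertexai, and the gs:// check simply encodes the observation above that non-image files have to come from a GCS bucket. The {"type": "text", ...} part is the usual LangChain text content block. The bucket name is a placeholder.

```python
def gcs_media_part(file_uri: str, mime_type: str) -> dict:
    """Build a "media" content part for a file stored in a GCS bucket.

    Hypothetical helper: it only mirrors the dict shape used in the
    example above and rejects anything that is not a gs:// URI.
    """
    if not file_uri.startswith("gs://"):
        raise ValueError(f"expected a gs:// URI, got {file_uri!r}")
    return {"type": "media", "file_uri": file_uri, "mime_type": mime_type}


def user_message(text: str, *media_parts: dict) -> dict:
    """Combine a text prompt and any number of media parts into one user message."""
    content = [{"type": "text", "text": text}, *media_parts]
    return {"role": "user", "content": content}


# "example-bucket" is a placeholder bucket name.
message = user_message(
    "What does the md file do?",
    gcs_media_part("gs://example-bucket/README.md", "text/markdown"),
)

# Assuming `llm` is the ChatVertexAI instance from the example above:
# p = llm.invoke([message])
```

This keeps the prompt and the file reference in a single user turn, so the intermediate "ai" message from the example above is not needed.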