Open k-y-le opened 7 months ago
Problem
Hi! I'm working with Clay, a foundation model for Earth Observation (EO) data. I'm exploring what an integration with jupyter-ai would look like, but I have a couple of questions arising from the differences between a vision-based model like Clay and the language models behind all of the integrations currently available in jupyter-ai.
There are a couple of tasks I would like to accomplish with jupyter-ai and Clay, to start. The first is creating embeddings, which is currently possible via an API: the input is a GeoJSON polygon and the output is a list of GeoJSONs.
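To make the shapes concrete, here is a minimal sketch of that first task from a notebook. Everything below is hypothetical: the function name and the stubbed response stand in for the real Clay embeddings API, which would be an HTTP call.

```python
# Hypothetical sketch of the "create embeddings" task: a GeoJSON polygon in,
# a list of GeoJSON features (one per embedding) out. The function name and
# response shape are illustrative, not the actual Clay API.

def create_embeddings(polygon: dict) -> list[dict]:
    """Stand-in for a call to a Clay embeddings endpoint (hypothetical)."""
    assert polygon.get("type") == "Polygon", "input must be a GeoJSON Polygon"
    # A real call would return one GeoJSON feature per embedded image chip;
    # here we fabricate a single feature just to show the expected shape.
    return [{
        "type": "Feature",
        "geometry": polygon,
        "properties": {"embedding_id": "chip-0001"},
    }]

# Example area of interest as a GeoJSON Polygon.
aoi = {
    "type": "Polygon",
    "coordinates": [[[-122.5, 37.7], [-122.4, 37.7],
                     [-122.4, 37.8], [-122.5, 37.8],
                     [-122.5, 37.7]]],
}
features = create_embeddings(aoi)
```

The point is only the contract (Polygon in, Feature list out), which is the part a jupyter-ai integration would need to surface.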
The second task is querying embeddings (similarity search): the user supplies the ID of an embedding (created in the first task, or otherwise) as input to the API, and receives back a list of recommended similar embeddings, which can then be converted to GeoJSON.
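The second task could be sketched the same way: an embedding ID in, a ranked list of similar embeddings out. The function name and the tiny in-memory index below are hypothetical stand-ins for the real similarity-search API.

```python
# Hypothetical sketch of the similarity-search task. A real implementation
# would call the Clay API; a small in-memory index illustrates the
# ID-in, ranked-similar-IDs-out contract.

def query_similar(embedding_id: str, top_k: int = 5) -> list[dict]:
    """Stand-in for a Clay similarity-search call (hypothetical)."""
    # Fake neighbor lists, most similar first, keyed by embedding ID.
    index = {
        "chip-0001": ["chip-0042", "chip-0099", "chip-0007"],
    }
    # Return GeoJSON-convertible records: the ID plus its rank.
    return [{"embedding_id": eid, "rank": i}
            for i, eid in enumerate(index.get(embedding_id, [])[:top_k])]

similar = query_similar("chip-0001", top_k=2)
```

Each result record carries enough (the embedding ID) to be resolved back to a GeoJSON geometry in a follow-up step.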
Proposed Solution
I can think of a few ways to make this happen:
Additional context
I understand that jupyter-ai is meant to be vendor-agnostic, so perhaps the best option is to stick with text-based outputs as much as possible. (Within Clay, we're working on translating EO data into text formats, which would help here, but that work is still a ways off.) Still, I think it would be a loss if we didn't consider the ways in which non-text inputs and outputs could be made available. This is probably part of a larger conversation about how jupyter-ai is set up, so I wanted to open an issue to hear whether there are other thoughts on the subject. Happy to provide more context as needed.