modelcontextprotocol / docs

The documentation and specification for the Model Context Protocol (MCP)
https://modelcontextprotocol.io
Creative Commons Attribution 4.0 International
80 stars 30 forks source link

Docs should clarify how Claude Desktop treats resources #45

Open jackdeansmith opened 1 week ago

jackdeansmith commented 1 week ago

The docs state:

Resources are designed to be application-controlled, meaning that the client application can decide how and when they should be used. For example, one application may require users to explicitly select resources, while another could automatically select them based on heuristics or even at the discretion of the AI model itself.

I don't think it's immediately clear how Claude Desktop treats resources. From playing around with it, it seems like it requires explicit user selection at least for now. I think it would be useful to make this explicit in the documentation so servers authors know what to expect when debugging.

dsp-ant commented 6 days ago

That sounds reasonable. We should make sure it reflects that Claude Desktop is purely an example. There are other MCP clients. Maybe, if you have a chance, would you give it a shot to make that change and send a PR? I'd much appreciate that.

evalstate commented 6 days ago

@dsp-ant. My testing so far shows:

(edit: **all messages can have attachments - I was confused by the UX moving the icon from the left to the right and the slide-out behaviour).

burningion commented 10 hours ago

Just want to comment here and say that Tool Results should be able to pass URI's back to Model Context for inclusion, maybe via another prompt for permission?

evalstate commented 10 hours ago

In the spec, Resource URI's are mandatory in Tool Responses.

https://github.com/modelcontextprotocol/specification/blob/bb5fdd282a4d0793822a569f573ebc36804d38f8/schema/schema.ts#L482-L509

The Client would receive that but it's not specified exactly how the Resource would be presented to the LLM.

The Server can suggest an Audience [ User | Assistant ] and a Priority for each Resource. At the moment I'm interpreting "User" as being "Display to User" and "Assistant" as "Include in Context" with none specified as "Both".

I think a common case may be the Server wanting to return a set of Resource Templates to the Client, so that the User or Assistant can get them on demand if wanted in the next turn. That's obviously not prohibited in the spec.

There's a similar case where the User may wish to upload a binary to send to a tool, for the result to go in context (e.g. Image -> OCR Tool -> Markdown). That's purely a Client implementation question. I'm tempted to tabulate these scenarios/options. There's also what Claude Desktop actually does vs. other Client implementations.

There's a couple of other discussions that seem relevant so I'll paste them here: https://github.com/orgs/modelcontextprotocol/discussions/54 https://github.com/modelcontextprotocol/specification/discussions/90

burningion commented 8 hours ago

Thanks, this is helpful context too.

Now that I've got Resources being returned from my API, I'd ideally have multi-modal instances of my Resource responses.

For example, right now I'm returning Video Files from a search, which are then query able, based upon a separate, analysis pipeline I've built happening in my API.

Ideally I'd be able to pass in a set of embedded images in responses like this:

Screenshot 2024-12-04 at 10 43 06 AM

And show thumbnails inline for each of these detected scenes.

Maybe I'm thinking of a Resource in the wrong way? Or maybe I should include a hierarchy of Resources in my response? IE imply that a set of Resources belong to a single video?

Screenshot 2024-12-04 at 10 45 05 AM

It seems the protocol specifies a single mimetype for each Resource:

Screenshot 2024-12-04 at 10 45 05 AM