AshwinPathi / claude-api-py

Unofficial Python API for Anthropic's Claude LLM
https://pypi.org/project/claude-api-py/
MIT License
108 stars, 11 forks

Support for File Upload #3

Closed Xceron closed 11 months ago

Xceron commented 11 months ago

Hey! Stumbled upon this project, it looks really promising, and the code is very clean, thank you very much for building this!

I wanted to build support for file uploading. For this, I added this into claude_client.py:

    def convert_file(
        self, organization_uuid: str, file_path: str
    ) -> Optional[JsonType]:
        """Uploads a file"""
        CONVERT_DOCUMENT_API_ENDPOINT = "/api/convert_document"
        request_body = {
            "organization_uuid": organization_uuid,
            "file": open(file_path, 'rb'),
        }
        header = {}
        header.update(self._get_default_header())
        header.update({"content-type": "multipart/form-data"})
        response = custom_requests.post(
            self._get_api_url(CONVERT_DOCUMENT_API_ENDPOINT),
            headers=header,
            request_body=request_body,
        )
        if not response.ok:
            return None
        return response.json()

However, this request fails due to the encoding in custom_requests.py#L53.

Could you please nudge me in the right direction: what do I have to change or look further into? I've also noticed that your message endpoint does have support for attachments, but it is passed an empty list every time. Any thoughts on how you'd like a message endpoint that supports files to look?

If you do not wish to have file uploads in your project, just close this issue, no worries.

AshwinPathi commented 11 months ago

Hi @Xceron , thanks for working on this. The empty attachments list is just a placeholder, and I'd like to have attachments in the future.

tbh I haven't experimented with attachments yet, so I'll look into this further and get back to you.

AshwinPathi commented 11 months ago

ok @Xceron turns out it's pretty involved.

In your current code, calling open() on the file just returns a file object rather than the file's contents, and that object can't be encoded into the request body, so I would fix that first.

However, the bigger issue is that the /api/convert_document endpoint actually uses a FormData field, which is a little hard to do with vanilla urllib. Ex.

[image]

If you can somehow make it work with the requests library or if you find a clean way to do this with urllib, I would greatly appreciate that.

I believe once you complete this, the data that /api/convert_document returns can be directly passed into the attachments list and be sent through the API.

An attachment is basically just a JSON that looks like:

{
    "file_name": FILE_NAME,
    "file_type": FILE_TYPE,
    "file_size": FILE_SIZE,
    "extracted_content": RAW_FILE_CONTENTS,
}
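As a rough illustration of that shape, here is a minimal sketch that builds such a dict for a plain-text file. The helper name `build_text_attachment` is hypothetical, and the values chosen for `file_type` (a MIME type string) and `extracted_content` (the decoded text) are assumptions; the thread only confirms the four field names.

```python
import os


def build_text_attachment(file_path: str) -> dict:
    """Build an attachment dict with the four fields listed above.

    Assumptions (not confirmed by the thread): "file_type" is a MIME type
    string and "extracted_content" is simply the file's decoded text.
    """
    with open(file_path, "r", encoding="utf-8") as f:
        content = f.read()
    return {
        "file_name": os.path.basename(file_path),
        "file_type": "text/plain",
        "file_size": os.path.getsize(file_path),  # size in bytes on disk
        "extracted_content": content,
    }
```

This only covers files that are already text; binary formats such as PDF would still need the server-side conversion discussed here.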

ex:

[image]

Xceron commented 11 months ago

Hey, thanks for getting back to me!

I am kinda stuck, let me walk you through the things I did:

The result then looks like this:

    def convert_file(
            self, organization_uuid: str, file_path: str
    ) -> Optional[JsonType]:
        """Uploads a file"""
        payload = {"orgUuid": organization_uuid}

        debug_request_catcher = "test"

        files = [
            ('file', (file_path, open(file_path, 'rb'), 'application/pdf'))  # TODO: Infer mimetype from file extension
        ]
        header = {}
        header.update(self._get_default_header())
        response = requests.request(
            "POST",
            f"https://{debug_request_catcher}.requestcatcher.com/test",
            headers=header,
            data=payload,
            files=files
        )
        if not response.ok:
            return None
        return response.json()

The file uploads successfully to requestcatcher, so the upload itself works. However, if I change the endpoint to Claude, I get permission errors (just like in Postman). So I seem to be missing something. There is also a different project which reverse engineered the API in Node.js, their upload endpoint is here. However, I don't understand Node.js well enough to spot my error.
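For the TODO in the snippet above, one option is the standard-library `mimetypes` module. This is only a sketch: the helper name `guess_mime_type` and the fallback value are illustrative choices, not part of the project.

```python
import mimetypes


def guess_mime_type(file_path: str, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the file extension, falling back to a default."""
    mime_type, _encoding = mimetypes.guess_type(file_path)
    return mime_type or default


print(guess_mime_type("report.pdf"))
```

The result could then replace the hard-coded 'application/pdf' in the files tuple.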

AshwinPathi commented 11 months ago

@Xceron I think the requests library does something under the hood that makes the Claude API reject any requests using it, or at least, that's what I experienced so far.

That's why I made the custom_requests.py library in urllib, since these seemed to bypass whatever the issue with the requests library was (in fact, I used the exact same headers and body for urllib and requests, and requests didn't work and urllib did!).

node.js fetch likely uses a different underlying set of headers/params/etc. I looked into that a bunch when I initially started reverse engineering and that was my conclusion.

Xceron commented 11 months ago

This would at least explain the issues I was facing.

My current workaround is me parsing the contents manually and then adding the transcribed text to the message, seems to work well enough thus far.

AshwinPathi commented 11 months ago

@Xceron that sounds interesting. I tried hacking at this problem as well, and I basically just made my own FormData class as an input parameter to my custom requests post method.

Form data is basically just a special body + a few extra headers on top of POST, so it shouldn't be too difficult to implement with the current framework.

I can try getting an implementation of this soon.
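To make "a special body + a few extra headers" concrete, here is a minimal sketch of multipart/form-data encoding using only the standard library. It is not the FormData class from the project; the function name `encode_multipart` and its signature are hypothetical, and the `orgUuid` field is taken from the earlier snippet, so whether the real endpoint expects exactly these fields is an assumption.

```python
import mimetypes
import os
import uuid
from typing import Dict, Tuple


def encode_multipart(
    fields: Dict[str, str],
    files: Dict[str, Tuple[str, bytes]],
) -> Tuple[bytes, str]:
    """Encode text fields and files as a multipart/form-data body.

    Returns (body, content_type). The boundary is an arbitrary string: it
    only has to match between the Content-Type header and the delimiters
    in the body, and must not occur inside the payload itself.
    """
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}".encode())
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode())
        lines.append(b"")  # blank line separates part headers from content
        lines.append(value.encode("utf-8"))
    for name, (filename, data) in files.items():
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        lines.append(f"--{boundary}".encode())
        lines.append(
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{os.path.basename(filename)}"'.encode()
        )
        lines.append(f"Content-Type: {mime}".encode())
        lines.append(b"")
        lines.append(data)  # raw bytes, no text decoding
    lines.append(f"--{boundary}--".encode())  # closing delimiter
    lines.append(b"")
    body = b"\r\n".join(lines)
    return body, f"multipart/form-data; boundary={boundary}"


body, content_type = encode_multipart(
    {"orgUuid": "hypothetical-org-uuid"},
    {"file": ("notes.txt", b"hello")},
)
```

The returned `body` and `content_type` could then be passed as `data=` and as the `Content-Type` header of a `urllib.request.Request` with `method="POST"`.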

AshwinPathi commented 11 months ago

Hi @Xceron , I managed to implement sending attachments. You can take a look at the changes I made in this commit: ddb21ae.

Since the requests library still doesn't work, I made my own FormData class and manually encoded files. An example use case is as follows:

# all the boilerplate setup....
client = claude_client.ClaudeClient(SESSION_KEY)
organizations = client.get_organizations()
claude_obj = claude_wrapper.ClaudeWrapper(client, organizations[0]['uuid'])

conversation_uuid = claude_obj.get_conversations()[0]['uuid']
claude_obj.set_conversation_context(conversation_uuid)

# Actual attachment sending
attachment = claude_obj.get_attachment('/some/random/attachment.pdf')
response = claude_obj.send_message("", attachments=[attachment])
print(response)

There are some dumb parts about this implementation (especially the part about checking whether or not a file is a text based file). If you think anything can be changed, feel free to open a PR. Also, if you are up for it, maybe you can explore why requests vs urllib seems to have different behavior.

Let me know if this works for you.

Xceron commented 11 months ago

Hey, thanks for coming back to me! The current method does not work for me with any binary file, i.e., PDFs. I went through the code and cannot see any inherent bugs when I traced a request. The only difference between your code and a manual request seems to be the generation of the boundary. Maybe Anthropic changed this on their end? If it works for you: Are you on Linux or macOS? I am using Windows, but did not see any errors indicating an error with the file reading itself.

AshwinPathi commented 11 months ago

@Xceron I've tested on macOS. Do you get a 403 or 500 status code? Also I think the actual boundary text shouldn't matter as long as you place it in the right locations.

macOS for example might have a boundary that looks like ----Webkitxxxxxxxxxx.... but the requests library boundary looks like a hash.

I'll take a closer look.

Xceron commented 11 months ago

I don't get any error at all as I end up getting into this: https://github.com/AshwinPathi/claude-api-py/blob/713e045e53004f3cc6bee7941bb97b1776d49e63/claude/custom_requests.py#L224

AshwinPathi commented 11 months ago

@Xceron I added some additional fields to the Response message, but on my Mac (and on a Linux computer I've also tested this on), it seems to work. I'm not too sure what could be going on without getting detailed logs.

Maybe as a start, you can change the response handling code to always throw an error so you can check out what's going on. I'd go from there to figure out what things might be missing.

https://github.com/AshwinPathi/claude-api-py/blob/713e045e53004f3cc6bee7941bb97b1776d49e63/claude/custom_requests.py#L215

If you need any more information from me, let me know.
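One way to surface what the server actually returns, sketched with plain urllib rather than the project's custom_requests module: raise on any HTTP error and attach the status, headers, and start of the body. The helper names `describe_http_error` and `post_or_raise` are hypothetical and do not exist in the repository.

```python
import urllib.error
import urllib.request


def describe_http_error(err: urllib.error.HTTPError) -> str:
    """Format the status, headers, and first 500 characters of the body."""
    body = err.read().decode("utf-8", errors="replace")
    header_lines = "\n".join(f"{k}: {v}" for k, v in err.headers.items())
    return f"HTTP {err.code} {err.reason}\n{header_lines}\n\n{body[:500]}"


def post_or_raise(request: urllib.request.Request) -> bytes:
    """Send the request; on an HTTP error, raise with diagnostics attached."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        # Fail loudly instead of returning None, so the cause is visible.
        raise RuntimeError(describe_http_error(err)) from err
```

The raised message then shows whether the failure is, for example, a 403 or a 500, and what the server said about it.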

Xceron commented 11 months ago

I cannot reproduce the error with the new version, seems to be fixed. Thank you very much for your work and patience!

AshwinPathi commented 11 months ago

@Xceron Cool, glad to see it worked!