Chainlit / chainlit

Build Conversational AI in minutes ⚡️
https://docs.chainlit.io
Apache License 2.0

Issue in uploading a .docx file and saving the same #516

Closed amani-acog closed 10 months ago

amani-acog commented 10 months ago

I am running Chainlit in a Docker container and have an issue when uploading .docx files. My task needs to save the .docx file and process it with its styles preserved, so I am using the following Python code to do that:


import chainlit as cl
from docx import Document
from docx.enum.text import WD_PARAGRAPH_ALIGNMENT

def copy_docx_with_styles(input_file, output_file):
    doc = Document(input_file)

    # Create a new Document to copy the styles
    new_doc = Document()

    for paragraph in doc.paragraphs:
        new_paragraph = new_doc.add_paragraph()
        new_paragraph.alignment = paragraph.alignment
        for run in paragraph.runs:
            new_run = new_paragraph.add_run(run.text)
            new_run.bold = run.bold
            new_run.italic = run.italic
            new_run.underline = run.underline
            new_run.font.size = run.font.size
            new_run.font.name = run.font.name

    new_doc.save(output_file)

@cl.on_chat_start
async def start():
    file1 = None

    # Wait for the user to upload a file
    while file1 == None:
        file1 = await cl.AskFileMessage(
            content="Please upload a text file to begin!", accept={"text/plain": [".docx"]}, max_files=1
        ).send()
    # Decode the file
    text_file = file1[0]

Please help me fix this issue.

willydouhard commented 10 months ago

text_file is an AskFileResponse, which is a Chainlit abstraction. You probably need to process the bytes of the file (which you can access with text_file.content) instead of the AskFileResponse itself.
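
To illustrate the suggestion above, here is a minimal sketch of writing the uploaded bytes to disk unchanged. It assumes the uploaded object exposes its bytes as `content`, as described in this comment; `save_upload_bytes` and the destination path are hypothetical names for illustration, not part of Chainlit.

```python
from pathlib import Path


def save_upload_bytes(content: bytes, dest: str) -> Path:
    """Write uploaded bytes to disk unchanged (binary write, no decoding)."""
    path = Path(dest)
    # Create the target directory if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(content)
    return path
```

Inside the chat handler this might be called as `save_upload_bytes(text_file.content, "uploads/input.docx")`, after which the saved path can be handed to whatever processes the document.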

amani-acog commented 10 months ago

I want to save the content of the .docx file along with the styles used, so I cannot use text_file.content alone.

willydouhard commented 10 months ago

I don't understand why. Content is not the string content of the file; it is the actual bytes of the file. If you were to save it to disk and open it again with Word, you should see the same file.
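
One way to see why the raw bytes carry the styles: a .docx file is a ZIP package, and its formatting lives in parts inside that package, typically including `word/styles.xml`. The sketch below lists the parts of a package held in memory. `build_demo_package` creates a tiny stand-in archive with docx-like part names (it is not a valid Word file), and both function names are hypothetical.

```python
import io
import zipfile


def list_docx_parts(content: bytes) -> list[str]:
    """Return the names of the parts inside a .docx package held in memory."""
    with zipfile.ZipFile(io.BytesIO(content)) as package:
        return package.namelist()


def build_demo_package() -> bytes:
    """Build a stand-in archive with docx-like part names (not a real Word file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", "<document/>")
        package.writestr("word/styles.xml", "<styles/>")
    return buffer.getvalue()


demo_bytes = build_demo_package()
# Both the text part and the styles part are present in the same byte string.
print(list_docx_parts(demo_bytes))
```

Since python-docx's `Document` accepts a file-like object as well as a path, wrapping the uploaded bytes in `io.BytesIO` should also work without touching the disk.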

vivekjainmaiet commented 7 months ago

@amani-acog Is your issue resolved? Are you able to save the user-uploaded file locally in the Docker container for further processing? Please share your code, as I am facing the same issue.

amani-acog commented 7 months ago

Create a new file in 'wb' mode and write the content to it:

with open('a.txt', 'wb') as f:
    f.write(binary_content)

vivekjainmaiet commented 7 months ago

How do I get the binary_content of the uploaded file? Could you correct it in your code below?

import chainlit as cl
from docx import Document
from docx.enum.text import WD_PARAGRAPH_ALIGNMENT

def copy_docx_with_styles(input_file, output_file):
    doc = Document(input_file)

    # Create a new Document to copy the styles
    new_doc = Document()

    for paragraph in doc.paragraphs:
        new_paragraph = new_doc.add_paragraph()
        new_paragraph.alignment = paragraph.alignment
        for run in paragraph.runs:
            new_run = new_paragraph.add_run(run.text)
            new_run.bold = run.bold
            new_run.italic = run.italic
            new_run.underline = run.underline
            new_run.font.size = run.font.size
            new_run.font.name = run.font.name

    new_doc.save(output_file)

@cl.on_chat_start
async def start():
    file1 = None

    # Wait for the user to upload a file
    while file1 == None:
        file1 = await cl.AskFileMessage(
            content="Please upload a text file to begin!", accept={"text/plain": [".docx"]}, max_files=1
        ).send()
    # Decode the file
    text_file = file1[0]
    op_file = copy_docx_with_styles(text_file, "temp1.docx")
    # text = text_file.content.decode()

    # Let the user know that the system is ready
    await cl.Message(
        content=f"`{text_file.name}` uploaded, it is saved as {op_file} characters!"
    ).send()

amani-acog commented 7 months ago

# Wait for the user to upload a file
while file1 == None:
    file1 = await cl.AskFileMessage(
        content="Please upload a text file to begin!", accept={"text/plain": [".docx"]}, max_files=1
    ).send()

# Decode the file
text_file = file1[0]
text = text_file.content  # this is the binary content

Save this 'text' to a file in the way described above.

vivekjainmaiet commented 7 months ago

@amani-acog Which versions of Python and Chainlit are you using? I am getting this error:

    text = text_file.content # this is binary type of content
           ^^^^^^^^^^^^^^^^^
AttributeError: 'AskFileResponse' object has no attribute 'content'.

vivekjainmaiet commented 7 months ago

@willydouhard I am using the latest version of Chainlit and I cannot access the file path in Docker; it gives only the document name. I am using the code from the example in the Chainlit documentation. Could you please help? Link: https://docs.chainlit.io/api-reference/ask/ask-for-file

Code:

import chainlit as cl

@cl.on_chat_start
async def start():
    files = None

    # Wait for the user to upload a file
    while files == None:
        files = await cl.AskFileMessage(
            content="Please upload a text file to begin!", accept=["text/plain"]
        ).send()

    text_file = files[0]

    with open(text_file.path, "r", encoding="utf-8") as f:
        text = f.read()

    # Let the user know that the system is ready
    await cl.Message(
        content=f"`{text_file.name}` uploaded, it contains {len(text)} characters!"
    ).send()
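
Two different attributes appear in this thread: `content` (the file bytes) in the earlier replies and `path` (the file's location on disk) in the documentation example. A helper that tolerates either might look like the sketch below. `read_upload_bytes` is a hypothetical name, and the assumption that an uploaded object has one of these two attributes comes only from this thread, not from a confirmed API listing. Note that it reads in binary mode, since a .docx file is not UTF-8 text.

```python
from pathlib import Path
from typing import Any


def read_upload_bytes(uploaded: Any) -> bytes:
    """Return the raw bytes of an uploaded file object.

    Assumption: the object has either a `content` attribute (bytes) or a
    `path` attribute (location of the file on disk), as seen in this thread.
    """
    content = getattr(uploaded, "content", None)
    if content is not None:
        return content
    path = getattr(uploaded, "path", None)
    if path is not None:
        # Binary read: no decoding, so the package is returned byte for byte.
        return Path(path).read_bytes()
    raise AttributeError("uploaded object has neither 'content' nor 'path'")
```

The returned bytes can then be written out in 'wb' mode exactly as suggested earlier in the thread.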

willydouhard commented 7 months ago

Does your setup allow Chainlit to write to the Docker volume? Chainlit persists uploaded files to disk while the session is running.
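
A quick way to test the question above from inside the container is to try creating a temporary file in the directory in question. This is a minimal sketch: `is_dir_writable` is a hypothetical helper, and which directory Chainlit actually writes to is not stated in this thread, so the path to check is an assumption you supply yourself.

```python
import os
import tempfile


def is_dir_writable(directory: str) -> bool:
    """Return True if a temporary file can be created in `directory`."""
    try:
        # The file is removed automatically when the context exits.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers a missing directory as well as permission errors.
        return False


# Example: check the current working directory of the running process.
print(is_dir_writable(os.getcwd()))
```

If this returns False for the mounted volume, the container user likely lacks write permission there or the mount is read-only.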