OpenAdaptAI / OpenAdapt

Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
https://www.OpenAdapt.AI
MIT License

Upload recording feature #787

Open KIRA009 opened 3 months ago

KIRA009 commented 3 months ago

What kind of change does this PR introduce?

This PR addresses #724

Summary

This PR adds a script that deploys an app to AWS Lambda. The OpenAdapt app then uses the deployed app to upload zip files of users' recordings.

Checklist

How can your code be run and tested?

From the project root, run python -m scripts.recording_uploader.deploy (make sure the necessary AWS credentials are configured). When the command completes, copy the API URL from its output and set it as the value of RECORDING_UPLOAD_URL in config.py. Then start the app, navigate to a recording detail page, and click the "Upload recording" button. Check the S3 bucket to confirm that the recording has been uploaded as a zip file.
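For reviewers who want to exercise the endpoint without going through the UI, here is a rough sketch of the client-side flow: zip a recording directory, ask the deployed function for a presigned URL, then PUT the archive to it. This is not the code in this PR. The names zip_directory and upload_recording are hypothetical, and the sketch assumes the endpoint answers a GET with JSON of the form {"url": "..."} and that the presigned URL accepts a PUT with an application/zip content type.

```python
"""Sketch: zip a recording directory and upload it via a presigned URL."""

import io
import json
import urllib.request
import zipfile
from pathlib import Path
from typing import Callable


def zip_directory(directory: Path) -> bytes:
    """Return the files under `directory` as an in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(directory.rglob("*")):
            if path.is_file():
                # Store paths relative to the recording directory.
                archive.write(path, path.relative_to(directory).as_posix())
    return buffer.getvalue()


def upload_recording(
    directory: Path,
    upload_api_url: str,
    urlopen: Callable = urllib.request.urlopen,
) -> str:
    """Upload `directory` as a zip file and return the presigned URL used.

    `upload_api_url` stands in for the value of RECORDING_UPLOAD_URL.
    `urlopen` is injectable so the flow can be tested without a network.
    """
    # Step 1: ask the deployed function for a presigned URL.
    # Assumption: the response body is JSON shaped like {"url": "..."}.
    with urlopen(upload_api_url) as response:
        presigned_url = json.loads(response.read())["url"]

    # Step 2: PUT the zip bytes to the presigned URL.
    # Assumption: the URL was signed for a PUT with this content type.
    request = urllib.request.Request(
        presigned_url,
        data=zip_directory(directory),
        method="PUT",
        headers={"Content-Type": "application/zip"},
    )
    with urlopen(request) as response:
        response.read()
    return presigned_url
```

Calling upload_recording(Path("path/to/recording"), RECORDING_UPLOAD_URL) and then checking the bucket should give the same result as clicking the button, provided the assumptions above match the deployed function.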

Other information

abrichr commented 3 months ago

Once we start scaling we should consider supporting B2, e.g. app.py:

"""Lambda-like function for generating a presigned URL for uploading a recording to B2."""

from typing import Any
from uuid import uuid4
import json
from b2sdk.v2 import B2Api, InMemoryAccountInfo

def get_b2_client() -> B2Api:
    """Create and return a B2 client."""
    info = InMemoryAccountInfo()
    b2_api = B2Api(info)
    b2_api.authorize_account("production", "applicationKeyId", "applicationKey")
    return b2_api

def lambda_handler(*args: Any, **kwargs: Any) -> dict:
    """Main entry point for the function."""
    return {
        "statusCode": 200,
        "body": json.dumps(get_presigned_url()),
    }

def get_presigned_url() -> dict:
    """Return a B2 upload URL, its auth token, and a suggested file name."""
    bucket_name = "openadapt"
    b2_api = get_b2_client()
    bucket = b2_api.get_bucket_by_name(bucket_name)
    file_name = f"recordings/{uuid4()}.zip"

    # B2 upload URLs are issued per bucket rather than per file; the client
    # sends the file name (X-Bz-File-Name header) with the upload itself.
    upload = b2_api.session.get_upload_url(bucket.id_)

    return {
        "url": upload["uploadUrl"],
        "upload_auth_token": upload["authorizationToken"],
        "file_name": file_name,
    }

For now let's stick with S3.
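For comparison, the S3 equivalent of the sketch above might look roughly like this. It is an illustration only, not the code in this PR: the bucket name "openadapt" is carried over from the B2 example, the "key" field in the response is an addition for illustration, and the s3_client parameter is a hypothetical hook so the function can be exercised without AWS credentials.

```python
"""Sketch: Lambda-style handler that returns a presigned S3 upload URL."""

import json
from typing import Any
from uuid import uuid4


def get_presigned_url(s3_client: Any = None, bucket_name: str = "openadapt") -> dict:
    """Generate a presigned URL for uploading a recording zip file to S3."""
    if s3_client is None:
        # Imported lazily so the module loads even where boto3 is absent.
        import boto3

        s3_client = boto3.client("s3")

    key = f"recordings/{uuid4()}.zip"
    url = s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket_name, "Key": key},
        ExpiresIn=3600,  # seconds; assumed lifetime, adjust as needed
    )
    return {"url": url, "key": key}


def lambda_handler(*args: Any, **kwargs: Any) -> dict:
    """Main entry point for the function."""
    return {
        "statusCode": 200,
        "body": json.dumps(get_presigned_url()),
    }
```

Unlike the B2 flow, the object key is baked into the signed URL here, so the client only needs the URL to upload.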

KIRA009 commented 3 months ago

Before this is merged, we need to set up the upload URL and update it in config.py (RECORDING_UPLOAD_URL).