OpenAdaptAI / OpenAdapt

Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
https://www.OpenAdapt.AI
MIT License

Implement Publishing #46

Closed abrichr closed 10 months ago

abrichr commented 1 year ago

We would like to make it easy to publish puterbot.db:

python -m puterbot.publish "<task_description>" [<puterbot.db>]
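
A minimal sketch of how that command line could be parsed. It assumes the standard library's argparse rather than python-fire; the function name parse_publish_args and the default path "puterbot.db" are hypothetical and not part of the project.

```python
import argparse


def parse_publish_args(argv=None):
    """Parse: python -m puterbot.publish "<task_description>" [<puterbot.db>]"""
    parser = argparse.ArgumentParser(prog="python -m puterbot.publish")
    parser.add_argument("task_description", help="what the recording demonstrates")
    # The database path is optional; the "puterbot.db" default is an assumption.
    parser.add_argument("db_path", nargs="?", default="puterbot.db")
    return parser.parse_args(argv)


# Example: both arguments given explicitly.
args = parse_publish_args(["Fill out the form", "recordings/puterbot.db"])
print(args.task_description, args.db_path)
```

Keeping the parsing in its own function means the eventual upload step can be swapped out without touching the command-line interface.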

How can we store this in a decentralized way? e.g.:

abrichr commented 1 year ago

Goals:

  1. Make it easy for people with limited experience to get started quickly
  2. Minimize development effort so we can move quickly
  3. Avoid "crypto hype"

Option 1: IPFS + Ethereum Smart Contract

Description: Upload the puterbot.db file to IPFS and use an Ethereum smart contract to associate a task description with the file's content identifier (CID).

Pros:

Cons:

Option 2: IPFS + NFT

Description: Upload the puterbot.db file to IPFS, create an NFT with the file's CID and task description as metadata, and use the NFT to represent ownership and access rights.

Pros:

Cons:

Option 3: Filecoin + Smart Contract

Description: Store the puterbot.db file on Filecoin and use a smart contract to associate a task description with the file's CID.

Pros:

Cons:

Option 4: Arweave + Smart Contract

Description: Store the puterbot.db file on Arweave and use a smart contract to associate a task description with the file's transaction ID.

Pros:

Cons:

Option 5: IPFS without Blockchain or NFTs

Description: Upload the puterbot.db file to IPFS without using blockchain or NFTs. Associate the task description with the file's CID through a centralized or distributed database.

Pros:

Cons:

Option 6: Skynet (Sia)

Description: Store the puterbot.db file on Skynet, a decentralized storage network built on Sia, and associate a task description with the file's link.

Pros:

Cons:
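
What these options share is a content-derived identifier that gets associated with a task description. The sketch below illustrates that pattern locally, with no network or blockchain involved. It uses a plain SHA-256 hex digest as a stand-in for an identifier; a real IPFS CID is encoded differently, so the values will not match. The names content_id and register and the JSON index file are hypothetical.

```python
import hashlib
import json


def content_id(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file (a stand-in for an IPFS CID)."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large database files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def register(index_path, task_description, db_path):
    """Record content id -> task description in a JSON index file."""
    try:
        with open(index_path) as f:
            index = json.load(f)
    except FileNotFoundError:
        index = {}
    cid = content_id(db_path)
    index[cid] = task_description
    with open(index_path, "w") as f:
        json.dump(index, f, indent=2)
    return cid
```

Because the identifier depends only on the file's bytes, publishing the same puterbot.db twice yields the same key, which is the property the IPFS-based options rely on.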

abrichr commented 1 year ago

ChatGPT:

As of my knowledge cutoff date in September 2021, among the options discussed, using IPFS without blockchain or NFTs (Option 5) is likely to be the cheapest and can potentially be free. IPFS allows users to upload and share files in a decentralized manner without incurring transaction fees or storage costs. However, there are some important considerations:

IPFS does not inherently provide data persistence guarantees. This means that if no IPFS node is actively hosting the file (i.e., "pinning" the file), it may become unavailable over time. To ensure data persistence, users may need to set up their own IPFS nodes or use third-party pinning services, which may have associated costs.

While IPFS itself is free to use, third-party pinning services that offer data persistence guarantees may charge fees based on the amount of data stored and the duration of storage.

If data persistence is not a critical requirement, and users are willing to host their own IPFS nodes or rely on the IPFS community to keep the data available, IPFS can be a cost-effective solution.

It's important to note that prices and offerings for decentralized storage solutions may have changed since my knowledge cutoff date. I recommend checking the latest information from the official websites of IPFS and other decentralized storage providers to get up-to-date pricing details and explore any free options that may be available.

https://pypi.org/project/IPFS-Toolkit/

https://github.com/emendir/IPFS-Toolkit-Python/issues/5

abrichr commented 1 year ago

Or just use S3 with e.g. https://github.com/aws/aws-sdk-pandas 😅
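
A rough sketch of what the S3 route might look like, assuming awswrangler (the package published from the aws-sdk-pandas repository) is installed and AWS credentials are already configured. The bucket name, the "recordings/" prefix, and the function names are hypothetical; the upload itself has not been run against a real bucket.

```python
import os


def build_s3_uri(bucket, hostname, dt_str):
    """Build the destination URI using a puterbot.<host>.<datetime>.db name."""
    # The "recordings/" prefix is an assumption made for illustration.
    return f"s3://{bucket}/recordings/puterbot.{hostname}.{dt_str}.db"


def publish_to_s3(db_path, bucket, hostname, dt_str):
    """Upload db_path to S3 and return the destination URI."""
    if not os.path.isfile(db_path):
        raise FileNotFoundError(db_path)
    # Imported lazily so this module loads even without awswrangler installed.
    import awswrangler as wr

    uri = build_s3_uri(bucket, hostname, dt_str)
    wr.s3.upload(local_file=db_path, path=uri)
    return uri
```

Unlike the decentralized options, this requires someone to own and pay for the bucket, but it needs no extra infrastructure beyond an AWS account.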

abrichr commented 1 year ago

Or https://github.com/magic-wormhole/magic-wormhole:

puterbot/share.py:

"""Copy a recording from one computer to another

Usage:

    python -m puterbot.share send --recording_id=1 --output_folder=output
    python -m puterbot.share receive --output_folder=output
"""

import os
import socket
import subprocess

import fire

from puterbot import utils


def send_recording(recording_id, output_folder):
    # Export the recording to a folder
    export_recording_to_folder(recording_id, output_folder)

    # Name the file after the sending host and the current date and time
    hostname = socket.gethostname()
    dt_str = utils.get_now_dt_str()
    recording_file = os.path.join(output_folder, f"puterbot.{hostname}.{dt_str}.db")

    # Delegate to the `wormhole` command-line tool: it prints the wormhole
    # code and blocks until the transfer completes. (The magic-wormhole
    # library API is Twisted-based and has no file-transfer helper.)
    subprocess.run(["wormhole", "send", recording_file], check=True)


def receive_recording(output_folder):
    # Get the wormhole code from the user
    code = input("Enter the wormhole code: ")

    # Receive into output_folder, keeping the filename provided by the sender
    os.makedirs(output_folder, exist_ok=True)
    subprocess.run(
        ["wormhole", "receive", "--accept-file", code],
        cwd=output_folder,
        check=True,
    )


# Create a command-line interface using python-fire; the keys match the
# `send` and `receive` commands shown in the usage above
if __name__ == "__main__":
    fire.Fire({"send": send_recording, "receive": receive_recording})

puterbot/utils.py:

import datetime

from puterbot import config

def get_now_dt_str(dt_format=config.DT_FMT):
    """
    Get the current date and time as a formatted string.

    Args:
        dt_format (str): The format to use for the date and time string.

    Returns:
        str: The current date and time formatted as a string.
    """
    # Get the current date and time
    now = datetime.datetime.now()

    # Format the date and time according to the specified format
    dt_str = now.strftime(dt_format)

    return dt_str

puterbot/config.py:

DT_FMT = "%Y-%m-%d_%H-%M-%S"

Mustaballer commented 1 year ago

Hi @abrichr, it seems like the export_recording_to_folder(recording_id, output_folder) function is not defined in the share.py file. Can you please clarify where this function is imported from or provide the code that defines it? Thank you.

abrichr commented 1 year ago

@Mustaballer none of this has been implemented; the only place this code lives so far is this issue, and all of it is untested.

If you want to include it in your PR (or better yet create a new one) that would be great!