jmikedupont2 / https-lablab.ai-event-llama-2-hackathon-with-clarifai-introspector

https://lablab.ai/event/llama-2-hackathon-with-clarifai/introspector
MIT License

wiki pull #2

Open jmikedupont2 opened 10 months ago

jmikedupont2 commented 10 months ago

Here is a high-level workflow to translate content from existing wikis into the collaborative workflow platform:

  1. Extract wiki content via API or scraping

    • For MediaWiki-based wikis, use the MediaWiki API to fetch page content
    • For other wiki platforms, scrape the rendered pages
  2. Parse wiki markup to HTML

    • Use a library such as mwparserfromhell for MediaWiki wikitext (it produces a parse tree, so rendering to HTML is a separate step)
    • Custom parsers for other formats
  3. Convert HTML to document model

    • Extract key content blocks
    • Map HTML structures to document model
  4. Enrich document model

    • Detect entities, topics, keywords
    • Link related content
  5. Generate workflow steps

    • Identify workflow tasks based on page sections
    • Parameterize tasks based on page details
    • Connect tasks based on logic flow
  6. Create workflow definition

    • Map generated tasks to workflow operators
    • Set dependencies between operators
    • Parameterize workflow based on page metadata
  7. Execute workflow on new content

    • Run workflow on unconverted wiki pages
    • Generate output in the new system's format
  8. Refine workflow based on results

    • Manual review of workflow output
    • Iterate on operators and parameters
    • Retrain entity/topic models
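
Steps 5 and 6 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the platform's actual API: the `Task` dataclass, the `build_workflow` function, and the sample page are hypothetical, and the "logic flow" is assumed to be linear, so each task depends on the task generated from the preceding section.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """One hypothetical workflow task derived from a wiki page section."""
    name: str
    params: dict = field(default_factory=dict)
    depends_on: list = field(default_factory=list)


def build_workflow(page_title, sections):
    """Turn (heading, body) pairs into tasks chained in page order.

    Assumption: the flow is linear, so each task depends only on the
    task generated from the section before it.
    """
    tasks = []
    for heading, body in sections:
        task = Task(
            # Task name is derived from the section heading.
            name=heading.strip().lower().replace(" ", "_"),
            # Parameters come from page metadata and section details.
            params={"page": page_title, "text": body.strip()},
            depends_on=[tasks[-1].name] if tasks else [],
        )
        tasks.append(task)
    return tasks


# Illustrative input; a real run would use sections parsed from a wiki page.
workflow = build_workflow(
    "Deployment guide",
    [("Install tools", "Install git."), ("Run build", "Run make.")],
)
for task in workflow:
    print(task.name, "<-", task.depends_on)
```

A non-linear flow would need explicit dependency information from the page, which this sketch does not attempt to infer.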

Let me know if you need any clarification or have additional requirements for translating wiki content into collaborative workflows!

jmikedupont2 commented 10 months ago

wiki_review.py

import os
import subprocess

from github import Github  # PyGithub module

WIKI_REPO = "jmikedupont2/meta-meme"


def get_wiki_files(repo):
    """Downloads wiki files from repo to a local directory"""
    wiki_dir = f"{repo.name}-wiki"
    if not os.path.exists(wiki_dir):
        # PyGithub exposes no wiki-page API. A GitHub wiki is its own git
        # repository at <owner>/<repo>.wiki.git, so clone that instead.
        wiki_url = f"https://github.com/{repo.full_name}.wiki.git"
        subprocess.run(["git", "clone", wiki_url, wiki_dir], check=True)
    return wiki_dir


def parse_wiki(wiki_dir):
    """Parses wiki markdown files into objects"""
    wikis = []
    for file in os.listdir(wiki_dir):
        if file.endswith(".md"):
            file_path = os.path.join(wiki_dir, file)
            # Parse markdown into page object
            # (both fields were left unspecified in the original)
            page = {
                "title": ...,
                "content": ...,
            }
            wikis.append(page)
    return wikis


def review_page(page):
    """Placeholder: not defined in the original; returns its input unchanged."""
    return page


def combine_reviews(first, second):
    """Placeholder: not defined in the original; pairs the two reviews."""
    return {"muse1": first, "muse2": second}


def run_review(pages):
    """Runs collaborative review workflow on pages"""
    for page in pages:
        # Muse 1 reviews page
        muse1_results = review_page(page)
        # Muse 2 reviews muse1 results
        muse2_results = review_page(muse1_results)
        # Muse 3 synthesizes muse 1 & 2 results
        results = combine_reviews(muse1_results, muse2_results)
        print(results)


if __name__ == "__main__":
    # Get GitHub repo
    g = Github()
    repo = g.get_repo(WIKI_REPO)
    # Download wiki files
    wiki_dir = get_wiki_files(repo)
    # Parse files
    pages = parse_wiki(wiki_dir)
    # Run review workflow
    run_review(pages)
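
The page object in parse_wiki leaves "title" and "content" unspecified. One possible way to fill them is sketched below, assuming the title can be taken from the file name (with hyphens read as spaces) and the content is the raw markdown text. `load_page` is a hypothetical helper, not part of the script above, and the sample file is made up for the demonstration.

```python
import os
import tempfile


def load_page(file_path):
    """Read one wiki markdown file into a page dict.

    Assumption: the title is the file name without its extension,
    with hyphens turned back into spaces.
    """
    name = os.path.splitext(os.path.basename(file_path))[0]
    with open(file_path, encoding="utf-8") as f:
        content = f.read()
    return {"title": name.replace("-", " "), "content": content}


# Usage with a throwaway file standing in for a downloaded wiki page.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Agent-API.md")
    with open(path, "w", encoding="utf-8") as f:
        f.write("# Agent API\n\nTasks and steps.\n")
    page = load_page(path)

print(page["title"])
```

If the reviewers need structure rather than raw text, the content could instead be passed through a markdown parser at this point.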

jmikedupont2 commented 10 months ago

Thank you for providing the link to the GitHub Wiki page describing the Agent API. This resource will serve as the foundation for understanding and implementing the Agent API within your workflow.

To recap and integrate this information into your workflow:

  1. Input: GitHub Wiki Page ("wiki")

    • Use the provided GitHub Wiki page (e.g., the description of the Agent API) as the input for your workflow.
  2. Step 1: Extract Key Terms ("terms")

    • Apply natural language processing techniques to extract key terms from the GitHub Wiki page.
    • Identify relevant terms that need further exploration, such as "Agent API," "tasks," "steps," and others.
  3. Step 2: Define Term Definitions and Relations

    • For each extracted term, write a definition and establish its relationships with other terms.
    • Create a structured representation of terms, definitions, and relationships based on the information from the GitHub Wiki.
  4. Step 3: Identify Relevant Sections

    • Determine the sections of the GitHub Wiki page that are relevant to your workflow (e.g., sections about tasks, steps, artifacts).
    • Extract and categorize this information to guide your workflow's automation.
  5. Step 4: Thread-Generated Workflow Creation

    • Take the workflow generation instructions from your discussion thread, based on the items discussed and identified there.
    • Construct workflows using the terms, definitions, and relations extracted from the GitHub Wiki page.
  6. Workflow Construction Process:

    • Retrieve relevant sections from the GitHub Wiki page.
    • Extract key details for each section to populate workflow items.
    • Create or find workflow items that represent tasks and steps.
    • Execute workflow steps based on the constructed workflow items.
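
Steps 1 and 2 above (extracting terms and relating them) could look something like this minimal sketch. It swaps in plain word frequency for real NLP and treats "appears in the same sentence" as the relation; both are assumptions made for illustration, as are the function names, the tiny stop-word list, and the sample text. Term definitions are not covered here.

```python
import re
from collections import Counter
from itertools import combinations

# Small illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "is", "in", "each", "has"}


def extract_terms(text, top_n=3):
    """Return the top_n most frequent words that are not stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [term for term, _ in counts.most_common(top_n)]


def relate_terms(text, terms):
    """Link two terms whenever they occur in the same sentence."""
    relations = set()
    for sentence in re.split(r"[.!?]", text.lower()):
        present = [
            t for t in terms if re.search(rf"\b{re.escape(t)}\b", sentence)
        ]
        relations.update(combinations(sorted(present), 2))
    return sorted(relations)


# Made-up text standing in for the wiki page content.
wiki_text = (
    "The agent runs a task. Each task has steps. "
    "Steps produce artifacts. The agent stores artifacts."
)
terms = extract_terms(wiki_text, top_n=4)
print(terms)
print(relate_terms(wiki_text, terms))
```

The resulting term list and relation pairs form the structured representation that later steps can draw on when building workflow items.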

By leveraging the information from the GitHub Wiki page, you can enhance the accuracy and relevance of your workflow's content generation and automation. Each step in the workflow becomes more informed and precise, contributing to a more effective end result.

As you proceed with the implementation, ensure that you adapt the extraction and representation of terms, definitions, and relationships to match the structure and format of the GitHub Wiki page. This will allow your workflow to seamlessly transform knowledge from the source into actionable tasks and steps.
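
Adapting the extraction to the page's structure, as in Step 3 above, might start from the markdown headings. The sketch below assumes the wiki page uses ATX headings (`#`, `##`, and so on) and that a section is relevant when its heading mentions a keyword; `split_sections`, `relevant_sections`, and the sample page are hypothetical.

```python
import re


def split_sections(markdown):
    """Split markdown into {heading: body} using ATX headings."""
    sections = {}
    heading = None
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)$", line)
        if match:
            heading = match.group(1).strip()
            sections[heading] = []
        elif heading is not None:
            sections[heading].append(line)
    return {h: "\n".join(body).strip() for h, body in sections.items()}


def relevant_sections(sections, keywords):
    """Keep sections whose heading mentions any keyword (case-insensitive)."""
    return {
        h: body
        for h, body in sections.items()
        if any(k.lower() in h.lower() for k in keywords)
    }


# Made-up page standing in for the GitHub Wiki content.
sample = (
    "# Overview\nIntro text.\n"
    "## Tasks\nCreate a task.\n"
    "## Steps\nRun a step.\n"
    "## License\nMIT.\n"
)
picked = relevant_sections(split_sections(sample), ["tasks", "steps", "artifacts"])
print(list(picked))
```

This simple splitter does not skip `#` lines inside fenced code blocks, and a repeated heading overwrites the earlier section, so a real page may need a proper markdown parser.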