
as above, so below
https://src.eco
MIT License

praxis

Praxis is the process by which a theory, lesson, or skill is enacted, embodied, realized, applied, or put into practice.

what we're building

The Praxis swarm is a decentralized, peer-to-peer, always-online, continuously-learning intelligence - and a place to practice computational alchemy. With Hivemind integrated directly into core layers of the model itself, our goal is to build an expert model that is small and simple, easy to parallelize, and performant at a scale of hundreds or thousands of peers. We will do this via a sparse mixture of experts, curated routing, algorithmic switching, and weighted self-modeling of remotely-hosted peers.
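For intuition, a sparse mixture-of-experts layer sends each token through only a small subset of experts, chosen by a learned gate. Below is a minimal sketch of top-k routing in PyTorch; all names, shapes, and hyperparameters are illustrative, not the actual Praxis implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    """Illustrative top-k gate: picks k of num_experts per token."""

    def __init__(self, dim, num_experts, k=2):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.k = k

    def forward(self, x):
        # x: (tokens, dim) -> per-token scores over experts
        logits = self.gate(x)
        weights, indices = torch.topk(logits, self.k, dim=-1)
        # Renormalize so the chosen experts' weights sum to 1
        weights = F.softmax(weights, dim=-1)
        return weights, indices

router = TopKRouter(dim=384, num_experts=8, k=2)
tokens = torch.randn(10, 384)
weights, indices = router(tokens)
print(weights.shape, indices.shape)
```

Each token's output is then a weighted sum of its chosen experts' outputs, so compute per token stays constant even as the number of experts (and peers hosting them) grows.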

architecture

join us

install

Set up a virtual environment:

source make-venv.sh

Alternatively, you may use the VSCode command bar (Ctrl + Shift + P), and choose: Python: Create Environment...

Then, install dependencies:

# Install core model dependencies
pip install -e .

# Install training dependencies
pip install -e .[all]

contribute to the swarm

To donate your compute:

python run.py

To view all supported command-line arguments:

python run.py --help

recommendations

We recommend you use a batch_size of at least 16, if possible:

python run.py --batch_size 16

The reason is that we have implemented an oversampling mechanism, which can expose your model to longer sequences during training (improving generalization and the maximum supported sequence length). This mechanism periodically doubles the sequence length, and it scales quadratically, activating at batch sizes of 1, 4, and 16.
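The idea of periodic length-doubling can be sketched as a simple schedule. This is an illustration only; the interval values and function name are hypothetical, not the actual Praxis training loop:

```python
def oversample_schedule(step, base_seq_len=512, interval=100):
    """Hypothetical sketch: occasionally draw a batch at a doubled
    (or quadrupled) sequence length, so the model periodically sees
    contexts longer than the base training length."""
    if step % (interval * 4) == 0:
        return base_seq_len * 4  # rare: 4x-length batch
    if step % interval == 0:
        return base_seq_len * 2  # occasional: 2x-length batch
    return base_seq_len         # default: base-length batch

lengths = [oversample_schedule(s) for s in range(1, 401)]
print(lengths.count(1024), lengths.count(2048))  # -> 3 1
```

A longer batch uses quadratically more attention compute, which is why a larger base batch_size gives the mechanism more room to work.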

do inference

Send a JSON-encoded payload via POST to:

http://localhost:2100/input

The payload may include any argument supported by the Transformers text generation API.

Example request:

import requests

url = "http://localhost:2100/input"
payload = {"prompt": "Once upon a time, ", "do_sample": True, "temperature": 0.7}

response = requests.post(url, json=payload)

print(response.status_code)
print(response.json())

local web chat (coming soon!)

A chat and swarm management interface will be available at:

http://localhost:2100

to register with transformers

from transformers import AutoConfig, AutoModel, AutoModelForCausalLM, AutoTokenizer
from praxis import PraxisConfig, PraxisForCausalLM, PraxisModel

AutoConfig.register("praxis", PraxisConfig)
AutoModel.register(PraxisConfig, PraxisModel)
AutoModelForCausalLM.register(PraxisConfig, PraxisForCausalLM)

config = PraxisConfig(
    num_embeds=512,
    num_dims=384,
    depth=6,
    num_heads=8,
    device_map="cuda:0",
)

tokenizer_model = "UNSAFE/praxis-4096"
tokenizer = AutoTokenizer.from_pretrained(tokenizer_model)

model = AutoModelForCausalLM.from_config(config)

input_ids = tokenizer.encode("The quick brown fox ", return_tensors="pt")

outputs = model.generate(input_ids, do_sample=True)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# --> The quick brown fox jumped over a lazy dog.

goals

notes, ideas and random things I want to remember

won't do