PygmalionAI / aphrodite-engine

PygmalionAI's large-scale inference engine
https://pygmalion.chat
GNU Affero General Public License v3.0
660 stars, 80 forks

Dockerfile: permission update, configurable build jobs, torch 2.3.0 #465

Closed: theobjectivedad closed 3 weeks ago

theobjectivedad commented 3 weeks ago

This PR fixes #458: under some circumstances, UID 1000 needs to write Triton cache data to /app/aphrodite-engine/.triton, which was previously owned by UID 0.
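To make the ownership problem concrete, here is a minimal sketch of a check one could run inside the container. The `needs_chown` helper is hypothetical and not part of this PR; only the directory path and the UIDs come from the description above.

```shell
# Hypothetical helper (not part of the PR): exits 0 when DIR is NOT owned by UID.
needs_chown() {
    dir=$1
    want_uid=$2
    # `ls -nd` prints numeric IDs; the third field is the owner's UID.
    have_uid=$(ls -nd "$dir" | awk '{print $3}')
    [ "$have_uid" != "$want_uid" ]
}

# Example, assuming the path and UID from #458:
# if needs_chown /app/aphrodite-engine/.triton 1000; then
#     echo "cache dir is not owned by UID 1000"
# fi
```

If the check succeeds, the directory would need its ownership changed (for example with `chown` while still running as root during the image build) before UID 1000 can write to it.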

Additionally, the Dockerfile now accepts a MAX_JOBS build argument that caps the number of parallel ninja workers, which helps on systems with limited RAM. For example, the following command caps the build at 5 parallel jobs and 128 GB of RAM:

docker build . -f docker/Dockerfile \
    --build-arg=MAX_JOBS=5 \
    --memory=128G \
    -t "${APHRODITE_BUILD_TAG}"
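
As a rough way to choose a value for MAX_JOBS, one could derive it from a memory budget. This is only a sketch: the `pick_max_jobs` helper is hypothetical, and the default of 8 GB per compile job is an assumed figure, not something measured for this build.

```shell
# Hypothetical helper: derive a job cap from a memory budget in GB.
# The second argument (GB per job) defaults to 8, which is an assumption.
pick_max_jobs() {
    budget_gb=$1
    gb_per_job=${2:-8}
    n=$(( budget_gb / gb_per_job ))
    # Always allow at least one job.
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Usage with the MAX_JOBS build argument (the tag name is a placeholder):
# docker build . -f docker/Dockerfile \
#     --build-arg=MAX_JOBS="$(pick_max_jobs 40)" \
#     -t my-tag
```

With a 40 GB budget and the assumed 8 GB per job, this yields a cap of 5. The per-job figure should be tuned by watching peak memory during an actual build.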

Finally, I needed to bump to torch==2.3.0 to address the error discussed in #453.

AlpinDale commented 3 weeks ago

Thanks for the PR. The torch bump needs a few more changes; I'll handle it here myself.

theobjectivedad commented 3 weeks ago

> Thanks for the PR. The torch bump needs a few more changes; I'll handle it here myself.

Certainly, I am working on the flash-attn issue mentioned in #453 now and will remove draft status once I've tested the fix. Hopefully this doesn't go too far down the rabbit hole...

AlpinDale commented 3 weeks ago

I bumped the torch version in #467

This PR should be safe now. Would you mind merging the latest dev into your branch?

EDIT: just noticed your PR is targeted at main; can you rebase it onto dev, please?

AlpinDale commented 3 weeks ago

This LGTM. Any other changes you think we need to make, @theobjectivedad?

AlpinDale commented 3 weeks ago

I'll merge this now since we need to make a release ASAP. Thanks for the contribution!