akash-network / awesome-akash

Awesome List of Akash Deployment Examples
Apache License 2.0

add torchbench example: WIP #388

Closed rakataprime closed 1 year ago

rakataprime commented 1 year ago

Work-in-progress PR adding a TorchBench GPU benchmarking SDL.

anilmurty commented 1 year ago

The SDL needs to be updated to include the "vendor" key, as shown here: https://docs.akash.network/testnet/example-gpu-sdls/specific-gpu-vendor. Add:

          attributes:
            vendor:
              nvidia:

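A quick way to catch a missing "vendor" key before deploying might look like the sketch below. It assumes the SDL has already been parsed into a Python dict and that the fragment above sits under profiles.compute.<name>.resources.gpu; the profile name "torchbench" and the helper profiles_missing_gpu_vendor are hypothetical and do not come from the PR.

```python
# Hypothetical in-memory form of an SDL compute profile. In practice this
# dict would come from parsing the SDL YAML file; "torchbench" is a made-up
# profile name used only for illustration.
sdl = {
    "profiles": {
        "compute": {
            "torchbench": {
                "resources": {
                    "gpu": {
                        "units": 1,
                        # YAML "nvidia:" with no value parses to None.
                        "attributes": {"vendor": {"nvidia": None}},
                    }
                }
            }
        }
    }
}


def profiles_missing_gpu_vendor(sdl_doc):
    """Return names of compute profiles that request a GPU but name no vendor."""
    missing = []
    compute = (sdl_doc.get("profiles") or {}).get("compute") or {}
    for name, profile in compute.items():
        gpu = ((profile or {}).get("resources") or {}).get("gpu")
        if gpu is None:
            continue  # this profile does not request a GPU at all
        vendors = (gpu.get("attributes") or {}).get("vendor") or {}
        if not vendors:
            missing.append(name)
    return missing


print(profiles_missing_gpu_vendor(sdl))
```

An empty list means every GPU profile names at least one vendor; any names returned point at profiles that still need the attributes block.
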
rakataprime commented 1 year ago

I have updated the attributes. I think it would be best to prepackage a notebook for the benchmarks so that people just have to run all cells to get the results. We could probably use a shell escape (a leading !) in the first cell, like !run.sh or !python /workspace/benchmark/install.py models hf_bert hf_Bert_large resnet50 tacotron2 && pytest /workspace/benchmark/test_bench.py -k "(hf_bert or hf_bert_Large or resnet50 or tacotron2)" --ignore_machine_config
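As an alternative to a raw shell escape, the first notebook cell could build and run the same two commands from Python. This is only a sketch: the paths, the model names, and the install.py and pytest arguments are copied from the comment above as written and have not been checked against TorchBench. The helpers build_commands and run_benchmarks are hypothetical names, and "python -m pytest" is substituted for the bare pytest call so that both steps use the notebook's interpreter.

```python
import subprocess
import sys

# Paths and model names copied from the thread; treat them as assumptions.
BENCH_DIR = "/workspace/benchmark"
MODELS = ["hf_bert", "hf_Bert_large", "resnet50", "tacotron2"]


def build_commands(models, bench_dir=BENCH_DIR):
    """Return the (install, test) argument lists for the given models."""
    install = [sys.executable, f"{bench_dir}/install.py", "models", *models]
    keyword = "(" + " or ".join(models) + ")"
    test = [
        sys.executable, "-m", "pytest",
        f"{bench_dir}/test_bench.py",
        "-k", keyword,
        "--ignore_machine_config",
    ]
    return install, test


def run_benchmarks(models=MODELS, bench_dir=BENCH_DIR):
    """Run install, then the benchmarks; stop at the first failing step."""
    for cmd in build_commands(models, bench_dir):
        subprocess.run(cmd, check=True)


# In the notebook's first cell one would then call:
# run_benchmarks()
```

Passing the commands as argument lists avoids shell quoting problems with the parenthesised -k expression, and check=True stops the cell if the install step fails instead of running benchmarks against a half-installed model set.
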

That would be the most minimal approach. You could also persist the benchmarks stored as JSON and try to make some pretty plots too, but if time is of the essence I think we could just add jupyter to requirements.txt along with a minimal template notebook.
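If the results are persisted as JSON, a later notebook cell could summarize them before any plotting. The sketch below assumes the file follows the pytest-benchmark JSON layout (a top-level "benchmarks" list whose entries carry "name" and "stats.mean"); whether this PR's setup writes that layout is an assumption. The SAMPLE values are placeholders for illustration only, not measured results, and summarize and format_table are hypothetical helper names.

```python
import json

# Placeholder data in the assumed pytest-benchmark layout.
# These numbers are made up for illustration; they are not real results.
SAMPLE = json.loads("""
{"benchmarks": [
  {"name": "model_a", "stats": {"mean": 2.0}},
  {"name": "model_b", "stats": {"mean": 1.0}}
]}
""")


def summarize(report):
    """Return (name, mean seconds) pairs sorted from fastest to slowest."""
    rows = [(b["name"], b["stats"]["mean"]) for b in report.get("benchmarks", [])]
    return sorted(rows, key=lambda row: row[1])


def format_table(rows):
    """Render the rows as a small fixed-width text table."""
    width = max((len(name) for name, _ in rows), default=4)
    lines = [f"{'name':<{width}}  mean (s)"]
    lines += [f"{name:<{width}}  {mean:.4f}" for name, mean in rows]
    return "\n".join(lines)


print(format_table(summarize(SAMPLE)))

# With a real results file (the file name here is hypothetical):
# with open("benchmarks.json") as fh:
#     print(format_table(summarize(json.load(fh))))
```

The same rows could be handed to a plotting library for the "pretty plots" mentioned above; a text table keeps the template notebook free of extra dependencies.
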

anilmurty commented 1 year ago

That would be great @rakataprime - would you like to add to this PR itself?

anilmurty commented 1 year ago

By the way, if you want to test deployments, you can use one of these client options: https://docs.akash.network/testnet/gpu-testnet-client-instructions. We have a few GPU providers on the testnet now: https://akash.praetorapp.com/provider-status (select "testnet" in the "Network Selection" dropdown to see them).

rakataprime commented 1 year ago

I could do it either in this PR or in a new one. Do you have requirements for what information you want included in that notebook other than the benchmarks, e.g., GitHub username, email, wallet address, etc.?

anilmurty commented 1 year ago

We will already be collecting those details via a Typeform (right, @brewsterdrinkwater?), but it wouldn't hurt to ask for GitHub ID, Discord handle, and wallet address, I think.

anilmurty commented 1 year ago

In fact, I think it may help correlate things for awards.

anilmurty commented 1 year ago

@rakataprime - not sure if you are waiting on a response here, but we're OK either way regarding collecting user info in the Jupyter notebook.

rakataprime commented 1 year ago

I think if we test it and it works well enough we would be ready to merge.

anilmurty commented 1 year ago

Thanks @rakataprime - have you tried this on the testnet? There are 26 GPUs available there right now: https://akash.praetorapp.com/provider-status?chainid=testnet-02

anilmurty commented 1 year ago

Thanks again @rakataprime and thanks @chainzero !