jakestanton2016 closed 1 month ago
Good morning,
Other remarks:
sda        8:0    0 111.8G  0 disk
├─sda1     8:1    0     1G  0 part
└─sda2     8:2    0 110.7G  0 part
nvme1n1  259:0    0 953.9G  0 disk
└─md0      9:0    0   1.4T  0 raid0 /etc/resolv.conf /etc/hostname /dev/termination-log /etc/hosts
nvme0n1  259:1    0 465.8G  0 disk
└─md0      9:0    0   1.4T  0 raid0 /etc/resolv.conf /etc/hostname /dev/termination-log /etc/hosts
This is bad practice on multiple levels:
1. RAID0 generally decreases stability by adding yet another SPOF.
2. RAID0 needs equally sized drives to work properly.
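To put rough numbers on both points, here is a small Python sketch. Only the two drive sizes come from the lsblk output above. The 3% annual failure probability per drive is a purely hypothetical figure, and drive failures are assumed to be independent.

```python
def stripe_failure_probability(p_drive: float, n_drives: int) -> float:
    """Probability of losing a RAID0 array, assuming independent drive failures.

    RAID0 has no redundancy, so losing any single member loses the array.
    """
    return 1.0 - (1.0 - p_drive) ** n_drives


def balanced_stripe_capacity(sizes: list[float]) -> float:
    """Capacity that can be striped evenly across every member."""
    return min(sizes) * len(sizes)


sizes_g = [953.9, 465.8]  # nvme1n1 and nvme0n1, from the lsblk output
p = 0.03                  # hypothetical annual failure probability per drive

print(f"single drive loss probability: {p:.4f}")
print(f"array loss probability:        {stripe_failure_probability(p, len(sizes_g)):.4f}")
print(f"evenly striped region:         {balanced_stripe_capacity(sizes_g):.1f}G "
      f"of {sum(sizes_g):.1f}G")
```

Under these assumptions the array is roughly twice as likely to be lost as a single drive, and only about 931.6G of the combined 1419.7G can be striped across both members.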
curl -sk https://provider.gpu3090.ddns.net:8443/status | jq
{
  "cluster": {
    "leases": 3,
    "inventory": {
      "active": [
        { "cpu": 100, "gpu": 0, "memory": 100663296, "storage_ephemeral": 68157440 },
        { "cpu": 100, "gpu": 0, "memory": 268435456, "storage_ephemeral": 268435456 },
        { "cpu": 100, "gpu": 0, "memory": 100663296, "storage_ephemeral": 6291456 }
      ],
      "available": {
        "nodes": [
          {
            "name": "node1",
            "allocatable": { "cpu": 4000, "gpu": 0, "memory": 12308656128, "storage_ephemeral": 224812917593 },
            "available": { "cpu": 3080, "gpu": 0, "memory": 12087369728, "storage_ephemeral": 224812917593 }
          },
          {
            "name": "node2",
            "allocatable": { "cpu": 8000, "gpu": 1, "memory": 8126627840, "storage_ephemeral": 1349073758432 },
            "available": { "cpu": 5195, "gpu": 0, "memory": 3229161472, "storage_ephemeral": 1348194003168 }
          }
        ]
      }
    }
  },
  "bidengine": { "orders": 0 },
  "manifest": { "deployments": 0 },
  "cluster_public_hostname": "provider.gpu3090.ddns.net",
  "address": "akash16q46mn8tm7vtwn4h9rugr8ptch6shp2qda7y89"
}
This suggests that there is an issue with the NVIDIA drivers, the NVIDIA toolkit, or the K8s plugin. Please refer to https://akash.network/docs/providers/build-a-cloud-provider/gpu-resource-enablement/
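One way to read the status output above: node2 reports 1 allocatable GPU but 0 available, while none of the three active leases uses a GPU. The Python sketch below automates that comparison. The function name is made up for illustration, and counting leased GPUs cluster-wide is a simplification, since the status output does not say which node hosts each lease.

```python
import json


def find_gpu_mismatches(status: dict) -> list[str]:
    """Return names of nodes with fewer free GPUs than the leases can explain.

    Simplification: leased GPUs are summed across the whole cluster, because
    the status output does not map leases to nodes.
    """
    inventory = status["cluster"]["inventory"]
    leased = sum(lease.get("gpu", 0) for lease in inventory.get("active", []))
    mismatched = []
    for node in inventory["available"]["nodes"]:
        allocatable = node["allocatable"].get("gpu", 0)
        available = node["available"].get("gpu", 0)
        if available < allocatable - leased:
            mismatched.append(node["name"])
    return mismatched


# Trimmed copy of the status output above, GPU fields only.
sample = json.loads("""
{"cluster": {"inventory": {
  "active": [{"gpu": 0}, {"gpu": 0}, {"gpu": 0}],
  "available": {"nodes": [
    {"name": "node1", "allocatable": {"gpu": 0}, "available": {"gpu": 0}},
    {"name": "node2", "allocatable": {"gpu": 1}, "available": {"gpu": 0}}
  ]}
}}}
""")

print(find_gpu_mismatches(sample))  # → ['node2']
```

To run it against a live provider, the saved output of the curl command can be loaded with `json.load` in place of the inline sample.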
Please fix these issues and we can move on.
Shimpa
Thanks… working on fixes
I have made the fixes suggested. Lost on-time performance due to downtime. Please re-evaluate. Thank you.
The provider is not responding to orders that include a GPU, even though a GPU is present in the inventory and available.
Requested resources:
- 1 GPU (NVIDIA RTX 3090)
- 1 CPU
- 2 Gi RAM
- 2 Gi storage
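For reference, a request like this can be compared against a node's "available" block with a few lines of Python. This is only a sketch: the units are assumed from the earlier status output in this thread (CPU in millicores, memory and storage in bytes), the function name is hypothetical, and the node values are the pre-fix snapshot, not the provider's current state.

```python
GI = 1024 ** 3  # one gibibyte in bytes

# The request above, converted to the assumed status units.
request = {"cpu": 1000, "gpu": 1, "memory": 2 * GI, "storage_ephemeral": 2 * GI}


def shortfalls(request: dict, available: dict) -> dict:
    """Return {resource: missing_amount} for every resource the node lacks."""
    return {
        name: needed - available.get(name, 0)
        for name, needed in request.items()
        if available.get(name, 0) < needed
    }


# node2 "available" values from the earlier status output (before the fixes).
node2 = {
    "cpu": 5195,
    "gpu": 0,
    "memory": 3229161472,
    "storage_ephemeral": 1348194003168,
}

print(shortfalls(request, node2))  # → {'gpu': 1}
```

An empty result on the current status output would suggest that inventory is not what is stopping the provider from bidding, and that the cause lies elsewhere.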
Please make sure your provider is fully functional.
thanks. Shimpa
The provider is still offline, and it has been quite a long time. Please feel free to reopen if needed.
Prerequisite Steps:
1. Make sure your provider has community provider attributes and your contact details (email, website):
Ref documentation:.
2. Make sure your provider *.ingress resolves to your provider IP (ideally worker node IP)
Example:
3. Please make sure your Akash provider doesn't block any Akash specific ports.
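Steps 2 and 3 can be sanity-checked from an outside machine before opening the audit issue. The Python sketch below is a minimal illustration, not an official tool: the function names are hypothetical, and the default port list (80, 443, 8443, 8444) is an assumption. Only 8443 appears in this thread, so check the provider documentation for the ports your setup actually needs.

```python
import socket


def resolves_to(hostname: str, expected_ip: str) -> bool:
    """True if one of the IPv4 addresses for hostname equals expected_ip."""
    try:
        _, _, addresses = socket.gethostbyname_ex(hostname)
    except OSError:  # covers socket.gaierror and socket.herror
        return False
    return expected_ip in addresses


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or unresolvable
        return False


def preflight(provider_host: str, ingress_probe: str, expected_ip: str,
              ports=(80, 443, 8443, 8444)) -> dict:
    """Run both checks and return the results keyed by check name.

    ingress_probe is any name under the *.ingress wildcard record.
    The default ports are an assumed list, not taken from this thread.
    """
    results = {"ingress_dns": resolves_to(ingress_probe, expected_ip)}
    for port in ports:
        results[f"port_{port}"] = port_open(provider_host, port)
    return results
```

For example, `preflight("provider.example.com", "test.ingress.example.com", "203.0.113.10")` returns a dictionary of booleans, where all three arguments are placeholders to replace with your own provider hostname, a name under your ingress wildcard, and your provider IP.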
Audit Steps:
1. Title the issue: "[Provider Audit]: Provider Address" (e.g. "[Provider Audit]: provider.europlots.com")
2. Wait for a response via comments. If there are no issues during the provider Audit, the process will be complete, the provider should start bidding on leases, and the Audit ticket will be closed.
3. If there are issues during the provider Audit, debug and resolve them so that the Audit can be completed.
4. Audit Issue will be closed by core team member.
Leave contact information (optional)