the-full-stack / website

Source for https://fullstackdeeplearning.com

Update cloud-gpus.csv - Oblivus Cloud #59

Closed dorukalpulgen closed 1 year ago

dorukalpulgen commented 1 year ago

Hello, can you please include Oblivus Cloud in the list?

We offer complete customization of virtual machines. Customers have the flexibility to select each component according to their requirements. As we do not provide pre-set configurations, I have filled in the minimum and maximum values for each row. The on-demand price is calculated based on the smallest virtual machine that can be deployed with the specified GPU.
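To make that pricing rule concrete, here is a minimal sketch of how such a minimum on-demand price could be computed; all rates and minimums are hypothetical placeholders for illustration, not Oblivus's actual figures:

```python
# Hypothetical sketch: on-demand price of the smallest VM deployable with a
# given GPU = GPU rate plus the minimum vCPU and RAM allocation.
# All numbers below are placeholders, not real prices.
def min_on_demand_price(gpu_hourly, num_gpus=1,
                        min_vcpus=1, vcpu_hourly=0.01,
                        min_ram_gb=2, ram_gb_hourly=0.005):
    return num_gpus * gpu_hourly + min_vcpus * vcpu_hourly + min_ram_gb * ram_gb_hourly

print(round(min_on_demand_price(gpu_hourly=0.50), 3))  # 0.52 for a 1x-GPU VM under these placeholder rates
```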

For more details, you can refer to https://oblivus.com/pricing/ and https://oblivus.com/restrictions/.

Thank you!

netlify[bot] commented 1 year ago

Deploy Preview for comfy-licorice-ff5651 ready!

Latest commit: a40c9eb12b184cb59d8186d0d4311e0901bcf2f2
Latest deploy log: https://app.netlify.com/sites/comfy-licorice-ff5651/deploys/646ecb92e66f7a0008553317
Deploy Preview: https://deploy-preview-59--comfy-licorice-ff5651.netlify.app

charlesfrye commented 1 year ago

Thanks for the PR! We'd be happy to include Oblivus.

Rather than ranges, we choose roughly comparable configurations based on what is offered by providers of pre-set machines. We typically compare to AWS's offerings. You can use the table to select configurations with the same GPU type and count and see their vCPU/RAM values, like so:

[Screenshot, 2023-05-24: the table filtered to configurations with the same GPU type and count]
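For concreteness, here's a minimal sketch of that lookup in Python/pandas; the column names (`GPU Type`, `Num GPUs`, `Cloud`, `vCPUs`, `RAM (GB)`) are assumptions for illustration, so check them against the actual header of cloud-gpus.csv:

```python
# Hypothetical sketch: find comparable pre-set configurations by filtering
# cloud-gpus.csv for rows with the same GPU type and GPU count, then read
# off their vCPU/RAM values. Column names below are assumed, not verified.
import pandas as pd

df = pd.read_csv("cloud-gpus.csv")
comparable = df[(df["GPU Type"] == "V100") & (df["Num GPUs"] == 4)]
print(comparable[["Cloud", "vCPUs", "RAM (GB)"]])
```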

Could you please update your PR with prices for specific configurations?

Also, we don't distinguish instances with NVLink/SXM versus PCIe (we should probably add that as a new column! #60), so could you drop that from the GPU Type column for the V100s? Never mind, I just noticed you gave me commit privileges.

dorukalpulgen commented 1 year ago

Hello,

Thank you for the information. I have attempted to create and add some example pre-set configurations. Please let me know if it's okay now.

Thank you!

charlesfrye commented 1 year ago

Thanks! That looks much better.

I made one more commit before merging, with some smaller changes.

dorukalpulgen commented 1 year ago

Hey there!

First off, I want to thank you for including Oblivus in the list. I understand that the table has a certain structure, and I don't want to disrupt that. However, I do have a few things I'd like to mention:

  1. We've noticed that many developers and individuals prefer customized resource allocations, rather than paying for the entire virtual machine. For example, configurations like 8vCPU - 8GB RAM - 4 GPUs are quite common in our system. While I'm fine with the pre-set configurations I provided, it would be fantastic if we could somehow indicate that Oblivus offers fully customizable options and that the provided configurations are just examples.

  2. In our case, the pricing per GPU is more important than the on-demand prices of the whole virtual machines, since our infrastructure doesn't have a fixed structure. The current per GPU prices listed are somewhat misleading, as they tend to be higher than our unit prices on the website. This might lead to confusion. If we could revert to the per GPU prices I provided, it would be much appreciated.

  3. I'd like to point out that Tesla V100, RTX A4000, and Quadro RTX 4000 support up to 7 GPUs per instance, while the other GPUs support up to 8 GPUs per instance. That's why I shared some configurations with 7 GPUs, as it represents the maximum value.

I hope this clarifies our perspective. I'm awaiting your response and appreciate your consideration.

Thanks again!

charlesfrye commented 1 year ago

Thanks for the detailed followup, and sorry for the delayed reply.

> We've noticed that many developers and individuals prefer customized resource allocations, rather than paying for the entire virtual machine. For example, configurations like 8vCPU - 8GB RAM - 4 GPUs are quite common in our system.

That's a helpful data point, thanks for sharing! I'd love to know if you have a sense of what kinds of workflows those users are running -- ML training/inference, rendering, mining, or something else.

We are only trying to serve an audience that's running ML workflows, and really only neural networks.

Our bias is also towards instances that can support neural network training workflows, which look different from inference workflows (which need, for example, less RAM). I will monitor this and revisit it if we see more interest in GPU servers with different configurations as workflows shift away from training and towards inference, following the rise of promptable foundation/pre-trained models.

> While I'm fine with the pre-set configurations I provided, it would be fantastic if we could somehow indicate that Oblivus offers fully customizable options and that the provided configurations are just examples.

You're right, we currently only state that GCP has configurable instances! That's our bad, and I'll fix it (#64).

> In our case, the pricing per GPU is more important than the on-demand prices of the whole virtual machines, since our infrastructure doesn't have a fixed structure.

While it would be a noble goal, the purpose of our table isn't to show the price implications of every configuration option from every provider. There's just too much heterogeneity in offerings for a table to make sense.

Instead, we provide a high-level overview of what's available, with a focus on standardization. Since not every provider -- including many major providers -- discretely prices GPUs, we don't pull that out into a separate column. The per-GPU column is only for easing price comparison across setups with varying numbers of GPUs.
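As a rough illustration of that last point (not the site's actual build code), the per-GPU column is just the on-demand instance price divided by the GPU count; the column names here are again assumptions:

```python
# Hypothetical sketch: per-GPU price as a normalization of the full on-demand
# instance price, purely to compare setups with different GPU counts.
import pandas as pd

df = pd.read_csv("cloud-gpus.csv")
df["Per-GPU ($/hr)"] = df["On-demand ($/hr)"] / df["Num GPUs"]
```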

> Tesla V100, RTX A4000, and Quadro RTX 4000 support up to 7 GPUs per instance, while the other GPUs support up to 8 GPUs per instance. That's why I shared some configurations with 7 GPUs, as it represents the maximum value.

Thanks for the context! An 8x configuration is standard -- cf. Cudo and RunPod for the A4000; AWS, Datacrunch, GCP, and Lambda for the V100. Given our goal of standardization, we're sticking with card counts of 1, 2, 4, 8, and 16. See also #65.

dorukalpulgen commented 1 year ago

Hey, I appreciate the thorough information you provided! I now have a clear understanding.

The table you made is incredibly helpful, and I want to express my gratitude for creating such a valuable resource for the community. Also, thanks for including us.