Open morfeusys opened 4 months ago
@morfeusys, thanks for the issue. Support for on-prem gateways makes a lot of sense and should certainly be on our roadmap. The simplest implementation would allow you to provision a gateway on an on-prem instance, and then you would configure a domain to point to the gateway instance IP – same as for cloud gateways. This should work both for public and private networks.
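For reference, the cloud-backed flow described above looks roughly like this (a sketch only; the fields follow dstack's gateway configuration, and the backend, region, and domain values are placeholders):

```yaml
# gateway.dstack.yml — provisioning a cloud gateway (placeholder values)
type: gateway
name: example-gateway
backend: aws         # any supported cloud backend
region: eu-west-1
domain: example.com  # a wildcard DNS record (*.example.com) points to the gateway IP
```

An on-prem variant would keep the same shape, with the domain pointing at the on-prem instance IP instead of a cloud one.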
Now, you also mention you "don't need configure any DNS". Currently, dstack uses domains to identify and route services. Of course, you can send service requests to the instance IP directly, but this won't allow scaling beyond one gateway instance (which is currently not possible but planned).
Do you have a particular vision of how services without domains should work? Would on-prem gateways with a DNS setup work for you (as that should be the easiest way to support on-prem gateways)?
Problem
Currently, I have to create a gateway via AWS or another supported cloud to start any service with dstack. But if I have an on-prem pool of hardware, I don't actually need to publish my inference API or configure any DNS just to have a scalable service running.
For example: a developer needs to start an inference service for some model for testing and development purposes. For now, they have to do one extra step before they can send HTTP requests: configure a gateway and DNS. This may be too complicated if they only want to utilise existing on-prem hardware. Moreover, they may decide to make the service available only in a private on-prem network instead of publishing it in the cloud. Also, using cloud services like AWS may be prohibited by company security policy.
Solution
A gateway component that can be started locally.
Workaround
Run a task instead of a service:
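A minimal sketch of the workaround, assuming a hypothetical serving command (`serve_model` is a placeholder, not a real package); dstack forwards a task's ports to localhost, so no gateway or DNS setup is needed:

```yaml
# task.dstack.yml — run the model as a task; the port is forwarded to localhost
type: task
name: llm-task
ports:
  - 8000
commands:
  - python -m serve_model --port 8000  # placeholder serving command
```

Once the task is running via the dstack CLI, requests can be sent to http://localhost:8000 directly. The trade-off is that tasks lack the service features a gateway provides, such as a stable public endpoint and replica-based scaling.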
Would you like to help us implement this feature by sending a PR?
No