issues
skypilot-org
/
skypilot
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
https://skypilot.readthedocs.io
Apache License 2.0
6.82k
stars
513
forks
[Fluidstack] sky launch can leak instances when instance creation times out
#4392
Xe
opened
37 minutes ago
0
[Examples] Specify version for vllm because vllm v0.6.4.post1 has an issue
#4391
HysunHe
closed
5 hours ago
0
[Core] Skip worker ray start for multinode
#4390
Michaelvll
opened
12 hours ago
0
[k8s] Move setup and ray start to pod args to make them async
#4389
Michaelvll
opened
1 day ago
0
[Jobs] Disable deduplication for logs
#4388
Michaelvll
closed
22 hours ago
0
show logs for storage mount
#4387
zpoint
opened
1 day ago
0
Event-based smoke tests -- managed jobs
#4386
zpoint
opened
1 day ago
1
Does sky still support local cluster mode?
#4385
wander3r
closed
1 day ago
2
[docs] Specify compartment for OCI resources.
#4384
HysunHe
opened
2 days ago
0
[OCI] set zone in the ProvisionRecord
#4383
HysunHe
closed
22 hours ago
0
[UX] Remove K80 and M60 from common GPU list
#4382
Michaelvll
opened
2 days ago
0
[k8s] Handle apt update log not existing
#4381
romilbhardwaj
closed
2 days ago
0
[Jobs] Allow logs for finished jobs and add `sky jobs logs --refresh` for restarting the jobs controller
#4380
Michaelvll
opened
3 days ago
0
[k8s] Support exec based auth kubeconfigs on controllers
#4379
romilbhardwaj
opened
4 days ago
0
[UX] Allow disabling ports in CLI
#4378
Michaelvll
opened
4 days ago
0
Add Lambda's GH200 instance type
#4377
cbrownstein-lambda
closed
4 days ago
0
[ux] display human-readable name for controller
#4376
cg505
closed
3 days ago
0
[Catalog] Lambda catalog fetcher fails due to GH200
#4375
Michaelvll
closed
4 days ago
1
[k8s] Default image cannot install conda package in base env
#4374
Michaelvll
opened
5 days ago
0
[Core] Expired credentials cause unexpected failure of `sky launch`
#4373
Michaelvll
opened
5 days ago
1
[UX] Automatically source the skypilot runtime when ssh to the cluster and SKYPILOT_DEV=1
#4372
cblmemo
opened
5 days ago
0
requirements.txt cleanup
#4371
kristopolous
closed
6 days ago
1
[Docs] Fix ask ai location
#4370
Michaelvll
closed
6 days ago
0
Mount cached mode
#4369
landscapepainter
opened
6 days ago
1
[Serve] Feature request: support num_nodes for the Controller
#4368
HysunHe
opened
6 days ago
0
[Core] NoCloudAccessError check is escaped from storage sync
#4367
HysunHe
closed
5 days ago
1
[Core] NoCloudAccessError check is escaped from storage sync
#4366
HysunHe
closed
6 days ago
0
Preliminary Vast AI support
#4365
kristopolous
opened
6 days ago
1
[DAG] Run global optimization on controller for task placement
#4364
andylizf
closed
6 days ago
1
[Core] Environment variables should be parsed at task execution, not `sky.Task` instantiation
#4363
romilbhardwaj
opened
6 days ago
0
[WIP][Serve] Enable launching multiple external LB on controller.
#4362
cblmemo
opened
6 days ago
0
[Docs] Fix some issues with Managed Jobs example.
#4361
concretevitamin
closed
6 days ago
0
GCS file mount sync hangs if GCP credentials are expired
#4360
cg505
opened
6 days ago
0
[FluidStack] Fix provisioning and add new gpu types
#4359
mjibril
closed
3 days ago
0
[Jobs] Remove assertion for one single controller resources.
#4358
cblmemo
closed
6 days ago
0
[k8s] fix managed job issue on k8s
#4357
nkwangleiGIT
closed
5 days ago
0
[Serve] Enable multiple ports in SkyServe replicas
#4356
Conless
opened
1 week ago
3
[Core] Unblock user program for SIGINT
#4355
Michaelvll
opened
1 week ago
0
[Docs] resize image and move path up a level.
#4354
concretevitamin
closed
1 week ago
0
decorated functions are not properly typechecked
#4353
cg505
opened
1 week ago
0
[Docs] Update k8s docs
#4352
romilbhardwaj
closed
1 week ago
0
remove empty file mount from yaml config
#4351
cg505
opened
1 week ago
0
[AWS] Not robust identity checking
#4350
Michaelvll
closed
1 week ago
1
[Serve] Failure-count based unrecoverable failure detection
#4349
cblmemo
opened
1 week ago
0
[Serve] Fall back to latest ready version when an unrecoverable failure is detected
#4348
cblmemo
opened
1 week ago
0
Added user agent string for catalog downloading request
#4347
shashank2000
closed
1 week ago
0
sky jobs launch on Kubernetes does not seem to be working now
#4346
nkwangleiGIT
opened
1 week ago
13
Update `--env-file` to sky doc
#4345
zpoint
closed
1 week ago
0
Doesn't use right GCP config path on Windows
#4344
alexkreidler
opened
1 week ago
0
[k8s] Leaked kubectl port-forward processes
#4343
romilbhardwaj
opened
1 week ago
0