skypilot-org / skypilot

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
https://skypilot.readthedocs.io
Apache License 2.0
6.82k stars 512 forks source link

[Bug] Smoke tests `--generic-cloud` flag is ignored when specified cloud is not in `default_clouds_to_run` #4327

Closed andylizf closed 1 week ago

andylizf commented 2 weeks ago

When running smoke tests with --generic-cloud gcp, the tests still use AWS instead of GCP, which is inconsistent with the behavior documented in CONTRIBUTING.md.

This appears to be related to PR #4182 which changed the default clouds. The current implementation in conftest.py makes default_clouds_to_run override the user-specified --generic-cloud value if that cloud is not in default_clouds_to_run.

Example:

$ pytest tests/test_smoke.py::test_minimal --generic-cloud gcp -v -s
[minimal] Failed.
[minimal] Reason: unset SKYPILOT_DEBUG; s=$(sky launch -y -c t-minimal-f9 --cloud aws tests/test_yamls/minimal.yaml)
                                                                                 ^^^^
                                                              Note that AWS is used instead of the specified GCP

This behavior is inconsistent with the documentation in CONTRIBUTING.md which states:

https://github.com/skypilot-org/skypilot/blob/24982a1f0ca411862b67660e580e2360310812b2/CONTRIBUTING.md?plain=1#L44-L48

I believe the default_clouds_to_run should not affect the validity of user-specified --generic-cloud. If a user explicitly specifies a cloud via --generic-cloud, that choice should be respected regardless of whether that cloud is in the default set.

The issue was introduced in PR #4182 which changed how default clouds are handled.

cc @cg505

cg505 commented 2 weeks ago

I'll look into this. Something doesn't quite add up, because if it's as you say, --generic-cloud aws (as in the docs) wouldn't have worked before my change.

andylizf commented 2 weeks ago

Understood - it might also be that the documentation needs updating.

cg505 commented 1 week ago

Yeah, looks like this never worked properly. (pytest tests/test_smoke.py --generic-cloud aws runs on GCP if you go to the commit before my change.)

Throwing up a PR to fix.