kwmonroe closed this issue 7 years ago
I should note the reason this is a big deal to me. Big data bundles have their constraints because apps may not even start with cloud default instance sizes (1cpu, 1.7g ram for the big 3). Sometimes a cloud will surprise me and get itself up before the timeout, but in general, matrix is not useful for big data on clouds -- without handling constraints, it's only reliable on lxd (where constraints are moot as long as the host machine is big enough).
@petevg mentioned this may be libjuju that loses the constraint somewhere, so if there's a better place to open this issue, please lmk.
This may not be the matrix... Hang tight while I get some feedback on:
@kwmonroe Interesting. I was going to say that this is a python-libjuju bug, because matrix basically just calls out to python-libjuju and asks it to deploy stuff. But if you were able to replicate with the vanilla client, that might mean that it's a more interesting bug ...
Pulling into the Beta milestone just to remind me to look at it. May not be a matrix bug, per above discussion.
Set as "blocked", as this is either an issue in Juju or an issue in python-libjuju -- needs to be fixed there, and then will automatically be fixed here ...
Confirmed this is not a bug in matrix. It's all about juju failing to handle constraints on bundle "services". If I move the constraints to bundle "machines", all is well:
Controller: gce-w

Model                        Cloud/Region     Status     Machines  Cores  Access  Last connection
job-11-exact-cattle*         google/us-west1  available         0      -  admin   26 minutes ago
job-11-matrix-viable-goblin  google/us-west1  available         5     36  admin   10 minutes ago
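For anyone hitting the same thing, here is a rough sketch of the workaround described above: moving each constraint string out of the bundle's "services" section and onto explicit "machines" entries, with the units placed on those machines via "to". It works on a bundle that has already been loaded into a plain dict. The function name `move_constraints_to_machines` and the charm name in the example are made up for illustration; this is not part of matrix, python-libjuju, or juju, and it assumes one machine per unit and numeric machine IDs.

```python
import copy


def move_constraints_to_machines(bundle):
    """Return a copy of `bundle` with per-application constraints moved
    onto explicit machine entries (one new machine per unit).

    Hypothetical helper for illustration only.
    """
    result = copy.deepcopy(bundle)
    # Older bundles use "services"; newer ones use "applications".
    apps_key = "services" if "services" in result else "applications"
    machines = result.setdefault("machines", {})
    # Continue numbering after any machines the bundle already defines.
    next_id = max((int(m) for m in machines), default=-1) + 1

    for app in result.get(apps_key, {}).values():
        units = app.get("num_units", 0)
        # Leave apps alone if they have no constraints, no units,
        # or already carry an explicit placement.
        if "constraints" not in app or "to" in app or units == 0:
            continue
        constraints = app.pop("constraints")
        placement = []
        for _ in range(units):
            machines[str(next_id)] = {"constraints": constraints}
            placement.append(str(next_id))
            next_id += 1
        app["to"] = placement
    return result


# Example input: the charm name is a placeholder, not taken from the real bundle.
bundle = {
    "services": {
        "spark": {
            "charm": "spark",
            "num_units": 1,
            "constraints": "mem=7G root-disk=32G",
        },
    },
}
fixed = move_constraints_to_machines(bundle)
```

In this example, `fixed` ends up with a "machines" section holding the constraint string on machine "0", and the spark application placed there with `to: ["0"]`. The original `bundle` dict is left untouched.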
Feel free to close this unless you want to keep it around for tracking purposes.
@kwmonroe Phew! I was worried that I had missed something about the constraints (there's a fair amount of code in python-libjuju that's just me translating that darn plan that the api generates into something that the api will actually accept on deploy). Glad to hear that it wasn't me :-)
Hi friends! I'm running the spark-processing bundle.yaml through cwr, which invokes bundletester/matrix. Here's my yaml:
https://api.jujucharms.com/charmstore/v5/spark-processing/archive/bundle.yaml
Note the constraints: "mem=7G root-disk=32G" on the spark application, for example. When matrix spins up my bundle for the first time (not chaotically), it seems to lose those constraints. I know this because a 7g machine on aws should be 2 cores, while a 7g machine on gce is 8 cores. Here's an example of the matrix models that were created on both aws and gce. Note the Cores column:

The ci-70/job-22-steady-mutt models are correct (verified by ssh'ing to the spark/0 unit and seeing 8 cores on gce, for example). The *-matrix-* models are incorrect (verified by ssh'ing to the spark/0 unit and seeing only 1 core on gce).

Why you lose my constraints?