Closed yarongilor closed 5 years ago
I can find the subnet via the link above.
What's the region name in your sct config?
Expected: region_name: 'us-east-1 us-west-2'
Can you provide your full sct config?
@amoskong I do see the subnet: https://us-west-2.console.aws.amazon.com/vpc/home?region=us-west-2#subnets:search=subnet-5207ee37;sort=SubnetId
@amoskong I checked it before: his code works with on-demand instances but not with spot_low_price. I suspect the issue is in the spot handling logic.
the testing yaml is:
#test_duration: 10080
test_duration: 500
stress_cmd: "cassandra-stress write cl=QUORUM duration=1m -schema 'replication(strategy=NetworkTopologyStrategy,us-eastscylla_node_east=1,us-west-2scylla_node_west=1)' -port jmx=6868 -mode cql3 native -rate threads=100 -pop seq=1..10000"
cassandra_stress_duration: 10080
cassandra_stress_threads: 100
cassandra_stress_population_size: 10000
#n_db_nodes: 3
n_db_nodes: '1 1' #'0 0' # '1 1'
n_loaders: 1 #1
n_monitor_nodes: 1
nemesis_class_name: 'MgmtCli'
#nemesis_class_name: 'MdcChaosMonkey'
#nemesis_class_name: 'DrainerMonkey'
nemesis_interval: 5
user_prefix: 'yaron-2-3-0-multidc'
failure_post_behavior: keep
space_node_threshold: 6442
ip_ssh_connections: 'public'
ami_id_db_scylla_desc: '2-3-0'
use_mgmt: true
mgmt_port: 10090
#scylla_repo_m: 'http://repositories.scylladb.com/scylla/repo/f4a2920f80c4bf178217c2553ad65ad7/centos/scylladb-2018.1.repo'
#scylla_repo_m: 'http://repositories.scylladb.com/scylla/repo/7b02fff5-e4d0-4e4d-ad12-e605ca4873c2/centos/scylladb-2018.1.repo'
scylla_repo_m: 'http://repositories.scylladb.com/scylla/repo/7b02fff5-e4d0-4e4d-ad12-e605ca4873c2/centos/scylladb-2018.1.repo'
scylla_mgmt_repo: 'http://downloads.scylladb.com.s3.amazonaws.com/manager/rpm/unstable/centos/branch-1.2/44/scylla-manager.repo'
#scylla_mgmt_repo: 'http://downloads.scylladb.com.s3.amazonaws.com/manager/rpm/unstable/centos/branch-1.2/latest/scylla-manager.repo'
#scylla_mgmt_repo: 'MANAGER_REPO_URL'
#es_url:
#es_user:
#es_password:
#instance_provision: 'spot_low_price'
backends: !mux
aws: !mux
# What is the backend that the suite will use to get machines from.
cluster_backend: 'aws'
# From 0.19 on, iotune will require bigger disk, so let's use a big
# loader instance by default.
instance_type_loader: 'c4.large'
# Size of AWS monitor instance
instance_type_monitor: i3.large
us_east_1_and_us_west_2:
user_credentials_path: '~/.ssh/scylla-qa-ec2'
region_name: 'us-east-1 us-west-2'
security_group_ids: 'sg-c5e1f7a0 sg-81703ae4'
subnet_id: 'subnet-ec4a72c4 subnet-5207ee37'
ami_id_db_scylla: 'ami-0fc423ac17a75570d ami-0da599147b1d9e80d'
ami_db_scylla_user: 'centos'
ami_id_loader: 'ami-0fc423ac17a75570d'
ami_loader_user: 'centos'
ami_id_monitor: 'ami-010f2b2749b78a6c5'
ami_monitor_user: 'centos'
gce: !mux
cluster_backend: 'gce'
user_credentials_path: '~/.ssh/scylla-test'
gce_user_credentials: '~/Scylla-c41b78923a54.json'
gce_service_account_email: 'skilled-adapter-452@appspot.gserviceaccount.com'
gce_project: 'skilled-adapter-452'
gce_image: 'https://www.googleapis.com/compute/v1/projects/centos-cloud/global/images/family/centos-7'
gce_image_username: 'scylla-test'
gce_instance_type_db: 'n1-highmem-8'
gce_root_disk_type_db: 'pd-ssd'
gce_root_disk_size_db: 50
gce_n_local_ssd_disk_db: 1
gce_instance_type_loader: 'n1-highcpu-4'
gce_root_disk_type_loader: 'pd-standard'
gce_root_disk_size_loader: 50
gce_n_local_ssd_disk_loader: 0
gce_instance_type_monitor: 'n1-standard-2'
gce_root_disk_type_monitor: 'pd-standard'
gce_root_disk_size_monitor: 50
gce_n_local_ssd_disk_monitor: 0
scylla_repo: https://s3.amazonaws.com/downloads.scylladb.com/rpm/unstable/centos/branch-1.7/37/scylla.repo
#us_east_1:
# gce_datacenter: 'us-east1-b'
multi_dcs:
gce_datacenter: 'us-east1-b us-west1-b us-east4-b'
databases: !mux
scylla:
db_type: scylla
instance_type_db: 'i3.large'
Hi @amoskong, can you please advise? I get the same failure when testing on the master branch:
Cluster yaron-2-3-0-multidc-db-cluster-5a2ada49 (AMI: ['ami-0fc423ac17a75570d', 'ami-0da599147b1d9e80d'] Type: i3.large): Passing user_data '--clustername yaron-2-3-0-multidc-db-cluster-5a2ada49 --totalnodes 2 --stop-services --seeds 54.172.191.59 --bootstrap false ' to create_instances
Exception in init_resources. Will clean resources
Traceback (most recent call last):
File "/sct/sdcm/tester.py", line 129, in wrapper
return method(*args, **kwargs)
File "/sct/sdcm/tester.py", line 636, in init_resources
monitor_info=monitor_info)
File "/sct/sdcm/tester.py", line 467, in get_cluster_aws
self.db_cluster = create_cluster(db_type)
File "/sct/sdcm/tester.py", line 458, in create_cluster
**cl_params)
File "/sct/sdcm/cluster_aws.py", line 405, in __init__
params=params)
File "/sct/sdcm/cluster.py", line 1451, in __init__
super(BaseScyllaCluster, self).__init__(*args, **kwargs)
File "/sct/sdcm/cluster_aws.py", line 102, in __init__
region_names=self.region_names)
File "/sct/sdcm/cluster.py", line 1270, in __init__
self.add_nodes(num, dc_idx=dc_idx)
File "/sct/sdcm/cluster_aws.py", line 434, in add_nodes
enable_auto_bootstrap=enable_auto_bootstrap)
File "/sct/sdcm/cluster_aws.py", line 220, in add_nodes
instances = self._create_instances(count, ec2_user_data, dc_idx)
File "/sct/sdcm/cluster_aws.py", line 192, in _create_instances
instances = self._create_spot_instances(count, interfaces, ec2_user_data, dc_idx)
File "/sct/sdcm/cluster_aws.py", line 134, in _create_spot_instances
subnet_info = ec2.get_subnet_info(self._ec2_subnet_id[dc_idx])
File "/sct/sdcm/ec2_client.py", line 313, in get_subnet_info
resp = self._client.describe_subnets(SubnetIds=[subnet_id])
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 314, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 612, in _make_api_call
raise error_class(parsed_response, operation_name)
ClientError: An error occurred (InvalidSubnetID.NotFound) when calling the DescribeSubnets operation: The subnet ID 'subnet-5207ee37' does not exist
Cleaning up resources used in the test
Exception in setUp. Will clean resources
Traceback (most recent call last):
File "/sct/sdcm/tester.py", line 129, in wrapper
return method(*args, **kwargs)
File "/sct/sdcm/tester.py", line 177, in setUp
self.init_resources()
File "/sct/sdcm/tester.py", line 129, in wrapper
return method(*args, **kwargs)
File "/sct/sdcm/tester.py", line 636, in init_resources
monitor_info=monitor_info)
File "/sct/sdcm/tester.py", line 467, in get_cluster_aws
self.db_cluster = create_cluster(db_type)
File "/sct/sdcm/tester.py", line 458, in create_cluster
**cl_params)
File "/sct/sdcm/cluster_aws.py", line 405, in __init__
params=params)
File "/sct/sdcm/cluster.py", line 1451, in __init__
super(BaseScyllaCluster, self).__init__(*args, **kwargs)
File "/sct/sdcm/cluster_aws.py", line 102, in __init__
region_names=self.region_names)
File "/sct/sdcm/cluster.py", line 1270, in __init__
self.add_nodes(num, dc_idx=dc_idx)
File "/sct/sdcm/cluster_aws.py", line 434, in add_nodes
enable_auto_bootstrap=enable_auto_bootstrap)
File "/sct/sdcm/cluster_aws.py", line 220, in add_nodes
instances = self._create_instances(count, ec2_user_data, dc_idx)
File "/sct/sdcm/cluster_aws.py", line 192, in _create_instances
instances = self._create_spot_instances(count, interfaces, ec2_user_data, dc_idx)
File "/sct/sdcm/cluster_aws.py", line 134, in _create_spot_instances
subnet_info = ec2.get_subnet_info(self._ec2_subnet_id[dc_idx])
File "/sct/sdcm/ec2_client.py", line 313, in get_subnet_info
resp = self._client.describe_subnets(SubnetIds=[subnet_id])
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 314, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 612, in _make_api_call
raise error_class(parsed_response, operation_name)
ClientError: An error occurred (InvalidSubnetID.NotFound) when calling the DescribeSubnets operation: The subnet ID 'subnet-5207ee37' does not exist
Cleaning up resources used in the test
Reproduced traceback from: /usr/lib/python2.7/site-packages/avocado/core/test.py:436
Traceback (most recent call last):
File "/sct/sdcm/tester.py", line 129, in wrapper
return method(*args, **kwargs)
File "/sct/sdcm/tester.py", line 177, in setUp
self.init_resources()
File "/sct/sdcm/tester.py", line 129, in wrapper
return method(*args, **kwargs)
File "/sct/sdcm/tester.py", line 636, in init_resources
monitor_info=monitor_info)
File "/sct/sdcm/tester.py", line 467, in get_cluster_aws
self.db_cluster = create_cluster(db_type)
File "/sct/sdcm/tester.py", line 458, in create_cluster
**cl_params)
File "/sct/sdcm/cluster_aws.py", line 405, in __init__
params=params)
File "/sct/sdcm/cluster.py", line 1451, in __init__
super(BaseScyllaCluster, self).__init__(*args, **kwargs)
File "/sct/sdcm/cluster_aws.py", line 102, in __init__
region_names=self.region_names)
File "/sct/sdcm/cluster.py", line 1270, in __init__
self.add_nodes(num, dc_idx=dc_idx)
File "/sct/sdcm/cluster_aws.py", line 434, in add_nodes
enable_auto_bootstrap=enable_auto_bootstrap)
File "/sct/sdcm/cluster_aws.py", line 220, in add_nodes
instances = self._create_instances(count, ec2_user_data, dc_idx)
File "/sct/sdcm/cluster_aws.py", line 192, in _create_instances
instances = self._create_spot_instances(count, interfaces, ec2_user_data, dc_idx)
File "/sct/sdcm/cluster_aws.py", line 134, in _create_spot_instances
subnet_info = ec2.get_subnet_info(self._ec2_subnet_id[dc_idx])
File "/sct/sdcm/ec2_client.py", line 313, in get_subnet_info
resp = self._client.describe_subnets(SubnetIds=[subnet_id])
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 314, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 612, in _make_api_call
raise error_class(parsed_response, operation_name)
ClientError: An error occurred (InvalidSubnetID.NotFound) when calling the DescribeSubnets operation: The subnet ID 'subnet-5207ee37' does not exist
ERROR 1-mgmt_cli_test.py:MgmtCliTest.test_mgmt_cluster_healthcheck -> TestSetupFail: An error occurred (InvalidSubnetID.NotFound) when calling the DescribeSubnets operation: The subnet ID 'subnet-5207ee37' does not exist
Error receiving message from test: <type 'exceptions.TypeError'> -> ('__init__() takes exactly 3 arguments (2 given)', <class 'botocore.exceptions.ClientError'>, (u"An error occurred (InvalidSubnetID.NotFound) when calling the DescribeSubnets operation: The subnet ID 'subnet-5207ee37' does not exist",))
Reproduced traceback from: /usr/lib/python2.7/site-packages/avocado/core/runner.py:75
Traceback (most recent call last):
File "/usr/lib64/python2.7/multiprocessing/queues.py", line 376, in get
return recv()
TypeError: ('__init__() takes exactly 3 arguments (2 given)', <class 'botocore.exceptions.ClientError'>, (u"An error occurred (InvalidSubnetID.NotFound) when calling the DescribeSubnets operation: The subnet ID 'subnet-5207ee37' does not exist",))
ERROR 1-mgmt_cli_test.py:MgmtCliTest.test_mgmt_cluster_healthcheck -> TestAbortedError: Test aborted unexpectedly
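A note on the secondary TypeError in the log above ('__init__() takes exactly 3 arguments (2 given)'): it looks like a side effect of sending the exception through a multiprocessing queue rather than a second AWS failure. An exception whose constructor needs two arguments, but which passes only a message up to Exception, cannot be rebuilt from its stored args on the receiving side. The sketch below mimics that rebuild step in plain Python 3 (so the message wording differs from the Python 2.7 log). TwoArgError, PicklableTwoArgError and can_rebuild are hypothetical names used for illustration only; nothing here claims how any particular botocore version behaves.

```python
class TwoArgError(Exception):
    """Hypothetical stand-in: the constructor takes two arguments, like the
    ClientError in the log, but only a message ends up in Exception.args."""

    def __init__(self, parsed_response, operation_name):
        super().__init__("An error occurred when calling %s" % operation_name)
        self.response = parsed_response
        self.operation_name = operation_name


class PicklableTwoArgError(TwoArgError):
    """Same error, but it tells pickle to rebuild it with both arguments."""

    def __reduce__(self):
        return (self.__class__, (self.response, self.operation_name))


def can_rebuild(exc):
    """Mimic the rebuild step of unpickling: call the class with the
    arguments reported by __reduce__ and see whether that succeeds."""
    cls, args = exc.__reduce__()[:2]
    try:
        cls(*args)
        return True
    except TypeError:
        return False
```

With the default behaviour the rebuild receives only the message and fails; with an explicit __reduce__ it receives both constructor arguments. Either way, the error worth chasing in this thread is the InvalidSubnetID.NotFound one.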
the yaml has:
instance_provision: 'spot_low_price'
+
us_east_1_and_us_west_2:
user_credentials_path: '~/.ssh/scylla-qa-ec2'
region_name: 'us-east-1 us-west-2'
security_group_ids: 'sg-c5e1f7a0 sg-81703ae4'
subnet_id: 'subnet-ec4a72c4 subnet-5207ee37'
ami_id_db_scylla: 'ami-0fc423ac17a75570d ami-0da599147b1d9e80d'
@amoskong, can you please check Yaron's comment?
Hi @yarongilor, Can you clone the latest master to a clean environment, and run again?
I just copied yaml from https://github.com/scylladb/scylla-cluster-tests/issues/671#issuecomment-433863706, and added store_results_in_elasticsearch: False
The setup worked well (I'm also using latest master).
Yaml: y.yaml.txt , Job log: multi-dc-yaron-example.log.txt , Avocado cmdline:
avocado run longevity_test.py:LongevityTest.test_custom_time --job-results-dir ./ --multiplex tests/y.yaml --filter-only /run/backends/aws /run/databases/scylla --show-job-log
The problem still reproduces on the master branch. Failure details:
Reproduced traceback from: /usr/lib/python2.7/site-packages/avocado/core/test.py:436
Traceback (most recent call last):
File "/sct/sdcm/tester.py", line 130, in wrapper
return method(*args, **kwargs)
File "/sct/sdcm/tester.py", line 178, in setUp
self.init_resources()
File "/sct/sdcm/tester.py", line 130, in wrapper
return method(*args, **kwargs)
File "/sct/sdcm/tester.py", line 641, in init_resources
monitor_info=monitor_info)
File "/sct/sdcm/tester.py", line 472, in get_cluster_aws
self.db_cluster = create_cluster(db_type)
File "/sct/sdcm/tester.py", line 463, in create_cluster
**cl_params)
File "/sct/sdcm/cluster_aws.py", line 408, in __init__
params=params)
File "/sct/sdcm/cluster.py", line 1524, in __init__
super(BaseScyllaCluster, self).__init__(*args, **kwargs)
File "/sct/sdcm/cluster_aws.py", line 98, in __init__
region_names=self.region_names)
File "/sct/sdcm/cluster.py", line 1343, in __init__
self.add_nodes(num, dc_idx=dc_idx)
File "/sct/sdcm/cluster_aws.py", line 437, in add_nodes
enable_auto_bootstrap=enable_auto_bootstrap)
File "/sct/sdcm/cluster_aws.py", line 219, in add_nodes
instances = self._create_instances(count, ec2_user_data, dc_idx)
File "/sct/sdcm/cluster_aws.py", line 191, in _create_instances
instances = self._create_spot_instances(count, interfaces, ec2_user_data, dc_idx)
File "/sct/sdcm/cluster_aws.py", line 133, in _create_spot_instances
subnet_info = ec2.get_subnet_info(self._ec2_subnet_id[dc_idx])
File "/sct/sdcm/ec2_client.py", line 314, in get_subnet_info
resp = self._client.describe_subnets(SubnetIds=[subnet_id])
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 314, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/lib/python2.7/site-packages/botocore/client.py", line 612, in _make_api_call
raise error_class(parsed_response, operation_name)
ClientError: An error occurred (InvalidSubnetID.NotFound) when calling the DescribeSubnets operation: The subnet ID 'subnet-5207ee37' does not exist
ERROR 1-mgmt_cli_test.py:MgmtCliTest.test_mgmt_repair_nemesis -> TestSetupFail: An error occurred (InvalidSubnetID.NotFound) when calling the DescribeSubnets operation: The subnet ID 'subnet-5207ee37' does not exist
Error receiving message from test: <type 'exceptions.TypeError'> -> ('__init__() takes exactly 3 arguments (2 given)', <class 'botocore.exceptions.ClientError'>, (u"An error occurred (InvalidSubnetID.NotFound) when calling the DescribeSubnets operation: The subnet ID 'subnet-5207ee37' does not exist",))
Reproduced traceback from: /usr/lib/python2.7/site-packages/avocado/core/runner.py:75
Traceback (most recent call last):
File "/usr/lib64/python2.7/multiprocessing/queues.py", line 376, in get
return recv()
TypeError: ('__init__() takes exactly 3 arguments (2 given)', <class 'botocore.exceptions.ClientError'>, (u"An error occurred (InvalidSubnetID.NotFound) when calling the DescribeSubnets operation: The subnet ID 'subnet-5207ee37' does not exist",))
ERROR 1-mgmt_cli_test.py:MgmtCliTest.test_mgmt_repair_nemesis -> TestAbortedError: Test aborted unexpectedly
cmd line used is:
avocado --show test run mgmt_cli_test.py:MgmtCliTest.test_mgmt_repair_nemesis --multiplex tests/yg_test.yaml --filter-out /run/backends/gce --filter-only /run/backends/aws /run/databases/scylla
tests/yg_test.yaml is:
test_duration: 500
stress_cmd: "cassandra-stress write cl=QUORUM duration=1m -schema 'replication(strategy=NetworkTopologyStrategy,us-eastscylla_node_east=1,us-west-2scylla_node_west=1)' -port jmx=6868 -mode cql3 native -rate threads=100 -pop seq=1..10000"
cassandra_stress_duration: 10080
cassandra_stress_threads: 100
cassandra_stress_population_size: 10000
n_db_nodes: '1 1' # '1 1'
n_loaders: 1 #1
n_monitor_nodes: 1
monitor_branch: 'master' # Testing with latest monitoring for newest manager Dashboards
nemesis_class_name: 'MgmtCli'
nemesis_interval: 5
user_prefix: 'yaron_manager_multidc'
failure_post_behavior: keep
space_node_threshold: 6442
ip_ssh_connections: 'public'
store_results_in_elasticsearch: False
ami_id_db_scylla_desc: '2-3-0'
use_mgmt: true
mgmt_port: 10090
scylla_repo_m: 'http://repositories.scylladb.com/scylla/repo/7b02fff5-e4d0-4e4d-ad12-e605ca4873c2/centos/scylladb-2018.1.repo'
#scylla_mgmt_repo: 'http://downloads.scylladb.com.s3.amazonaws.com/manager/rpm/unstable/centos/branch-1.2/44/scylla-manager.repo'
scylla_mgmt_repo: 'http://downloads.scylladb.com/manager/rpm/unstable/centos/branch-1.3/6/scylla-manager.repo'
scylla_mgmt_upgrade_to_repo: 'http://downloads.scylladb.com/manager/rpm/unstable/centos/branch-1.3/6/scylla-manager.repo'
# Centos Repos:
# scylla_repo_m: 'http://repositories.scylladb.com/scylla/repo/7b02fff5-e4d0-4e4d-ad12-e605ca4873c2/centos/scylladb-2018.1.repo'
# scylla_repo_m: 'http://repositories.scylladb.com/scylla/repo/f4a2920f80c4bf178217c2553ad65ad7/centos/scylladb-2018.1.repo'
# scylla_mgmt_repo: 'http://downloads.scylladb.com/manager/rpm/unstable/centos/master/218/scylla-manager.repo'
# scylla_mgmt_repo: 'http://downloads.scylladb.com.s3.amazonaws.com/manager/rpm/unstable/centos/branch-1.2/44/scylla-manager.repo'
#
# Debian Repos:
# scylla_repo_m: http://repositories.scylladb.com/scylla/repo/4bafa2b1-9a0c-4008-a8ad-7f6ef9279e58/debian/scylladb-2017.1-jessie.list
# scylla_mgmt_repo: http://downloads.scylladb.com.s3.amazonaws.com/manager/deb/unstable/jessie/branch-1.2/latest/scylla-manager-1.2/scylla-manager.list
# Ubuntu Repos:
#scylla_repo_m: http://repositories.scylladb.com/scylla/repo/4bafa2b1-9a0c-4008-a8ad-7f6ef9279e58/ubuntu/scylladb-2018.1-xenial.list
#scylla_mgmt_repo: http://downloads.scylladb.com.s3.amazonaws.com/manager/deb/unstable/xenial/branch-1.2/latest/scylla-manager-1.2/scylla-manager.list
#scylla_mgmt_repo: 'http://downloads.scylladb.com.s3.amazonaws.com/manager/rpm/unstable/centos/branch-1.3/1/scylla-manager.repo'
#scylla_mgmt_repo: 'http://downloads.scylladb.com.s3.amazonaws.com/manager/rpm/unstable/centos/branch-1.2/44/scylla-manager.repo'
#scylla_mgmt_repo: 'MANAGER_REPO_URL'
#es_url:
#es_user:
#es_password:
instance_provision: 'spot_low_price'
backends: !mux
aws: !mux
# What is the backend that the suite will use to get machines from.
cluster_backend: 'aws'
# From 0.19 on, iotune will require bigger disk, so let's use a big
# loader instance by default.
instance_type_loader: 'c4.large'
# Size of AWS monitor instance
instance_type_monitor: i3.large
us_east_1_and_us_west_2:
user_credentials_path: '~/.ssh/scylla-qa-ec2'
region_name: 'us-east-1 us-west-2'
security_group_ids: 'sg-c5e1f7a0 sg-81703ae4'
subnet_id: 'subnet-ec4a72c4 subnet-5207ee37'
ami_id_db_scylla: 'ami-0fc423ac17a75570d ami-0da599147b1d9e80d'
ami_db_scylla_user: 'centos'
ami_id_loader: 'ami-0fc423ac17a75570d'
ami_loader_user: 'centos'
# ami_id_monitor: 'ami-010f2b2749b78a6c5' # scylla-enterprise ami # 'ami-9887c6e7' # Clean CentOs 7 ami 'ami-1c5cc366' # Clean Ubuntu16.4
ami_id_monitor: 'ami-9887c6e7'
ami_monitor_user: 'centos' #'ubuntu' #'centos' #'admin' (for Debian)
gce: !mux
cluster_backend: 'gce'
user_credentials_path: '~/.ssh/scylla-test'
gce_user_credentials: '~/Scylla-c41b78923a54.json'
gce_service_account_email: 'skilled-adapter-452@appspot.gserviceaccount.com'
gce_project: 'skilled-adapter-452'
gce_image: 'https://www.googleapis.com/compute/v1/projects/centos-cloud/global/images/family/centos-7'
gce_image_username: 'scylla-test'
gce_instance_type_db: 'n1-highmem-8'
gce_root_disk_type_db: 'pd-ssd'
gce_root_disk_size_db: 50
gce_n_local_ssd_disk_db: 1
gce_instance_type_loader: 'n1-highcpu-4'
gce_root_disk_type_loader: 'pd-standard'
gce_root_disk_size_loader: 50
gce_n_local_ssd_disk_loader: 0
gce_instance_type_monitor: 'n1-standard-2'
gce_root_disk_type_monitor: 'pd-standard'
gce_root_disk_size_monitor: 50
gce_n_local_ssd_disk_monitor: 0
scylla_repo: https://s3.amazonaws.com/downloads.scylladb.com/rpm/unstable/centos/branch-1.7/37/scylla.repo
#us_east_1:
# gce_datacenter: 'us-east1-b'
multi_dcs:
gce_datacenter: 'us-east1-b us-west1-b us-east4-b'
databases: !mux
scylla:
db_type: scylla
instance_type_db: 'i3.large'
Yaron, can you provide the full job.log?
@yarongilor what's the job name? Can this issue be reproduced every time?
I reproduced this problem locally. An instance in the first region (us-east-1) is created successfully; the second region fails.
After switching the order of the two regions, it tried to create the instance for us-west-2 first, and that failed. So it looks like an AWS problem (related to the subnet setup, or to the EC2 resources available for spot), not an sct problem.
I will try to create a new subnet in us-west-2.
I created a new VPC/security group/subnet in us-west2-ip; the result is the same as with Yaron's config.
on_demand: works well
spot_fleet & spot_low_price: don't work
Tested on us-west-1, eu-west-1, us-west-2 and us-east-2:
on_demand: works well
spot_fleet & spot_low_price: don't work
I didn't find any useful information by searching Google for InvalidSubnetID.NotFound spot_low_price spot_fleet.
I also failed to open a case with AWS at https://console.aws.amazon.com/support/cases#/create. Any suggestions? @roydahan
We need to add more debug output; this issue is in our spot code.
@yarongilor
Add debug output of the following at the beginning of _create_spot_instances:
I suspect that we create an Ec2 client for the wrong region
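The kind of debug output being asked for can be sketched as below. The two values logged (the region name and the subnet ID selected by dc_idx) are inferred from the grep output that appears later in this thread; the helper names describe_spot_target and log_spot_target, and the logger name, are hypothetical and not part of sct.

```python
import logging

log = logging.getLogger("cluster_aws")


def describe_spot_target(region_names, subnet_ids, dc_idx):
    """Build the two debug lines for the datacenter at index dc_idx."""
    return [
        "self.region_names[dc_idx] is: %s" % region_names[dc_idx],
        "self._ec2_subnet_id[dc_idx] is: %s" % subnet_ids[dc_idx],
    ]


def log_spot_target(region_names, subnet_ids, dc_idx):
    """Emit the debug lines before any EC2 call is made."""
    for line in describe_spot_target(region_names, subnet_ids, dc_idx):
        log.debug(line)
```

If the region and subnet printed for a given dc_idx belong together, the config is being indexed correctly and the remaining suspect is which region the EC2 client itself is bound to.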
Subnet IDs are local to each region; you should use a different subnet ID for each region.
Of course, but this is not related.
It's configured as:
ami_id_db_scylla: 'ami-0fc423ac17a75570d ami-0da599147b1d9e80d'
and works for on_demand.
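Since every per-region setting in this yaml is a space-separated string, a quick way to confirm that each region gets its own subnet, security group and AMI is to split the strings and line them up by index. This is a minimal sketch, assuming the four keys shown in the config above; split_per_region is a hypothetical helper, not an sct function.

```python
def split_per_region(params):
    """Split space-separated per-region values and check the counts agree."""
    keys = ("region_name", "security_group_ids", "subnet_id", "ami_id_db_scylla")
    split = {key: params[key].split() for key in keys}
    counts = {key: len(values) for key, values in split.items()}
    if len(set(counts.values())) != 1:
        raise ValueError("per-region settings differ in length: %r" % counts)
    # One dict per datacenter index, e.g. rows[1]["subnet_id"].
    return [
        {key: split[key][idx] for key in keys}
        for idx in range(counts["region_name"])
    ]
```

For the config in this thread, index 1 pairs us-west-2 with subnet-5207ee37, which is consistent with the point that the per-region values themselves are set up correctly.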
@yarongilor, please try to do what Bentsi suggested.
yaml with:
us_east_1_and_us_west_2:
user_credentials_path: '~/.ssh/scylla-qa-ec2'
region_name: 'us-east-1 us-west-2'
security_group_ids: 'sg-c5e1f7a0 sg-81703ae4'
subnet_id: 'subnet-ec4a72c4 subnet-5207ee37'
has output of:
yarongilor@yaron-pc:~/avocado/job-results/latest$ grep '\[dc_idx\] is:' job.log
2019-01-17 16:31:02,507 cluster_aws L0134 DEBUG| self.region_names[dc_idx] is: us-east-1
2019-01-17 16:31:02,508 cluster_aws L0135 DEBUG| self._ec2_subnet_id[dc_idx] is: subnet-ec4a72c4
2019-01-17 16:31:28,928 cluster_aws L0134 DEBUG| self.region_names[dc_idx] is: us-west-2
2019-01-17 16:31:28,928 cluster_aws L0135 DEBUG| self._ec2_subnet_id[dc_idx] is: subnet-5207ee37
thanks
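The output above shows that the region and subnet are paired correctly by dc_idx, which leaves the wrong-region client hypothesis open. One way to test it is to look up the subnet with a client that is explicitly bound to the region at the same index. The sketch below is an assumption-laden illustration, not the change that closed this issue: find_subnet and default_client_factory are hypothetical names, and the real call needs boto3 and valid AWS credentials.

```python
def default_client_factory(region):
    """Create an EC2 client bound to one region (needs boto3 and credentials)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    return boto3.client("ec2", region_name=region)


def find_subnet(region_names, subnet_ids, dc_idx,
                client_factory=default_client_factory):
    """Look up the dc_idx-th subnet using a client for the dc_idx-th region."""
    region = region_names[dc_idx]
    client = client_factory(region)
    resp = client.describe_subnets(SubnetIds=[subnet_ids[dc_idx]])
    return region, resp["Subnets"][0]
```

Passing the client factory in as a parameter keeps the region choice visible at the call site and lets the lookup be exercised with a fake client. If a lookup like this succeeds for dc_idx 1 while the spot path still fails, the spot code is most likely reusing a client created for a different region.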
On Fri, Jan 18, 2019 at 1:36 AM, Bentsi notifications@github.com wrote:
Closed #671 https://github.com/scylladb/scylla-cluster-tests/issues/671 via 0bfaa1b https://github.com/scylladb/scylla-cluster-tests/commit/0bfaa1bcc18b00cabfe6b7a3c847b261349f3a94 .
scenario:
1) yaml file has:
2) run sct test setup ==> result: setup fails with: