MageMasher opened this issue 6 years ago.
Thank you for reporting. I'll look into this this weekend.
What's interesting is that this may be related to the DynamoDB throughput with the solo topology. If you set up a solo topology, look at the CloudWatch alarms; you may find something there.
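In case it helps, here is a rough sketch of listing the alarms attached to that metric programmatically instead of clicking through the console. This is Python/boto3 (not part of this repo); the region and table name are taken from the alarm details later in this thread, so adjust them to your own system.

```python
# Sketch: list CloudWatch alarms watching ConsumedWriteCapacityUnits on the
# DynamoDB table used by the Datomic system. Assumes boto3 is installed and
# AWS credentials/region are already configured.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")

resp = cloudwatch.describe_alarms_for_metric(
    MetricName="ConsumedWriteCapacityUnits",
    Namespace="AWS/DynamoDB",
    Dimensions=[{"Name": "TableName", "Value": "datomic-signifier-dev"}],
)
for alarm in resp["MetricAlarms"]:
    print(alarm["AlarmName"], alarm["StateValue"], alarm["StateReason"])
```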
I created https://github.com/signifier-jp/onyx-datomic-cloud-ci to run the tests triggered by commits. It runs stand-alone or as a Docker container. As a first step, I ran it from my local machine via a SOCKS proxy and reproduced the error on the 2nd round. I also confirmed the following CloudWatch alarm.
State Details: State changed to ALARM at 2018/04/12. Reason: Threshold Crossed: 7 datapoints were less than the threshold (150.0). The most recent datapoints which crossed the threshold: [23.0 (12/04/18 04:21:00), 16.0 (12/04/18 04:19:00), 1.0 (12/04/18 04:18:00), 18.0 (12/04/18 04:17:00), 36.0 (12/04/18 04:16:00)].
Description: DO NOT EDIT OR DELETE. For TargetTrackingScaling policy arn:aws:autoscaling:us-west-2:NNNNNNNNNNN:scalingPolicy:af7b460a-4967-45ee-a01a-d746352bdbc4:resource/dynamodb/table/datomic-signifier-dev:policyName/datomic-signifier-dev-write-scaling-policy.
Threshold: ConsumedWriteCapacityUnits < 150 for 15 datapoints within 15 minutes
Actions (In ALARM): arn:aws:autoscaling:us-west-2:NNNNNNNNNNN:scalingPolicy:af7b460a-4967-45ee-a01a-d746352bdbc4:resource/dynamodb/table/datomic-signifier-dev:policyName/datomic-signifier-dev-write-scaling-policy
Namespace: AWS/DynamoDB
Metric Name: ConsumedWriteCapacityUnits
Dimensions: TableName = datomic-signifier-dev
Statistic: Sum
Period: 1 minute
Treat missing data as: missing
Percentiles with low samples: evaluate
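For reference, the datapoints the alarm evaluates can also be pulled programmatically to cross-check what the console shows. A minimal Python/boto3 sketch, using the same namespace, statistic, and 1-minute period as the alarm above; the 30-minute window is just an example and should be adjusted to cover a test run:

```python
# Sketch: fetch raw ConsumedWriteCapacityUnits datapoints (Sum over 1-minute
# periods) for the table, over an arbitrary recent window.
import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")

end = datetime.now(timezone.utc)
start = end - timedelta(minutes=30)

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/DynamoDB",
    MetricName="ConsumedWriteCapacityUnits",
    Dimensions=[{"Name": "TableName", "Value": "datomic-signifier-dev"}],
    StartTime=start,
    EndTime=end,
    Period=60,
    Statistics=["Sum"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"])
```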
I tried bumping the read capacity up to 250 and the write capacity to 125, but the test still failed. Interestingly, the write-capacity graph in the Metrics tab doesn't show any high usage, even though I received the alarm via CloudWatch.
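For anyone trying the same capacity bump, a minimal boto3 sketch follows. Note this is just one way to do it (the console works too), and the target-tracking autoscaling policy referenced in the alarm description may adjust the provisioned capacity again on its own.

```python
# Sketch: set provisioned throughput on the table to the values tried above.
# The table name and region come from the alarm details; adjust for your system.
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-west-2")

dynamodb.update_table(
    TableName="datomic-signifier-dev",
    ProvisionedThroughput={
        "ReadCapacityUnits": 250,
        "WriteCapacityUnits": 125,
    },
)
```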
I will try different configurations, such as running the tests in the peered VPC so that I can connect without the SOCKS proxy.
I was able to reproduce the failure with the command below, though running it once does not always cause a failure. Also, running only
lein test :only onyx.plugin.tx-async-output-test/datomic-tx-output-test
does NOT reproduce the failure. Here is an output containing the error: