shahsank3t closed this issue 6 years ago
Is it showing the same behavior as #1104? If so, let's merge https://github.com/hortonworks/streamline/pull/1120 to apply https://github.com/hortonworks/registry/pull/350, and try again to see the actual error log.
If you can still see the actual error in the streamline log, please share it.
@HeartSaVioR I am not really aware of the behavior of #1104, so I can't comment on that.
Actual error from streamline log file:
ERROR [11:20:30.616] [dw-3379 - POST /api/v1/catalog/cluster/import/ambari] c.h.s.s.s.GenericExceptionMapper - Got exception: [RuntimeException] / message [java.lang.RuntimeException: com.hortonworks.registries.storage.exception.StorageException: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction]
java.lang.RuntimeException: java.lang.RuntimeException: com.hortonworks.registries.storage.exception.StorageException: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:593)
	at java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:677)
	at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:735)
	at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
	at com.hortonworks.streamline.streams.cluster.ClusterImporter.handleServices(ClusterImporter.java:74)
	at com.hortonworks.streamline.streams.cluster.ClusterImporter.lambda$importCluster$0(ClusterImporter.java:65)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
	at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1582)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.RuntimeException: com.hortonworks.registries.storage.exception.StorageException: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
	at com.hortonworks.streamline.streams.cluster.ClusterImporter.lambda$handleConfigurations$3(ClusterImporter.java:124)
	at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1691)
	at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
	at com.hortonworks.streamline.streams.cluster.ClusterImporter.handleConfigurations(ClusterImporter.java:110)
	at com.hortonworks.streamline.streams.cluster.ClusterImporter.lambda$handleServices$1(ClusterImporter.java:83)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
	at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
	... 4 common frames omitted
Sorry, I guess it is not related. By the way, I guess you are testing against a VPN or a separate cluster: importing an Ambari cluster should only take a few seconds, so it shouldn't reach the lock wait timeout.
I'm able to reproduce the issue. It looks OK when importing the same cluster, but it gets kind of stuck when refreshing the cluster.
Looks like an odd transaction issue. I'm not an expert on MySQL and I haven't encountered this before, so it may take some time to address.
---TRANSACTION 0, not started
MySQL thread id 60007, OS thread handle 0x7f485a464700, query id 1074574 localhost root
show engine innodb status
---TRANSACTION A899, ACTIVE 32 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 6 lock struct(s), heap size 1248, 3 row lock(s), undo log entries 2
MySQL thread id 59971, OS thread handle 0x7f485a568700, query id 1074570 <host> <ip> streamline2 update
INSERT INTO service_configuration (`filename`, `configuration`, `name`, `description`, `id`, `serviceId`, `timestamp`) VALUES( null,'{"content":"\\n#!/bin/bash\\n\\n# Set Storm specific environment variables here.\\n\\n# The java implementation to use.\\nexport JAVA_HOME={{java64_home}}\\n\\nexport STORM_CONF_DIR={{conf_dir}}\\nexport STORM_HOME={{storm_component_home_dir}}\\nexport STORM_JAR_JVM_OPTS={{jar_jvm_opts}}\\n\\n#set storm-auto creds\\n# check if storm_jaas.conf in config , only enable storm_auto_creds in secure mode.\\nSTORM_JAAS_CONF=$STORM_HOME/conf/storm_jaas.conf\\nSTORM_AUTOCR
------- TRX HAS BEEN WAITING 32 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 958 n bits 96 index `fk_service_configuration_service` of table `streamline2`.`service_configuration` trx id A899 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;
------------------
---TRANSACTION A898, ACTIVE 32 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 6 lock struct(s), heap size 1248, 3 row lock(s), undo log entries 2
MySQL thread id 59970, OS thread handle 0x7f485a0d6700, query id 1074572 <host> <ip> streamline2 update
INSERT INTO service_configuration (`filename`, `configuration`, `name`, `description`, `id`, `serviceId`, `timestamp`) VALUES( null,'{"ambari_metrics_user":"ams","ams_classpath_additional":"","content":"\\n# Set environment variables here.\\n\\n# AMS instance name\\nexport AMS_INSTANCE_NAME={{hostname}}\\n\\n# The java implementation to use. Java 1.6 required.\\nexport JAVA_HOME={{java64_home}}\\n\\n# Collector Log directory for log4j\\nexport AMS_COLLECTOR_LOG_DIR={{ams_collector_log_dir}}\\n\\n# Monitor Log directory for outfile\\nexport AMS_MONITOR_LOG_DIR={{ams_monitor_log_dir}}\\n\\n# Col
------- TRX HAS BEEN WAITING 32 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 958 n bits 96 index `fk_service_configuration_service` of table `streamline2`.`service_configuration` trx id A898 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;
------------------
---TRANSACTION A897, ACTIVE 32 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 6 lock struct(s), heap size 1248, 3 row lock(s), undo log entries 2
MySQL thread id 59967, OS thread handle 0x7f4870153700, query id 1074569 <host> <ip> streamline2 update
INSERT INTO service_configuration (`filename`, `configuration`, `name`, `description`, `id`, `serviceId`, `timestamp`) VALUES( null,'{"auto.create.topics.enable":"true","auto.leader.rebalance.enable":"true","compression.type":"producer","controlled.shutdown.enable":"true","controlled.shutdown.max.retries":"3","controlled.shutdown.retry.backoff.ms":"5000","controller.message.queue.size":"10","controller.socket.timeout.ms":"30000","default.replication.factor":"1","delete.topic.enable":"false","external.kafka.metrics.exclude.prefix":"kafka.network.RequestMetrics,kafka.server.DelayedOperationPurga
------- TRX HAS BEEN WAITING 32 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 958 n bits 96 index `fk_service_configuration_service` of table `streamline2`.`service_configuration` trx id A897 lock_mode X insert intention waiting
Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
0: len 8; hex 73757072656d756d; asc supremum;;
https://dev.mysql.com/doc/refman/5.7/en/innodb-locking.html#innodb-insert-intention-locks
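The link above describes insert intention locks. Since the blocking lock is a shared (S) record lock on the child table's FK index, the conflict looks like the documented parent/child foreign-key pattern. A hypothetical two-session sketch of how such a conflict can arise, assuming a simplified schema resembling service / service_configuration (this DDL and the literal ids are illustrative, not the actual Streamline schema):

```sql
-- Simplified parent/child pair (illustrative, not the real Streamline DDL).
CREATE TABLE service (id BIGINT PRIMARY KEY) ENGINE=InnoDB;
CREATE TABLE service_configuration (
  id BIGINT PRIMARY KEY,
  serviceId BIGINT,
  CONSTRAINT fk_service_configuration_service
    FOREIGN KEY (serviceId) REFERENCES service (id)
) ENGINE=InnoDB;

-- Session 1: the "cleanup" transaction. The FK check for the parent delete
-- takes shared next-key locks on the child's FK index; with no matching
-- child rows left, the lock lands on the gap up to the supremum record.
START TRANSACTION;
DELETE FROM service_configuration WHERE serviceId = 1;
DELETE FROM service WHERE id = 1;
-- (transaction intentionally left open)

-- Session 2: the "import" transaction. The child insert needs an exclusive
-- insert intention lock in the same index gap, so it waits on session 1 and,
-- after innodb_lock_wait_timeout, fails with "Lock wait timeout exceeded".
START TRANSACTION;
INSERT INTO service VALUES (2);
INSERT INTO service_configuration VALUES (10, 2);  -- blocks here
```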
SELECT straight_join
    w.trx_mysql_thread_id waiting_thread,
    w.trx_id waiting_trx_id,
    w.trx_query waiting_query,
    b.trx_mysql_thread_id blocking_thread,
    b.trx_id blocking_trx_id,
    b.trx_query blocking_query,
    bl.lock_id blocking_lock_id,
    bl.lock_mode blocking_lock_mode,
    bl.lock_type blocking_lock_type,
    bl.lock_table blocking_lock_table,
    bl.lock_index blocking_lock_index,
    wl.lock_id waiting_lock_id,
    wl.lock_mode waiting_lock_mode,
    wl.lock_type waiting_lock_type,
    wl.lock_table waiting_lock_table,
    wl.lock_index waiting_lock_index
FROM
    information_schema.INNODB_LOCK_WAITS ilw,
    information_schema.INNODB_TRX b,
    information_schema.INNODB_TRX w,
    information_schema.INNODB_LOCKS bl,
    information_schema.INNODB_LOCKS wl
WHERE
    b.trx_id = ilw.blocking_trx_id
    AND w.trx_id = ilw.requesting_trx_id
    AND bl.lock_id = ilw.blocking_lock_id
    AND wl.lock_id = ilw.requested_lock_id;
A result similar to the one in show engine innodb status is shown:
+----------------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+-----------------+----------------+------------------+--------------------+--------------------+---------------------------------------+------------------------------------+-----------------+-------------------+-------------------+---------------------------------------+------------------------------------+
| waiting_thread | waiting_trx_id | waiting_query | blocking_thread | blocking_trx_id | blocking_query | blocking_lock_id | blocking_lock_mode | blocking_lock_type | blocking_lock_table | blocking_lock_index | waiting_lock_id | waiting_lock_mode | waiting_lock_type | waiting_lock_table | waiting_lock_index |
+----------------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+-----------------+----------------+------------------+--------------------+--------------------+---------------------------------------+------------------------------------+-----------------+-------------------+-------------------+---------------------------------------+------------------------------------+
| 60069 | A8A5 | INSERT INTO service_configuration (`filename`, `configuration`, `name`, `description`, `id`, `serviceId`, `timestamp`) VALUES( null,'{"ambari_metrics_user":"ams","ams_classpath_additional":"","content":"\\n# Set environment variables here.\\n\\n# AMS instance name\\nexport AMS_INSTANCE_NAME={{hostname}}\\n\\n# The java implementation to use. Java 1.6 required.\\nexport JAVA_HOME={{java64_home}}\\n\\n# Collector Log directory for log4j\\nexport AMS_COLLECTOR_LOG_DIR={{ams_collector_log_dir}}\\n\\n# Monitor Log directory for outfile\\nexport AMS_MONITOR_LOG_DIR={{ams_monitor_log_dir}}\\n\\n# Collector pid directory\\nexport AMS_COLLECTOR_PID_DIR={{ams_collector_pid_dir}}\\n\\n# Monitor pid directory\\nexport AMS_MONITOR_PID_DIR={{ams_monitor_pid_dir}}\\n\\n# AMS HBase pid directory\\nexport AMS_HBASE_PID_DIR={{hbase_pid_dir}}\\n\\n# AMS Collector heapsize\\nexport AMS_COLLECTOR_HEAPSIZE={{metrics_collector_heapsize}}\\n\\n# HBase Tables Initialization check enabled\\nexport AMS_HBASE_INIT_CHECK_ENABLED={{ams_hb | 60065 | A8A0 | NULL | A8A0:0:958:1 | S | RECORD | `streamline2`.`service_configuration` | `fk_service_configuration_service` | A8A5:0:958:1 | X | RECORD | `streamline2`.`service_configuration` | `fk_service_configuration_service` |
| 60070 | A8A4 | INSERT INTO service_configuration (`filename`, `configuration`, `name`, `description`, `id`, `serviceId`, `timestamp`) VALUES( null,'{"auto.create.topics.enable":"true","auto.leader.rebalance.enable":"true","compression.type":"producer","controlled.shutdown.enable":"true","controlled.shutdown.max.retries":"3","controlled.shutdown.retry.backoff.ms":"5000","controller.message.queue.size":"10","controller.socket.timeout.ms":"30000","default.replication.factor":"1","delete.topic.enable":"false","external.kafka.metrics.exclude.prefix":"kafka.network.RequestMetrics,kafka.server.DelayedOperationPurgatory,kafka.server.BrokerTopicMetrics.BytesRejectedPerSec,kafka.server.KafkaServer.ClusterId","external.kafka.metrics.include.prefix":"kafka.network.RequestMetrics.ResponseQueueTimeMs.request.OffsetCommit.98percentile,kafka.network.RequestMetrics.ResponseQueueTimeMs.request.Offsets.95percentile,kafka.network.RequestMetrics.ResponseSendTimeMs.request.Fetch.95percentile,kafka.network.RequestMetrics.RequestsPerSec.request"," | 60065 | A8A0 | NULL | A8A0:0:958:1 | S | RECORD | `streamline2`.`service_configuration` | `fk_service_configuration_service` | A8A4:0:958:1 | X | RECORD | `streamline2`.`service_configuration` | `fk_service_configuration_service` |
| 60066 | A8A3 | INSERT INTO service_configuration (`filename`, `configuration`, `name`, `description`, `id`, `serviceId`, `timestamp`) VALUES( null,'{"content":"\\n#!/bin/bash\\n\\n# Set Storm specific environment variables here.\\n\\n# The java implementation to use.\\nexport JAVA_HOME={{java64_home}}\\n\\nexport STORM_CONF_DIR={{conf_dir}}\\nexport STORM_HOME={{storm_component_home_dir}}\\nexport STORM_JAR_JVM_OPTS={{jar_jvm_opts}}\\n\\n#set storm-auto creds\\n# check if storm_jaas.conf in config , only enable storm_auto_creds in secure mode.\\nSTORM_JAAS_CONF=$STORM_HOME/conf/storm_jaas.conf\\nSTORM_AUTOCREDS_LIB_DIR=$STORM_HOME/external/storm-autocreds\\n\\nif [ -f $STORM_JAAS_CONF ] && [ -d $STORM_AUTOCREDS_LIB_DIR ]; then\\n export STORM_EXT_CLASSPATH=$STORM_AUTOCREDS_LIB_DIR\\nfi","jmxremote_port":"56431","nimbus_seeds_supported":"true","storm.atlas.hook":"false","storm_log_dir":"/var/log/storm","storm_logs_supported":"true","storm_pid_dir":"/var/run/storm","storm_user":"storm","storm_user_nofile_limit":"128000","s | 60065 | A8A0 | NULL | A8A0:0:958:1 | S | RECORD | `streamline2`.`service_configuration` | `fk_service_configuration_service` | A8A3:0:958:1 | X | RECORD | `streamline2`.`service_configuration` | `fk_service_configuration_service` |
| 60067 | A8A2 | INSERT INTO service_configuration (`filename`, `configuration`, `name`, `description`, `id`, `serviceId`, `timestamp`) VALUES( null,'{"content":"#!/bin/bash\\n# Licensed to the Apache Software Foundation (ASF) under one or more\\n# contributor license agreements. See the NOTICE file distributed with\\n# this work for additional information regarding copyright ownership.\\n# The ASF licenses this file to You under the Apache License, Version 2.0\\n# (the \\"License\\"); you may not use this file except in compliance with\\n# the License. You may obtain a copy of the License at\\n#\\n# http://www.apache.org/licenses/LICENSE-2.0\\n#\\n# Unless required by applicable law or agreed to in writing, software\\n# distributed under the License is distributed on an \\"AS IS\\" BASIS,\\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\\n# See the License for the specific language governing permissions and\\n# limitations under the License.\\n\\n# By default the script will use JAVA_HOME to deter | 60065 | A8A0 | NULL | A8A0:0:958:1 | S | RECORD | `streamline2`.`service_configuration` | `fk_service_configuration_service` | A8A2:0:958:1 | X | RECORD | `streamline2`.`service_configuration` | `fk_service_configuration_service` |
| 60068 | A8A1 | INSERT INTO service_configuration (`filename`, `configuration`, `name`, `description`, `id`, `serviceId`, `timestamp`) VALUES( null,'{"autopurge.purgeInterval":"24","autopurge.snapRetainCount":"30","clientPort":"2181","dataDir":"/hadoop/zookeeper","initLimit":"10","syncLimit":"5","tickTime":"3000"}','zoo.cfg','',null,124,1515057252254) | 60065 | A8A0 | NULL | A8A0:0:958:1 | S | RECORD | `streamline2`.`service_configuration` | `fk_service_configuration_service` | A8A1:0:958:1 | X | RECORD | `streamline2`.`service_configuration` | `fk_service_configuration_service` |
+----------------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+-----------------+----------------+------------------+--------------------+--------------------+---------------------------------------+------------------------------------+-----------------+-------------------+-------------------+---------------------------------------+------------------------------------+
I guess I found the reason: the transaction that runs the delete queries used for cleaning up before importing blocks the inserts. I don't know the details yet, though: why it blocks, and why the lock shows up on the FK index.
As a PoC, I manually deleted the rows in service/service_configuration/component/component_process for the cluster before refreshing, then did the refresh: it works. (Though it shows no services in the UI, which originates from the backend API result. I have encountered DAO cache inconsistency issues before when touching the DB manually, so I suspect that, but it may also be an intended limitation; after refreshing it shows properly, so I don't mind for now.)
Maybe we should make the logic sequential (so that it can be done within a single transaction), or split the transaction scope into two parts: cleaning up and importing.
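The second option could be sketched as the following session flow in SQL (illustrative only; the clusterId column, the literal id, and the exact delete statements are assumptions, not the actual Streamline queries — in the code this would mean committing the cleanup transaction before starting the parallel inserts):

```sql
-- Transaction 1: clean up rows belonging to the cluster being refreshed,
-- then commit so the record/gap locks taken by the deletes are released.
START TRANSACTION;
DELETE FROM component
  WHERE serviceId IN (SELECT id FROM service WHERE clusterId = 1);
DELETE FROM service_configuration
  WHERE serviceId IN (SELECT id FROM service WHERE clusterId = 1);
DELETE FROM service WHERE clusterId = 1;
COMMIT;  -- gap locks on fk_service_configuration_service released here

-- Transaction 2: import. The INSERTs no longer contend with the cleanup
-- transaction's shared next-key locks on the FK index.
START TRANSACTION;
INSERT INTO service (...) VALUES (...);
INSERT INTO service_configuration (...) VALUES (...);
COMMIT;
```

The trade-off is that the refresh is no longer atomic: if the import fails after the cleanup committed, the cluster is left without services until a retry succeeds.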