Open RongQiao opened 6 years ago
@rpoluri do you understand the request here? Is this some part of the underlying Hive metastore DB schema that for some reason we haven't activated?
It's an extra component of MySQL (apparently also available in Aurora: https://aws.amazon.com/blogs/database/analyze-amazon-aurora-mysql-workloads-with-performance-insights/) that gives you more stats on who and what is abusing the database when you have performance problems. https://dev.mysql.com/doc/refman/8.0/en/performance-schema.html It's one of those things you don't need until the database is breaking, but I think it also has a nontrivial performance impact of its own.
Wouldn't the owners of the RDS be able to set that up themselves then? i.e. it doesn't need to be part of Apiary? Feels more like a MySQL/Aurora DB admin task?
We manage RDS as part of Apiary Data Lake. This may be the corresponding option in RDS: https://www.terraform.io/docs/providers/aws/r/rds_cluster_instance.html#performance_insights_enabled I will check and close the issue accordingly.
I am from the Expedia Data Engineering team. We have some AWS MySQL RDS instances serving as Hive metastores, and we have suffered metastore performance issues in the past when users ran commands such as 'alter table recover partitions' on big datasets. Without performance_schema, we don't have much insight into what's going on. The Apiary metastore has more complicated use cases, so I would suggest making performance_schema available for the Apiary metastore.
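For illustration, here is a rough sketch of the kind of insight performance_schema could give once it is enabled. It assumes a Python DB-API 2.0 cursor (for example from PyMySQL) connected to the metastore database; the function name `top_statement_digests` and the constant `TOP_DIGESTS_SQL` are made up for this example and are not part of Apiary.

```python
# Sketch only: rank normalized statements by total execution time.
# Assumes performance_schema is enabled on the MySQL instance and that
# `cursor` is a DB-API 2.0 cursor using the "%s" parameter style
# (as PyMySQL and mysql-connector-python do).

TOP_DIGESTS_SQL = """
SELECT DIGEST_TEXT,
       COUNT_STAR,
       SUM_TIMER_WAIT / 1e12 AS total_seconds,
       SUM_ROWS_EXAMINED
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT %s
"""


def top_statement_digests(cursor, limit=10):
    """Return the `limit` statement digests with the highest total wait time.

    Timer columns in performance_schema are reported in picoseconds,
    so the query divides by 1e12 to get seconds.
    """
    cursor.execute(TOP_DIGESTS_SQL, (limit,))
    return cursor.fetchall()
```

Running something like this while a heavy partition recovery is in progress should show which statement shapes dominate the load, though the exact columns worth inspecting will depend on the problem.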