Open kript opened 2 years ago
Here's the current logic... @d-w-moore maybe we need a paragraph in the README.md explaining with an example...
@kript is it possible there is more than one indexing plugin (as distinguished by the instance name) active in the zone in question?
@d-w-moore that's a horrifying thought... but I don't think so!
"rule_engines": [
{
"instance_name": "irods_rule_engine_plugin-indexing-instance",
"plugin_name": "irods_rule_engine_plugin-indexing",
"plugin_specific_configuration": {}
},
{
"instance_name": "irods_rule_engine_plugin-elasticsearch-instance",
"plugin_name": "irods_rule_engine_plugin-elasticsearch",
"plugin_specific_configuration": {
"hosts": [
"http://user:pass@elasticsearch:19200/"
],
"bulk_count": 100,
"read_size": 4194304,
"job_limit_per_collection_indexing_operation": "500"
}
},
{
"instance_name": "irods_rule_engine_plugin-document_type-instance",
"plugin_name": "irods_rule_engine_plugin-document_type",
"plugin_specific_configuration": {}
},
{
"instance_name": "irods_rule_engine_plugin-storage_tiering-instance",
"plugin_name": "irods_rule_engine_plugin-storage_tiering",
"plugin_specific_configuration": {
"access_time_attribute": "irods::access_time",
"group_attribute": "irods::storage_tiering::group",
"time_attribute": "irods::storage_tiering::time",
"query_attribute": "irods::storage_tiering::query",
"verification_attribute": "irods::storage_tiering::verification",
"data_movement_parameters_attribute": "irods::storage_tiering::restage_delay",
"minimum_restage_tier": "irods::storage_tiering::minimum_restage_tier",
"preserve_replicas": "irods::storage_tiering::preserve_replicas",
"object_limit": "irods::storage_tiering::object_limit",
"default_data_movement_parameters": "<EF>60s DOUBLE UNTIL SUCCESS OR 5 TIMES</EF>",
"minumum_delay_time": "irods::storage_tiering::minimum_delay_time_in_seconds",
"maximum_delay_time": "irods::storage_tiering::maximum_delay_time_in_seconds",
"time_check_string": "TIME_CHECK_STRING",
"data_transfer_log_level": "LOG_NOTICE"
}
},
{
"instance_name": "irods_rule_engine_plugin-apply_access_time-instance",
"plugin_name": "irods_rule_engine_plugin-apply_access_time",
"plugin_specific_configuration": {}
},
{
"instance_name": "irods_rule_engine_plugin-data_verification-instance",
"plugin_name": "irods_rule_engine_plugin-data_verification",
"plugin_specific_configuration": {}
},
{
"instance_name": "irods_rule_engine_plugin-data_replication-instance",
"plugin_name": "irods_rule_engine_plugin-data_replication",
"plugin_specific_configuration": {}
},
{
"instance_name": "irods_rule_engine_plugin-data_movement-instance",
"plugin_name": "irods_rule_engine_plugin-data_movement",
"plugin_specific_configuration": {}
},
{
"instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
"plugin_name": "irods_rule_engine_plugin-irods_rule_language",
"plugin_specific_configuration": {
"re_data_variable_mapping_set": [
"core"
],
"re_function_name_mapping_set": [
"core"
],
"re_rulebase_set": [
"seq",
"core"
],
"regexes_for_supported_peps": [
"ac[^ ]*",
"msi[^ ]*",
"[^ ]*pep_[^ ]*_(pre|post)"
]
},
"shared_memory_instance": "upgraded_irods_rule_language_rule_engine"
},
{
"instance_name": "irods_rule_engine_plugin-cpp_default_policy-instance",
"plugin_name": "irods_rule_engine_plugin-cpp_default_policy",
"plugin_specific_configuration": {}
}
]
},
"rule_engine_namespaces": [
"",
"indexing_"
],
That is the server config on the provider we have designated as a delay server. I've just checked the other two and they have:
"rule_engines": [
{
"instance_name": "irods_rule_engine_plugin-storage_tiering-instance",
"plugin_name": "irods_rule_engine_plugin-storage_tiering",
"plugin_specific_configuration": {
"access_time_attribute" : "irods::access_time",
"group_attribute" : "irods::storage_tiering::group",
"time_attribute" : "irods::storage_tiering::time",
"query_attribute" : "irods::storage_tiering::query",
"verification_attribute" : "irods::storage_tiering::verification",
"data_movement_parameters_attribute" : "irods::storage_tiering::restage_delay",
"minimum_restage_tier" : "irods::storage_tiering::minimum_restage_tier",
"preserve_replicas" : "irods::storage_tiering::preserve_replicas",
"object_limit" : "irods::storage_tiering::object_limit",
"default_data_movement_parameters" : "<EF>60s DOUBLE UNTIL SUCCESS OR 5 TIMES</EF>",
"minumum_delay_time" : "irods::storage_tiering::minimum_delay_time_in_seconds",
"maximum_delay_time" : "irods::storage_tiering::maximum_delay_time_in_seconds",
"time_check_string" : "TIME_CHECK_STRING",
"data_transfer_log_level" : "LOG_NOTICE"
}
},
{
"instance_name": "irods_rule_engine_plugin-apply_access_time-instance",
"plugin_name": "irods_rule_engine_plugin-apply_access_time",
"plugin_specific_configuration": {
}
},
{
"instance_name": "irods_rule_engine_plugin-data_verification-instance",
"plugin_name": "irods_rule_engine_plugin-data_verification",
"plugin_specific_configuration": {
}
},
{
"instance_name": "irods_rule_engine_plugin-data_replication-instance",
"plugin_name": "irods_rule_engine_plugin-data_replication",
"plugin_specific_configuration": {
}
},
{
"instance_name": "irods_rule_engine_plugin-data_movement-instance",
"plugin_name": "irods_rule_engine_plugin-data_movement",
"plugin_specific_configuration": {
}
},
{
"instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
"plugin_name": "irods_rule_engine_plugin-irods_rule_language",
"plugin_specific_configuration": {
"re_data_variable_mapping_set": [
"core"
],
"re_function_name_mapping_set": [
"core"
],
"re_rulebase_set": [
"seq",
"core"
],
"regexes_for_supported_peps": [
"ac[^ ]*",
"msi[^ ]*",
"[^ ]*pep_[^ ]*_(pre|post)"
]
},
"shared_memory_instance": "upgraded_irods_rule_language_rule_engine"
},
{
"instance_name": "irods_rule_engine_plugin-cpp_default_policy-instance",
"plugin_name": "irods_rule_engine_plugin-cpp_default_policy",
"plugin_specific_configuration": {
}
}
]
},
"rule_engine_namespaces": [
""
],
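One way to check whether more than one indexing plugin instance is active in the zone is to compare the `rule_engines` lists from each server. The following is a minimal sketch, assuming each list has already been loaded from that server's config (for example with `json.load`). The helper names `instances_of` and `servers_running` and the server labels are hypothetical and not part of iRODS; only the plugin and instance names come from the configs posted above.

```python
from typing import Dict, List

INDEXING_PLUGIN = "irods_rule_engine_plugin-indexing"


def instances_of(rule_engines: List[dict], plugin_name: str) -> List[str]:
    """Return the instance names in one server's rule_engines list that load plugin_name."""
    return [
        entry["instance_name"]
        for entry in rule_engines
        if entry.get("plugin_name") == plugin_name
    ]


def servers_running(configs: Dict[str, List[dict]], plugin_name: str) -> Dict[str, List[str]]:
    """Map each server label to its instances of plugin_name, skipping servers with none."""
    found = {}
    for server, rule_engines in configs.items():
        names = instances_of(rule_engines, plugin_name)
        if names:
            found[server] = names
    return found


# Abbreviated versions of the two configs posted above; the server labels are made up.
delay_server = [
    {"instance_name": "irods_rule_engine_plugin-indexing-instance",
     "plugin_name": "irods_rule_engine_plugin-indexing"},
    {"instance_name": "irods_rule_engine_plugin-storage_tiering-instance",
     "plugin_name": "irods_rule_engine_plugin-storage_tiering"},
]
other_provider = [
    {"instance_name": "irods_rule_engine_plugin-storage_tiering-instance",
     "plugin_name": "irods_rule_engine_plugin-storage_tiering"},
]

result = servers_running(
    {"delay-server": delay_server, "other-provider": other_provider},
    INDEXING_PLUGIN,
)
print(result)
```

With the abbreviated configs above, only the delay server reports an indexing instance, which matches what the two pasted configs show.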
@kript thanks for the forensic evidence : ) I'll check this out today on my end.
@kript - I gave it a couple of runs at the throttle limit of 500 today and used the `iquest`/`grep` commands you posted above, but never saw the number of DB connections go above 501, nor did the number of jobs exceed 501, as determined by the command
iquest --no-page '%s' "select RULE_EXEC_NAME where RULE_EXEC_NAME like '%/TESTCOL/%' " | grep -E 'job-category-tag":"[0-9]+-[0-9]+' | wc -l
(where `TESTCOL` is the name of the AVU-annotated top-level collection). Btw, the `like` clause and the `job-category-tag` grep in the pipeline are a good formula for making sure the jobs you're including in your count do, indeed, belong to the indexing plugin.
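The same count can be done in a script once the `iquest` output has been captured. This is a minimal sketch, assuming the output is available as a list of lines. The function name `count_indexing_jobs` and the sample lines are hypothetical (real rule text will look different); only the regular expression is taken from the pipeline above.

```python
import re
from typing import Iterable

# Same pattern as the grep -E expression in the pipeline.
JOB_TAG = re.compile(r'job-category-tag":"[0-9]+-[0-9]+')


def count_indexing_jobs(lines: Iterable[str]) -> int:
    """Count lines containing a job-category-tag, mirroring `grep -E ... | wc -l`."""
    return sum(1 for line in lines if JOB_TAG.search(line))


# Made-up sample lines standing in for captured iquest output.
sample = [
    '{"job-category-tag":"12-34","path":"/tempZone/home/rods/TESTCOL/a"}',
    '{"job-category-tag":"12-35","path":"/tempZone/home/rods/TESTCOL/b"}',
    '{"other-rule":"/tempZone/home/rods/TESTCOL/c"}',
]

count = count_indexing_jobs(sample)
print(count)
```

Comparing the result against the configured limit (500 in this thread) at intervals would show whether the throttle is holding.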
@korydraughn and I also considered the possibility, after looking into the 4.2.7 `irodsReServer` source code, that under some conditions the connection pools to the DB might build up in memory, especially when more delayed task requests come in than the threads on the provider can deal with at a time. That seems a likely possibility if the issue you've recorded here is something you've dependably reproduced (i.e., more than once with similar results).
@kript Let me know if you'd like to set up a call to look into it further. I'm pretty flexible this week.
Installed `irods-rule-engine-plugin-indexing` version 4.2.7.1.
Set the limit to 500.
Add a collection which has well over that number of items of metadata.
Observe the delay server queue jump to 4.7k.
As I understood it, it should have no more than 500 rules in the queue at a time? Or have I misunderstood?