irods / irods_capability_storage_tiering

BSD 3-Clause "New" or "Revised" License
5 stars 10 forks source link

Failure to schedule migration due to change in rule-enqueueing interface #165

Closed alanking closed 3 years ago

alanking commented 3 years ago

When running the test suite, I noticed that tests were failing with the following messages in the log:

May 24 11:45:55 pid:21067 remote addresses: x.x.x.x, x.x.x.x ERROR: rsExecMyRule : -66000, [-]  /irods_git_repo/plugins/rule_engines/irods_rule_engine_plugin-cpp_default_policy/libirods_rule_engine_plugin-cpp_default_policy.cpp:1127:irods::error exec_rule_text(irods::default_re_ctx &, const std::string      &, msParamArray_t *, const std::string &, irods::callback) :  status [SYS_NOT_SUPPORTED]  errno [] -- message [exec_rule_text : policy to invoke is empty]

...one for each failed migration schedule. And then...


May 24 11:46:02 pid:21059 remote addresses: x.x.x.x, x.x.x.x ERROR: data movement scheduling failed - [-66000]::[iRODS Exception:
    file: /irods_plugin/storage_tiering.cpp
    function: void irods::storage_tiering::queue_data_movement(rcComm_t *, const std::string &, const std::string &, const std::string &, const std::string &, const std::string &, const std::string &, const std::string &, const std::string &, const bool, const std::string &)
    line: 713
    code: -66000 (SYS_NOT_SUPPORTED)
    message:
        queue data movement failed for object [/tempZone/home/rods/test_put_gt_max_sql_rows/junk0000] from [ufs0] to [ufs1]
stack trace:
--------------

Dumping stack trace
<0> Offset: 0x6e    Address: 0x7f5d2586bb8e irods::exception::exception(long, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) 
<1> Offset: 0x42    Address: 0x7f5d2586bd52 irods::exception::exception(long, boost::basic_format<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) 
<2> Offset: 0x1208  Address: 0x7f5d1f9a46c8 irods::storage_tiering::queue_data_movement(RcComm*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)
<3> Offset:         Address: 0x7f5d1f9a7edc /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so(+0x7aedc) [0x7f5d1f9a7edc]
<4> Offset: 0x26    Address: 0x7f5d1f9b17c6 irods::query_processor<RcComm>::execute(irods::thread_pool&, RcComm&)::'lambda'()::operator()()
<5> Offset: 0x49    Address: 0x7f5d1f9b1709 void boost::asio::system_executor::dispatch<irods::query_processor<RcComm>::execute(irods::thread_pool&, RcComm&)::'lambda'(), std::__1::allocator<void> >(irods::query_processor<RcComm>::execute(irods::thread_pool&, RcComm&)::'lambda'()&&, std::__1::allocator<void> const&) const
<6> Offset: 0x19    Address: 0x7f5d1f9b16a9 void boost::asio::asio_handler_invoke<boost::asio::detail::work_dispatcher<irods::query_processor<RcComm>::execute(irods::thread_pool&, RcComm&)::'lambda'()> >(boost::asio::detail::work_dispatcher<irods::query_processor<RcComm>::execute(irods::thread_pool&, RcComm&)::'lambda'()>&, ...)
<7> Offset: 0x87    Address: 0x7f5d1f9b1457 boost::asio::detail::executor_op<boost::asio::detail::work_dispatcher<irods::query_processor<RcComm>::execute(irods::thread_pool&, RcComm&)::'lambda'()>, std::__1::allocator<void>, boost::asio::detail::scheduler_operation>::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long)
<8> Offset: 0x307   Address: 0x7f5d24dcb217 boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info&, boost::system::error_code const&)
<9> Offset: 0x171   Address: 0x7f5d24dcad21 boost::asio::detail::scheduler::run(boost::system::error_code&)
<10>    Offset: 0x25    Address: 0x7f5d24dcaba5 boost::asio::detail::posix_thread::func<boost::asio::thread_pool::thread_function>::run()
<11>    Offset:         Address: 0x7f5d24dcab2d /usr/lib/libirods_server.so.4.2.9(boost_asio_detail_posix_thread_function+0xd) [0x7f5d24dcab2d]
<12>    Offset:         Address: 0x7f5d226b56ba /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f5d226b56ba]
<13>    Offset:         Address: 0x7f5d223eb4dd /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f5d223eb4dd]

]

...again, one for each failure to schedule a migration. The overall migration scheduling (that is, enqueueing the scheduling) also fails:

May 24 11:46:02 pid:21059 remote addresses: x.x.x.x, x.x.x.x ERROR: iRODS Exception:
    file: /irods_plugin/storage_tiering.cpp
    function: void irods::storage_tiering::migrate_violating_data_objects(rcComm_t *, const std::string &, const std::string &, const std::string &, const std::string &)
    line: 614
    code: -35000 (SYS_INVALID_OPR_TYPE)
    message:
        scheduling failed for [257] objects for query [SELECT DATA_NAME, COLL_NAME, USER_NAME, DATA_REPL_NUM WHERE META_DATA_ATTR_NAME = 'irods::access_time' AND META_DATA_ATTR_VALUE < '1621871150' AND META_DATA_ATTR_UNITS <> 'irods::storage_tiering::migration_scheduled' AND DATA_RESC_ID IN ('10033',)]
stack trace:
--------------

Dumping stack trace
<0> Offset: 0x6e    Address: 0x7f5d2586bb8e irods::exception::exception(long, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)
<1> Offset: 0x42    Address: 0x7f5d2586bd52 irods::exception::exception(long, boost::basic_format<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)
<2> Offset: 0x5c0   Address: 0x7f5d1f9a2bc0 irods::storage_tiering::migrate_violating_data_objects(RcComm*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)
<3> Offset: 0xc3    Address: 0x7f5d1f9a6eb3 irods::storage_tiering::apply_policy_for_tier_group(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&)
<4> Offset: 0x99f   Address: 0x7f5d1f97883f exec_rule_expression(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback)
<5> Offset: 0x2f    Address: 0x7f5d1f99cc7f std::__1::__function::__func<irods::error (*)(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback), std::__1::allocator<irods::error (*)(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback)>, irods::error (std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback)>::operator()(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*&&, irods::callback&&)
<6> Offset: 0xd8    Address: 0x7f5d247f22f8 irods::pluggable_rule_engine<std::__1::tuple<> >::exec_rule_expression(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback)
<7> Offset: 0x459   Address: 0x7f5d247f1ec9 irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo*, (irods::rule_execution_manager_pack)0>::exec_rule_expression(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*)
<8> Offset: 0x282   Address: 0x7f5d247f18d2 rsExecRuleExpression(RsComm*, ExecRuleExpression*)
<9> Offset: 0x42    Address: 0x7f5d24b14272 irods::api_call_adaptor<ExecRuleExpression*>::operator()(irods::plugin_context&, RsComm*, ExecRuleExpression*)
<10>    Offset: 0x13    Address: 0x7f5d24b141f3 std::__1::__function::__func<irods::api_call_adaptor<ExecRuleExpression*>, std::__1::allocator<irods::api_call_adaptor<ExecRuleExpression*> >, irods::error (irods::plugin_context&, RsComm*, ExecRuleExpression*)>::operator()(irods::plugin_context&, RsComm*&&, ExecRuleExpression*&&)
<11>    Offset: 0x13d2  Address: 0x7f5d249bcf62 int irods::api_entry::call_handler<ExecRuleExpression*>(RsComm*, ExecRuleExpression*)
<12>    Offset: 0x763   Address: 0x7f5d24bc5363 rsApiHandler(RsComm*, int, BytesBuf*, BytesBuf*)
<13>    Offset: 0xa4b   Address: 0x7f5d24bc740b readAndProcClientMsg(RsComm*, int)
<14>    Offset: 0xd08   Address: 0x7f5d24bb8468 agentMain(RsComm*)
<15>    Offset: 0x22b0  Address: 0x7f5d24bb6470 runIrodsAgentFactory(sockaddr_un)
<16>    Offset:         Address: 0x42cee8   /usr/sbin/irodsServer(main+0x1708) [0x42cee8]
<17>    Offset:         Address: 0x7f5d22304840 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f5d22304840]
<18>    Offset:         Address: 0x42b5e9   /usr/sbin/irodsServer(_start+0x29) [0x42b5e9]

It seems that the irods_policy_enqueue_rule interface being used is expecting policy_to_invoke and parameters keywords, which are currently being passed as policy and payload. This seems to fix the tests.

However, even though storage tiering appears to work, there is a lot of noise in the log to the effect of:

May 25 18:10:22 pid:14798 DEBUG: error: syntax error
line 0, col 31
{{"destination-resource":"ufs1","group-name":"example_group","md5":"86ebf1e577db63c0967f16b86919669a","object-path":"/tempZone/home/rods/test_put_gt_max_sql_rows/junk0002","preserve-replicas":false,"rule-engine-instance-name":"irods_rule_engine_plugin-unified_storage_tiering-instance","rule-engine-operation":"irods_policy_data_movement","source-replica-number":"0","source-resource":"ufs0","user-name":"rods","verification-type":"catalog"}}
                               ^

I suspect this is due to the parameters being passed to irods_policy_enqueue_rule not being propagated and/or acknowledged properly when execution time comes. Here is the construction of the rule: https://github.com/irods/irods_capability_storage_tiering/blob/5ada70586fd6075bd16eda2a9a64f2515ec47a82/storage_tiering.cpp#L677-L695

As you can see, the rule-engine-instance-name is being included here, but I think it may be getting dropped.

This only applies to 4-2-stable at this time as this only affects the unified storage tiering plugin and master branch is missing some changes made to support the unified storage tiering plugin.