irods / irods_capability_storage_tiering

BSD 3-Clause "New" or "Revised" License

Implement storage tiering for data from remote zones #228

Closed cookie33 closed 8 months ago

cookie33 commented 8 months ago

Hi, this is a first attempt at adapting storage tiering to handle data from remote zones. The file "exec_as_user.hpp" has:

        rodsLog(
            LOG_ERROR,
                "executing as user [%s] fom zone [%s]",
                user.userName,
                user.rodsZone);

It shows that the correct user and remote zone are being used after the modifications:

{"log_category":"legacy","log_level":"error","log_message":"executing as user [robertv] fom zone [igor]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":22550,"server_timestamp":"2023-11-21T10:35:09.037Z","server_type":"agent","server_zone":"frank"}

But the operation is not actually executed as that user from the remote zone. The following functions still fail:

{"log_category":"legacy","log_level":"error","log_message":"data movement scheduling failed - [-808000]::[iRODS Exception:\n    file: /home/robertv/git/irods_capability_storage_tiering/storage_tiering.cpp\n    function: std::string irods::storage_tiering::get_metadata_for_data_object(rcComm_t *, const std::string &, const std::string &)\n    line: 92\n    code: -808000 (CAT_NO_ROWS_FOUND)\n    message:\n        no results found for object [/frank/home/robertv#igor/test_20231121_01.txt] with attribute [irods::access_time]\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# irods::query_processor<RcComm>::execute(irods::thread_pool&, RcComm&)::'lambda'()::operator()() in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 4# boost::asio::detail::executor_op<boost::asio::detail::binder0<irods::query_processor<RcComm>::execute(irods::thread_pool&, RcComm&)::'lambda'()>, std::__1::allocator<void>, boost::asio::detail::scheduler_operation>::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 5# boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info&, boost::system::error_code const&) in /lib/libirods_server.so.4.3.1\n 6# boost::asio::detail::scheduler::run(boost::system::error_code&) in /lib/libirods_server.so.4.3.1\n 7# boost::asio::detail::posix_thread::func<boost::asio::thread_pool::thread_function>::run() in /lib/libirods_server.so.4.3.1\n 8# boost_asio_detail_posix_thread_function in /lib/libirods_server.so.4.3.1\n 9# 0x00007FC191624EA5 in /lib64/libpthread.so.0\n10# 
clone in /lib64/libc.so.6\n\n]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":22550,"server_timestamp":"2023-11-21T10:35:09.095Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"error","log_message":"iRODS Exception:\n    file: /home/robertv/git/irods_capability_storage_tiering/storage_tiering.cpp\n    function: void irods::storage_tiering::migrate_violating_data_objects(rcComm_t *, const std::string &, const std::string &, const std::string &, const std::string &)\n    line: 616\n    code: -35000 (SYS_INVALID_OPR_TYPE)\n    message:\n        scheduling failed for [1] objects for query [SELECT DATA_NAME, COLL_NAME, USER_NAME, USER_ZONE, DATA_REPL_NUM WHERE META_DATA_ATTR_NAME = 'irods::access_time' AND META_DATA_ATTR_VALUE < '1700562788' AND META_DATA_ATTR_UNITS <> 'irods::storage_tiering::migration_scheduled' AND DATA_RESC_ID IN ('10002',)]\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# irods::log(irods::exception const&) in /lib/libirods_common.so.4.3.1\n 4# irods::storage_tiering::migrate_violating_data_objects(RcComm*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 5# irods::storage_tiering::apply_policy_for_tier_group(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 6# exec_rule_expression(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, 
irods::callback) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 7# std::__1::__function::__func<irods::error (*)(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback), std::__1::allocator<irods::error (*)(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback)>, irods::error (std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback)>::operator()(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*&&, irods::callback&&) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 8# irods::pluggable_rule_engine<std::__1::tuple<> >::exec_rule_expression(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback) in /lib/libirods_server.so.4.3.1\n 9# irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo*, (irods::rule_execution_manager_pack)0>::exec_rule_expression(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*) in /lib/libirods_server.so.4.3.1\n10# rsExecRuleExpression(RsComm*, ExecRuleExpression*) in /lib/libirods_server.so.4.3.1\n11# irods::api_call_adaptor<ExecRuleExpression*>::operator()(irods::plugin_context&, RsComm*, ExecRuleExpression*) in /lib/libirods_server.so.4.3.1\n12# std::__1::__function::__func<irods::api_call_adaptor<ExecRuleExpression*>, std::__1::allocator<irods::api_call_adaptor<ExecRuleExpression*> >, irods::error (irods::plugin_context&, RsComm*, 
ExecRuleExpression*)>::operator()(irods::plugin_context&, RsComm*&&, ExecRuleExpression*&&) in /lib/libirods_server.so.4.3.1\n13# int irods::api_entry::call_handler<ExecRuleExpression*>(RsComm*, ExecRuleExpression*) in /lib/libirods_server.so.4.3.1\n14# rsApiHandler(RsComm*, int, BytesBuf*, BytesBuf*) in /lib/libirods_server.so.4.3.1\n15# readAndProcClientMsg(RsComm*, int) in /lib/libirods_server.so.4.3.1\n16# agentMain(RsComm*) in /lib/libirods_server.so.4.3.1\n17# runIrodsAgentFactory(sockaddr_un) in /lib/libirods_server.so.4.3.1\n18# main::$_5::operator()() const in /usr/sbin/irodsServer\n19# main in /usr/sbin/irodsServer\n20# __libc_start_main in /lib64/libc.so.6\n21# _start in /usr/sbin/irodsServer\n\n","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":22550,"server_timestamp":"2023-11-21T10:35:09.198Z","server_type":"agent","server_zone":"frank"}
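
The error codes in entries like the ones above can be pulled out with a short script. This is a minimal sketch, assuming each log entry is a single JSON object on its own line (the paste above is wrapped, so it would need to be re-joined first) and that the code appears in `log_message` in the form `code: -808000 (CAT_NO_ROWS_FOUND)`. The helper name `error_codes` is hypothetical.

```python
import json
import re

# Matches the "code: -808000 (CAT_NO_ROWS_FOUND)" fragment seen in log_message.
CODE_PATTERN = re.compile(r"code: (-?\d+) \((\w+)\)")


def error_codes(lines):
    """Yield (timestamp, code, name) for each error entry that carries an iRODS error code."""
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip wrapped, truncated, or non-JSON lines
        if not isinstance(entry, dict) or entry.get("log_level") != "error":
            continue
        match = CODE_PATTERN.search(entry.get("log_message", ""))
        if match:
            yield entry.get("server_timestamp"), int(match.group(1)), match.group(2)
```

Entries logged at error level without a code, such as the "executing as user" message, are skipped by this sketch.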

It still functions for files created by local users in the zone.

What did we miss? Can you give us a pointer?

Greetings

cookie33 commented 8 months ago

We think it uses the same mechanism that an iRODS admin user can use:

export clientUserName=<USERNAME>
export clientRodsZone=<remoteZone>

And then the iRODS admin impersonates the user from the remote zone.

Is that correct?
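
For illustration, the environment-variable approach described above could be scripted roughly as follows. This is only a sketch: the helper names `impersonation_env` and `run_as` are hypothetical, and it assumes the icommands are installed and that the authenticated account is an iRODS admin.

```python
import os
import subprocess


def impersonation_env(user_name, zone_name, base_env=None):
    """Return a copy of the environment with the two variables from the comment above set."""
    env = dict(os.environ if base_env is None else base_env)
    env["clientUserName"] = user_name
    env["clientRodsZone"] = zone_name
    return env


def run_as(user_name, zone_name, command):
    """Run an icommand (given as a list of arguments) under the impersonation environment."""
    return subprocess.run(
        command,
        env=impersonation_env(user_name, zone_name),
        capture_output=True,
        text=True,
        check=False,
    )
```

A call such as `run_as("robertv", "igor", ["ils", "/frank/home/robertv#igor"])` would then list the collection as that user, if the assumptions above hold.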

korydraughn commented 8 months ago

It's possible you're triggering a redirection between servers. If that's true, exec_as_user is not enough.

To get around that, either a new connection must be made as the target user or rc_switch_user (4.3.1 only) must be used.

cookie33 commented 8 months ago

How can we check if we are triggering a redirection between servers? Is there some logging we can do to see it?

korydraughn commented 8 months ago

The only way to detect that is to view the logs of other servers in the zone, specifically storage resource servers and the provider.

If you're running iRODS 4.3.1, configuring syslog for the servers to write to a centralized location will help make tracking API redirection easier.
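
Once the logs are collected in one place, a rough way to spot a possible redirection is to check whether the same API shows up on more than one server. The sketch below assumes one JSON object per line with the `request_api_name`, `server_host`, and `server_pid` fields seen in the log output in this thread; the function name `hosts_by_api` is hypothetical, and seeing several hosts for one API is only a hint, not proof of redirection.

```python
import json
from collections import defaultdict


def hosts_by_api(lines):
    """Map each request_api_name to the set of (server_host, server_pid) pairs that logged it."""
    seen = defaultdict(set)
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON or truncated lines
        if not isinstance(entry, dict):
            continue
        api = entry.get("request_api_name")
        if api is None:
            continue
        seen[api].add((entry.get("server_host"), entry.get("server_pid")))
    return dict(seen)
```

If an API name maps to pairs with different `server_host` values around the same timestamp, that request is worth a closer look in the individual server logs.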

cookie33 commented 8 months ago

Hi, I am testing it on a single-node iRODS instance, and all log output goes to the same log file. An example follows:

Put a file from the remote zone igor:

[irodstest2]:~
robertv$ date ; iput .bashrc /frank/home/robertv#igor/test_20231122_10.txt -R eudatCache
Wed Nov 22 11:04:40 CET 2023

[irodstest2]:~
robertv$ ils /frank/home/robertv#igor/test_20231122_10.txt
  /frank/home/robertv#igor/test_20231122_10.txt

[irodstest2]:~
robertv$ imeta ls -d /frank/home/robertv#igor/test_20231122_10.txt
AVUs defined for dataObj /frank/home/robertv#igor/test_20231122_10.txt:
attribute: irods::access_time
value: 1700647481
units:

The entries in the log file on the remote zone frank, where the file is in tiered storage:

{"log_category":"legacy","log_level":"info","log_message":"writeLine: inString = KeyValue[17]:create_mode=0;dataIncluded=;dataSize=231;dataType=generic;data_size=231;destRescName=eudatCache;num_threads=0;obj_path=/frank/home/robertv#igor/test_20231122_10.txt;offset=0;openType=1;open_flags=578;opr_type=1;remoteZoneOpr=remoteCreate;resc_hier=eudatCache;selObjType=dataObj;selected_hierarchy=eudatCache;translatedPath=;\n","request_api_name":"DATA_OBJ_PUT_AN","request_api_number":606,"request_api_version":"d","request_client_user":"robertv","request_host":"145.100.3.239","request_proxy_user":"rods","request_release_version":"rods4.2.12","server_host":"irodstest1.storage.surfsara.nl","server_pid":17428,"server_timestamp":"2023-11-22T10:04:41.014Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"writeLine: inString = resource name = [eudatCache]\n","request_api_name":"DATA_OBJ_PUT_AN","request_api_number":606,"request_api_version":"d","request_client_user":"robertv","request_host":"145.100.3.239","request_proxy_user":"rods","request_release_version":"rods4.2.12","server_host":"irodstest1.storage.surfsara.nl","server_pid":17428,"server_timestamp":"2023-11-22T10:04:41.014Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"writeLine: inString = Entering pep_api_data_obj_put_post for /frank/home/robertv#igor/test_20231122_10.txt\n","request_api_name":"DATA_OBJ_PUT_AN","request_api_number":606,"request_api_version":"d","request_client_user":"robertv","request_host":"145.100.3.239","request_proxy_user":"rods","request_release_version":"rods4.2.12","server_host":"irodstest1.storage.surfsara.nl","server_pid":17428,"server_timestamp":"2023-11-22T10:04:41.014Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"use default query for [eudatCache]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":17512,"server_timestamp":"2023-11-22T10:05:03.182Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"use default query for [eudatCache]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":17732,"server_timestamp":"2023-11-22T10:06:34.509Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"use default query for [eudatCache]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":17887,"server_timestamp":"2023-11-22T10:07:35.768Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"found 5 objects for resc [eudatCache] with query [SELECT DATA_NAME, COLL_NAME, USER_NAME, USER_ZONE, DATA_REPL_NUM WHERE META_DATA_ATTR_NAME = 'irods::access_time' AND META_DATA_ATTR_VALUE < '1700647535' AND META_DATA_ATTR_UNITS <> 'irods::storage_tiering::migration_scheduled' AND DATA_RESC_ID IN ('10002',)] type [0]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":17887,"server_timestamp":"2023-11-22T10:07:35.774Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"irods::storage_tiering :: delay params for [eudatCache] - [<INST_NAME>irods_rule_engine_plugin-unified_storage_tiering-instance</INST_NAME><EF>1h DOUBLE UNTIL SUCCESS OR 6 TIMES</EF><PLUSET>6s</PLUSET>]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":17887,"server_timestamp":"2023-11-22T10:07:36.066Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"error","log_message":"executing as user [robertv] fom zone [igor]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":17887,"server_timestamp":"2023-11-22T10:07:36.066Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"error","log_message":"data movement scheduling failed - [-808000]::[iRODS Exception:\n    file: /home/robertv/git/irods_capability_storage_tiering/storage_tiering.cpp\n    function: std::string irods::storage_tiering::get_metadata_for_data_object(rcComm_t *, const std::string &, const std::string &)\n    line: 92\n    code: -808000 (CAT_NO_ROWS_FOUND)\n    message:\n        no results found for object [/frank/home/robertv#igor/test_20231122_10.txt] with attribute [irods::access_time]\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# irods::query_processor<RcComm>::execute(irods::thread_pool&, RcComm&)::'lambda'()::operator()() in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 4# boost::asio::detail::executor_op<boost::asio::detail::binder0<irods::query_processor<RcComm>::execute(irods::thread_pool&, RcComm&)::'lambda'()>, std::__1::allocator<void>, boost::asio::detail::scheduler_operation>::do_complete(void*, boost::asio::detail::scheduler_operation*, boost::system::error_code const&, unsigned long) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 5# boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info&, boost::system::error_code const&) in /lib/libirods_server.so.4.3.1\n 6# boost::asio::detail::scheduler::run(boost::system::error_code&) in /lib/libirods_server.so.4.3.1\n 7# boost::asio::detail::posix_thread::func<boost::asio::thread_pool::thread_function>::run() in /lib/libirods_server.so.4.3.1\n 8# boost_asio_detail_posix_thread_function in /lib/libirods_server.so.4.3.1\n 9# 0x00007FC191624EA5 in /lib64/libpthread.so.0\n10# 
clone in /lib64/libc.so.6\n\n]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":17887,"server_timestamp":"2023-11-22T10:07:36.254Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"error","log_message":"iRODS Exception:\n    file: /home/robertv/git/irods_capability_storage_tiering/storage_tiering.cpp\n    function: void irods::storage_tiering::migrate_violating_data_objects(rcComm_t *, const std::string &, const std::string &, const std::string &, const std::string &)\n    line: 616\n    code: -35000 (SYS_INVALID_OPR_TYPE)\n    message:\n        scheduling failed for [1] objects for query [SELECT DATA_NAME, COLL_NAME, USER_NAME, USER_ZONE, DATA_REPL_NUM WHERE META_DATA_ATTR_NAME = 'irods::access_time' AND META_DATA_ATTR_VALUE < '1700647535' AND META_DATA_ATTR_UNITS <> 'irods::storage_tiering::migration_scheduled' AND DATA_RESC_ID IN ('10002',)]\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# irods::log(irods::exception const&) in /lib/libirods_common.so.4.3.1\n 4# irods::storage_tiering::migrate_violating_data_objects(RcComm*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 5# irods::storage_tiering::apply_policy_for_tier_group(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 6# exec_rule_expression(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, 
irods::callback) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 7# std::__1::__function::__func<irods::error (*)(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback), std::__1::allocator<irods::error (*)(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback)>, irods::error (std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback)>::operator()(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*&&, irods::callback&&) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 8# irods::pluggable_rule_engine<std::__1::tuple<> >::exec_rule_expression(std::__1::tuple<>&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*, irods::callback) in /lib/libirods_server.so.4.3.1\n 9# irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo*, (irods::rule_execution_manager_pack)0>::exec_rule_expression(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, MsParamArray*) in /lib/libirods_server.so.4.3.1\n10# rsExecRuleExpression(RsComm*, ExecRuleExpression*) in /lib/libirods_server.so.4.3.1\n11# irods::api_call_adaptor<ExecRuleExpression*>::operator()(irods::plugin_context&, RsComm*, ExecRuleExpression*) in /lib/libirods_server.so.4.3.1\n12# std::__1::__function::__func<irods::api_call_adaptor<ExecRuleExpression*>, std::__1::allocator<irods::api_call_adaptor<ExecRuleExpression*> >, irods::error (irods::plugin_context&, RsComm*, 
ExecRuleExpression*)>::operator()(irods::plugin_context&, RsComm*&&, ExecRuleExpression*&&) in /lib/libirods_server.so.4.3.1\n13# int irods::api_entry::call_handler<ExecRuleExpression*>(RsComm*, ExecRuleExpression*) in /lib/libirods_server.so.4.3.1\n14# rsApiHandler(RsComm*, int, BytesBuf*, BytesBuf*) in /lib/libirods_server.so.4.3.1\n15# readAndProcClientMsg(RsComm*, int) in /lib/libirods_server.so.4.3.1\n16# agentMain(RsComm*) in /lib/libirods_server.so.4.3.1\n17# runIrodsAgentFactory(sockaddr_un) in /lib/libirods_server.so.4.3.1\n18# main::$_5::operator()() const in /usr/sbin/irodsServer\n19# main in /usr/sbin/irodsServer\n20# __libc_start_main in /lib64/libc.so.6\n21# _start in /usr/sbin/irodsServer\n\n","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":17887,"server_timestamp":"2023-11-22T10:07:36.489Z","server_type":"agent","server_zone":"frank"}
cookie33 commented 8 months ago

With the latest commit, we finally got it to replicate data that was ingested from a remote zone. The last step is to update the automated tests to include data from a remote zone user.

cookie33 commented 8 months ago

Example logging output:

{"log_category":"legacy","log_level":"info","log_message":"writeLine: inString = KeyValue[17]:create_mode=0;dataIncluded=;dataSize=231;dataType=generic;data_size=231;destRescName=eudatCache;num_threads=0;obj_path=/frank/home/robertv#igor/test_20231124_02.txt;offset=0;openType=1;open_flags=578;opr_type=1;remoteZoneOpr=remoteCreate;resc_hier=eudatCache;selObjType=dataObj;selected_hierarchy=eudatCache;translatedPath=;\n","request_api_name":"DATA_OBJ_PUT_AN","request_api_number":606,"request_api_version":"d","request_client_user":"robertv","request_host":"145.100.3.239","request_proxy_user":"rods","request_release_version":"rods4.2.12","server_host":"irodstest1.storage.surfsara.nl","server_pid":7131,"server_timestamp":"2023-11-24T14:12:21.816Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"writeLine: inString = resource name = [eudatCache]\n","request_api_name":"DATA_OBJ_PUT_AN","request_api_number":606,"request_api_version":"d","request_client_user":"robertv","request_host":"145.100.3.239","request_proxy_user":"rods","request_release_version":"rods4.2.12","server_host":"irodstest1.storage.surfsara.nl","server_pid":7131,"server_timestamp":"2023-11-24T14:12:21.816Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"writeLine: inString = Entering pep_api_data_obj_put_post for /frank/home/robertv#igor/test_20231124_02.txt\n","request_api_name":"DATA_OBJ_PUT_AN","request_api_number":606,"request_api_version":"d","request_client_user":"robertv","request_host":"145.100.3.239","request_proxy_user":"rods","request_release_version":"rods4.2.12","server_host":"irodstest1.storage.surfsara.nl","server_pid":7131,"server_timestamp":"2023-11-24T14:12:21.816Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"use default query for [eudatCache]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7272,"server_timestamp":"2023-11-24T14:13:13.631Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"use default query for [eudatCache]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7465,"server_timestamp":"2023-11-24T14:14:15.164Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"use default query for [eudatCache]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7641,"server_timestamp":"2023-11-24T14:15:16.346Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"found 5 objects for resc [eudatCache] with query [SELECT DATA_NAME, COLL_NAME, USER_NAME, USER_ZONE, DATA_REPL_NUM WHERE META_DATA_ATTR_NAME = 'irods::access_time' AND META_DATA_ATTR_VALUE < '1700835196' AND META_DATA_ATTR_UNITS <> 'irods::storage_tiering::migration_scheduled' AND DATA_RESC_ID IN ('10002',)] type [0]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7641,"server_timestamp":"2023-11-24T14:15:16.352Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"irods::storage_tiering :: delay params for [eudatCache] - [<INST_NAME>irods_rule_engine_plugin-unified_storage_tiering-instance</INST_NAME><EF>1h DOUBLE UNTIL SUCCESS OR 6 TIMES</EF><PLUSET>3s</PLUSET>]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7641,"server_timestamp":"2023-11-24T14:15:16.638Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"error","log_message":"executing as user [robertv] fom zone [igor]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7641,"server_timestamp":"2023-11-24T14:15:16.639Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"error","log_message":"executing as user [robertv] fom zone [igor]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7641,"server_timestamp":"2023-11-24T14:15:16.644Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"irods::storage_tiering migrating [/frank/home/robertv#igor/test_20231124_02.txt] from [eudatCache] to [eudatPnfs]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7641,"server_timestamp":"2023-11-24T14:15:16.653Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"error","log_message":"executing as user [robertv] fom zone [igor]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7731,"server_timestamp":"2023-11-24T14:15:46.893Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"verify_replica_for_destination_resource - [filesystem] [/frank/home/robertv#igor/test_20231124_02.txt] [eudatCache] [eudatPnfs]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7731,"server_timestamp":"2023-11-24T14:15:46.964Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"verify_replica_for_destination_resource - source attributes: [/data/eudatCache/Vault/home/robertv#igor/test_20231124_02.txt] [231] [eudatCache] []","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7731,"server_timestamp":"2023-11-24T14:15:46.967Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"verify_replica_for_destination_resource - destination attributes: [/data/eudatPnfs/home/robertv#igor/test_20231124_02.txt] [231] [eudatPnfs] []","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7731,"server_timestamp":"2023-11-24T14:15:46.969Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"info","log_message":"verify_replica_for_destination_resource - verify filesystem: 1 - 231 vs 231","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7731,"server_timestamp":"2023-11-24T14:15:46.970Z","server_type":"agent","server_zone":"frank"}
{"log_category":"legacy","log_level":"error","log_message":"executing as user [robertv] fom zone [igor]","request_api_name":"EXEC_RULE_EXPRESSION_AN","request_api_number":1206,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.238","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest1.storage.surfsara.nl","server_pid":7731,"server_timestamp":"2023-11-24T14:15:46.981Z","server_type":"agent","server_zone":"frank"}
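
For context, the "executing as user" message above is logged from `exec_as_user`, which, going by `exec_as_user.hpp` in this PR, temporarily overwrites the client user name and zone on the connection and restores them when the scope exits. Below is a minimal sketch of that save/swap/restore pattern. It uses stand-in types instead of the real `rcComm_t`, and `connection`, `client_user`, `scope_exit`, `copy_name`, `name_len` and `exec_as_user_sketch` are all hypothetical names chosen for illustration, not the plugin's actual code.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <utility>

// Stand-ins for the iRODS types, for illustration only.
// Assumption: the real buffers are sized by iRODS constants such as NAME_LEN.
constexpr std::size_t name_len = 64;

struct client_user {
    char userName[name_len];
    char rodsZone[name_len];
};

struct connection {
    client_user clientUser;
};

// Minimal scope guard: runs the stored callable when it goes out of scope.
template <typename F>
class scope_exit {
public:
    explicit scope_exit(F f) : f_(std::move(f)) {}
    ~scope_exit() { f_(); }
    scope_exit(const scope_exit&) = delete;
    scope_exit& operator=(const scope_exit&) = delete;
private:
    F f_;
};

// Bounded copy that always null-terminates (truncates names that are too long).
inline void copy_name(char* dst, const std::string& src)
{
    std::snprintf(dst, name_len, "%s", src.c_str());
}

// Run func(conn) with the client user and zone swapped, then restore both,
// even if func throws.
template <typename Function>
int exec_as_user_sketch(connection& conn,
                        const std::string& user_name,
                        const std::string& user_zone,
                        Function func)
{
    auto& user = conn.clientUser;

    // Remember the identity currently set on the connection.
    const std::string old_name = user.userName;
    const std::string old_zone = user.rodsZone;

    copy_name(user.userName, user_name);
    copy_name(user.rodsZone, user_zone);

    auto restore_fn = [&user, &old_name, &old_zone] {
        copy_name(user.userName, old_name);
        copy_name(user.rodsZone, old_zone);
    };
    scope_exit<decltype(restore_fn)> restore{restore_fn};

    return func(conn);
}
```

The sketch uses the same bound for both the swap and the restore. The real header mixes `NAME_LEN` and `MAX_NAME_LEN` between the two steps, which may be worth a second look.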
cookie33 commented 8 months ago

Could you run the automated tests against this? And could you give me a pointer on how to add a test with a user from a remote zone, or add one yourselves?

trel commented 8 months ago

Yes, we will test this and add a test. Thank you!

cookie33 commented 8 months ago

The following lines need to be adapted in the tests:

grep -Ri user_name *
test_plugin_unified_storage_tiering.py:            admin_session.assert_icommand('''imeta set -R rnd1 irods::storage_tiering::query "SELECT DATA_NAME, COLL_NAME, USER_NAME, DATA_REPL_NUM where RESC_NAME = 'ufs2' || = 'ufs3' and META_DATA_ATTR_NAME = 'irods::access_time' and META_DATA_ATTR_VALUE < 'TIME_CHECK_STRING'"''')
test_plugin_unified_storage_tiering.py:            admin_session.assert_icommand('''imeta set -R rnd1 irods::storage_tiering::query "SELECT DATA_NAME, COLL_NAME, USER_NAME, DATA_REPL_NUM  where RESC_NAME = 'ufs2' || = 'ufs3' and META_DATA_ATTR_NAME = 'irods::access_time' and META_DATA_ATTR_VALUE < 'TIME_CHECK_STRING'"''')
test_plugin_unified_storage_tiering.py:            admin_session.assert_icommand('''imeta set -R ufs1g2 irods::storage_tiering::query "SELECT DATA_NAME, COLL_NAME, USER_NAME, DATA_REPL_NUM where RESC_NAME = 'ufs1g2' and META_DATA_ATTR_NAME = 'irods::access_time' and META_DATA_ATTR_VALUE < 'TIME_CHECK_STRING'"''')
test_plugin_unified_storage_tiering.py:            admin_session.assert_icommand('''imeta set -R rnd1 irods::custom_storage_tiering::query "SELECT DATA_NAME, COLL_NAME, USER_NAME, DATA_REPL_NUM where RESC_NAME = 'ufs2' || = 'ufs3' and META_DATA_ATTR_NAME = 'irods::custom_access_time' and META_DATA_ATTR_VALUE < 'TIME_CHECK_STRING'"''')
test_plugin_unified_storage_tiering.py:            admin_session.assert_icommand('''imeta add -R ufs0 irods::storage_tiering::query "SELECT DATA_NAME, COLL_NAME, USER_NAME, DATA_REPL_NUM where RESC_NAME = 'ufs0' and META_DATA_ATTR_NAME = 'irods::access_time' and META_DATA_ATTR_VALUE < 'TIME_CHECK_STRING'"''')

They need to include USER_ZONE, as described in the updated README.md.
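
To illustrate the change: the plugin reads the query results by position (`_results[2]`, `_results[3]`, `_results[4]` in `storage_tiering.cpp`), so USER_ZONE has to come directly after USER_NAME and before DATA_REPL_NUM. Here is a small sketch of what an updated query string would look like. `make_tiering_query` is a hypothetical helper, not something in the plugin or its tests, and `TIME_CHECK_STRING` is left as the same placeholder the tests use.

```cpp
#include <string>

// Hypothetical helper (not part of the plugin or its test suite): builds a
// tiering query for one resource with USER_ZONE placed directly after
// USER_NAME, matching the column order of the plugin's default query.
std::string make_tiering_query(const std::string& resource_name)
{
    return "SELECT DATA_NAME, COLL_NAME, USER_NAME, USER_ZONE, DATA_REPL_NUM "
           "where RESC_NAME = '" + resource_name + "' "
           "and META_DATA_ATTR_NAME = 'irods::access_time' "
           "and META_DATA_ATTR_VALUE < 'TIME_CHECK_STRING'";
}
```

For example, `make_tiering_query("ufs0")` gives the string that would replace the query in the `imeta add -R ufs0 irods::storage_tiering::query` line above.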

cookie33 commented 8 months ago

The tests have been updated so that their queries also select the extra USER_ZONE value.

cookie33 commented 8 months ago

The clang formatting seems to be off:

git-clang-format --style=file --diff 67a218520fdc01d07a254a5b4f21aa674a2546ae
diff --git a/exec_as_user.hpp b/exec_as_user.hpp
index 15991ce..d0a9fa4 100644
--- a/exec_as_user.hpp
+++ b/exec_as_user.hpp
@@ -5,9 +5,9 @@
 #include <irods/irods_at_scope_exit.hpp>

 namespace irods {
-    template <typename Function>
-    int exec_as_user(rcComm_t* _comm, const std::string& _user_name, const std::string& _user_zone, Function _func)
-    {
+       template <typename Function>
+       int exec_as_user(rcComm_t* _comm, const std::string& _user_name, const std::string& _user_zone, Function _func)
+       {
         auto& user = _comm->clientUser;

         // need to be able to have a rodsuser/rodsuser 'switch hats'
@@ -16,23 +16,19 @@ namespace irods {
         //}

         const std::string old_user_name = user.userName;
-        const std::string old_user_zone = user.rodsZone;
+               const std::string old_user_zone = user.rodsZone;

-        rstrcpy(user.userName, _user_name.data(), NAME_LEN);
-        rstrcpy(user.rodsZone, _user_zone.data(), NAME_LEN);
+               rstrcpy(user.userName, _user_name.data(), NAME_LEN);
+               rstrcpy(user.rodsZone, _user_zone.data(), NAME_LEN);

-        rodsLog(
-            LOG_DEBUG,
-                "Executing as user [%s] fom zone [%s]",
-                user.userName,
-                user.rodsZone);
+               rodsLog(LOG_DEBUG, "Executing as user [%s] fom zone [%s]", user.userName, user.rodsZone);

-        irods::at_scope_exit<std::function<void()>> at_scope_exit{[&user, &old_user_name, &old_user_zone] {
-            rstrcpy(user.userName, old_user_name.c_str(), MAX_NAME_LEN);
-            rstrcpy(user.rodsZone, old_user_zone.c_str(), MAX_NAME_LEN);
-        }};
+               irods::at_scope_exit<std::function<void()>> at_scope_exit{[&user, &old_user_name, &old_user_zone] {
+                       rstrcpy(user.userName, old_user_name.c_str(), MAX_NAME_LEN);
+                       rstrcpy(user.rodsZone, old_user_zone.c_str(), MAX_NAME_LEN);
+               }};

-        return _func(_comm);
+               return _func(_comm);
     } // exec_as_user

 } // namespace irods
diff --git a/libirods_rule_engine_plugin-unified_storage_tiering.cpp b/libirods_rule_engine_plugin-unified_storage_tiering.cpp
index 0b1d8a2..c6001f3 100644
--- a/libirods_rule_engine_plugin-unified_storage_tiering.cpp
+++ b/libirods_rule_engine_plugin-unified_storage_tiering.cpp
@@ -346,19 +346,18 @@ namespace {
         }
     } // apply_access_time_policy

-    int apply_data_movement_policy(
-        rcComm_t*          _comm,
-        const std::string& _instance_name,
-        const std::string& _object_path,
-        const std::string& _user_name,
-        const std::string& _user_zone,
-        const std::string& _source_replica_number,
-        const std::string& _source_resource,
-        const std::string& _destination_resource,
-        const bool         _preserve_replicas,
-        const std::string& _verification_type) {
-
-        replicate_object_to_resource(
+       int apply_data_movement_policy(rcComm_t* _comm,
+                                      const std::string& _instance_name,
+                                      const std::string& _object_path,
+                                      const std::string& _user_name,
+                                      const std::string& _user_zone,
+                                      const std::string& _source_replica_number,
+                                      const std::string& _source_resource,
+                                      const std::string& _destination_resource,
+                                      const bool _preserve_replicas,
+                                      const std::string& _verification_type)
+       {
+               replicate_object_to_resource(
             _comm,
             _instance_name,
             _source_resource,
@@ -388,9 +387,9 @@ namespace {
                 _preserve_replicas);

         return 0;
-    } // apply_data_movement_policy
+       } // apply_data_movement_policy

-    void apply_restage_movement_policy(
+       void apply_restage_movement_policy(
         const std::string &    _rn,
         ruleExecInfo_t*        _rei,
         std::list<boost::any>& _args) {
@@ -421,16 +420,13 @@ namespace {
                 parser.first_resc(source_resource);

                 auto proxy_conn = irods::proxy_connection();
-                rcComm_t* comm = proxy_conn.make(_rei->rsComm->clientUser.userName, _rei->rsComm->clientUser.rodsZone);
+                               rcComm_t* comm = proxy_conn.make(_rei->rsComm->clientUser.userName, _rei->rsComm->clientUser.rodsZone);

-                irods::storage_tiering st{comm, _rei, plugin_instance_name};
+                               irods::storage_tiering st{comm, _rei, plugin_instance_name};

-                st.migrate_object_to_minimum_restage_tier(
-                    object_path,
-                    _rei->rsComm->clientUser.userName,
-                    _rei->rsComm->clientUser.rodsZone,
-                    source_resource);
-            }
+                               st.migrate_object_to_minimum_restage_tier(
+                                       object_path, _rei->rsComm->clientUser.userName, _rei->rsComm->clientUser.rodsZone, source_resource);
+                       }
             else if("pep_api_data_obj_open_post"   == _rn ||
                     "pep_api_data_obj_create_post" == _rn) {
                 auto it = _args.begin();
@@ -470,15 +466,15 @@ namespace {
                     auto [object_path, resource_name] = opened_objects[l1_idx];

                     auto proxy_conn = irods::proxy_connection();
-                    rcComm_t* comm = proxy_conn.make(_rei->rsComm->clientUser.userName, _rei->rsComm->clientUser.rodsZone);
-
-                    irods::storage_tiering st{comm, _rei, plugin_instance_name};
-                    st.migrate_object_to_minimum_restage_tier(
-                        object_path,
-                        _rei->rsComm->clientUser.userName,
-                        _rei->rsComm->clientUser.rodsZone,
-                        resource_name);
-                }
+                                       rcComm_t* comm =
+                                               proxy_conn.make(_rei->rsComm->clientUser.userName, _rei->rsComm->clientUser.rodsZone);
+
+                                       irods::storage_tiering st{comm, _rei, plugin_instance_name};
+                                       st.migrate_object_to_minimum_restage_tier(object_path,
+                                                                                 _rei->rsComm->clientUser.userName,
+                                                                                 _rei->rsComm->clientUser.rodsZone,
+                                                                                 resource_name);
+                               }
             }
         }
         catch(const boost::bad_any_cast& _e) {
@@ -490,26 +486,24 @@ namespace {
         }
     } // apply_restage_movement_policy

-    int apply_tier_group_metadata_policy(
-        irods::storage_tiering& _st,
-        const std::string& _group_name,
-        const std::string& _object_path,
-        const std::string& _user_name,
-        const std::string& _user_zone,
-        const std::string& _source_replica_number,
-        const std::string& _source_resource,
-        const std::string& _destination_resource) {
-        _st.apply_tier_group_metadata_to_object(
-            _group_name,
-            _object_path,
-            _user_name,
-            _user_zone,
-            _source_replica_number,
-            _source_resource,
-            _destination_resource);
-        return 0;
-    } // apply_tier_group_metadata_policy
-
+       int apply_tier_group_metadata_policy(irods::storage_tiering& _st,
+                                            const std::string& _group_name,
+                                            const std::string& _object_path,
+                                            const std::string& _user_name,
+                                            const std::string& _user_zone,
+                                            const std::string& _source_replica_number,
+                                            const std::string& _source_resource,
+                                            const std::string& _destination_resource)
+       {
+               _st.apply_tier_group_metadata_to_object(_group_name,
+                                                       _object_path,
+                                                       _user_name,
+                                                       _user_zone,
+                                                       _source_replica_number,
+                                                       _source_resource,
+                                                       _destination_resource);
+               return 0;
+       } // apply_tier_group_metadata_policy

 } // namespace

@@ -719,41 +713,39 @@ irods::error exec_rule_expression(
                 irods::storage_tiering::policy::data_movement ==
                 rule_obj.at("rule-engine-operation")) {
             try {
-                // proxy for provided user name and zone
-                const std::string& user_name = rule_obj["user-name"];
-                const std::string& user_zone = rule_obj["user-zone"];
-                auto& pin = plugin_instance_name;
+                               // proxy for provided user name and zone
+                               const std::string& user_name = rule_obj["user-name"];
+                               const std::string& user_zone = rule_obj["user-zone"];
+                               auto& pin = plugin_instance_name;

                 auto proxy_conn = irods::proxy_connection();
-                rcComm_t* comm = proxy_conn.make( rule_obj["user-name"], rule_obj["user-zone"]);
-
-                auto status = irods::exec_as_user(comm, user_name, user_zone, [& pin, & rule_obj](auto& comm) -> int{
-                                    return apply_data_movement_policy(
-                                        comm,
-                                        plugin_instance_name,
-                                        rule_obj["object-path"],
-                                        rule_obj["user-name"],
-                                        rule_obj["user-zone"],
-                                        rule_obj["source-replica-number"],
-                                        rule_obj["source-resource"],
-                                        rule_obj["destination-resource"],
-                                        rule_obj["preserve-replicas"],
-                                        rule_obj["verification-type"]);
-                                    });
-
-                irods::storage_tiering st{comm, rei, plugin_instance_name};
-                status = irods::exec_as_user(comm, user_name, user_zone, [& st, & rule_obj](auto& comm) -> int{
-                                    return apply_tier_group_metadata_policy(
-                                        st,
-                                        rule_obj["group-name"],
-                                        rule_obj["object-path"],
-                                        rule_obj["user-name"],
-                                        rule_obj["user-zone"],
-                                        rule_obj["source-replica-number"],
-                                        rule_obj["source-resource"],
-                                        rule_obj["destination-resource"]);
-                                    });
-            }
+                               rcComm_t* comm = proxy_conn.make(rule_obj["user-name"], rule_obj["user-zone"]);
+
+                               auto status = irods::exec_as_user(comm, user_name, user_zone, [&pin, &rule_obj](auto& comm) -> int {
+                                       return apply_data_movement_policy(comm,
+                                                                         plugin_instance_name,
+                                                                         rule_obj["object-path"],
+                                                                         rule_obj["user-name"],
+                                                                         rule_obj["user-zone"],
+                                                                         rule_obj["source-replica-number"],
+                                                                         rule_obj["source-resource"],
+                                                                         rule_obj["destination-resource"],
+                                                                         rule_obj["preserve-replicas"],
+                                                                         rule_obj["verification-type"]);
+                               });
+
+                               irods::storage_tiering st{comm, rei, plugin_instance_name};
+                               status = irods::exec_as_user(comm, user_name, user_zone, [&st, &rule_obj](auto& comm) -> int {
+                                       return apply_tier_group_metadata_policy(st,
+                                                                               rule_obj["group-name"],
+                                                                               rule_obj["object-path"],
+                                                                               rule_obj["user-name"],
+                                                                               rule_obj["user-zone"],
+                                                                               rule_obj["source-replica-number"],
+                                                                               rule_obj["source-resource"],
+                                                                               rule_obj["destination-resource"]);
+                               });
+                       }
             catch(const irods::exception& _e) {
                 printErrorStack(&rei->rsComm->rError);
                 return ERROR(
diff --git a/proxy_connection.hpp b/proxy_connection.hpp
index ae092a6..d482d29 100644
--- a/proxy_connection.hpp
+++ b/proxy_connection.hpp
@@ -9,26 +9,22 @@ namespace irods {
         rErrMsg_t err_msg;
         rcComm_t* conn;

-        auto make(const std::string clientUser = "", const std::string clientZone = "") -> rcComm_t*
-        {
+               auto make(const std::string clientUser = "", const std::string clientZone = "") -> rcComm_t*
+               {
             rodsEnv env{};
             _getRodsEnv(env);

-            conn = _rcConnect(
-                       env.rodsHost,
-                       env.rodsPort,
-                       env.rodsUserName,
-                       env.rodsZone,
-                       !clientUser.empty() ?
-                           clientUser.c_str() :
-                           env.rodsUserName,
-                       !clientZone.empty() ?
-                           clientZone.c_str() :
-                           env.rodsZone,
-                       &err_msg,
-                       0, 0);
-
-            clientLogin(conn);
+                       conn = _rcConnect(env.rodsHost,
+                                         env.rodsPort,
+                                         env.rodsUserName,
+                                         env.rodsZone,
+                                         !clientUser.empty() ? clientUser.c_str() : env.rodsUserName,
+                                         !clientZone.empty() ? clientZone.c_str() : env.rodsZone,
+                                         &err_msg,
+                                         0,
+                                         0);
+
+                       clientLogin(conn);

             return conn;
         } // make
diff --git a/storage_tiering.cpp b/storage_tiering.cpp
index fa2872e..35d2a2c 100644
--- a/storage_tiering.cpp
+++ b/storage_tiering.cpp
@@ -446,14 +446,13 @@ namespace irods {
         catch(const exception&) {
             const auto leaf_str = get_leaf_resources_string(_resource_name);
             metadata_results results;
-            results.push_back(
-                std::make_pair(boost::str(
-                boost::format("SELECT DATA_NAME, COLL_NAME, USER_NAME, USER_ZONE, DATA_REPL_NUM WHERE META_DATA_ATTR_NAME = '%s' AND META_DATA_ATTR_VALUE < '%s' AND META_DATA_ATTR
-                % config_.access_time_attribute
-                % tier_time
-                % config_.migration_scheduled_flag
-                % leaf_str), ""));
-            rodsLog(
+                       results.push_back(std::make_pair(
+                               boost::str(boost::format("SELECT DATA_NAME, COLL_NAME, USER_NAME, USER_ZONE, DATA_REPL_NUM WHERE "
+                                                    "META_DATA_ATTR_NAME = '%s' AND META_DATA_ATTR_VALUE < '%s' AND "
+                                                    "META_DATA_ATTR_UNITS <> '%s' AND DATA_RESC_ID IN (%s)") %
+                                      config_.access_time_attribute % tier_time % config_.migration_scheduled_flag % leaf_str),
+                               ""));
+                       rodsLog(
                 config_.data_transfer_log_level_value,
                 "use default query for [%s]",
                 _resource_name.c_str());
@@ -568,9 +567,9 @@ namespace irods {
                     object_is_processed[object_path] = 1;

                     auto proxy_conn = irods::proxy_connection();
-                    rcComm_t* comm = proxy_conn.make(_results[2], _results[3]);
+                                       rcComm_t* comm = proxy_conn.make(_results[2], _results[3]);

-                    if(preserve_replicas) {
+                                       if(preserve_replicas) {
                         if(skip_object_in_lower_tier(
                                comm,
                                object_path,
@@ -579,19 +578,18 @@ namespace irods {
                         }
                     }

-                    queue_data_movement(
-                        comm,
-                        config_.instance_name,
-                        _group_name,
-                        object_path,
-                        _results[2],
-                        _results[3],
-                        _results[4],
-                        _source_resource,
-                        _destination_resource,
-                        get_verification_for_resc(comm, _destination_resource),
-                        get_preserve_replicas_for_resc(comm, _source_resource),
-                        get_data_movement_parameters_for_resource(comm, _source_resource));
+                                       queue_data_movement(comm,
+                                                           config_.instance_name,
+                                                           _group_name,
+                                                           object_path,
+                                                           _results[2],
+                                                           _results[3],
+                                                           _results[4],
+                                                           _source_resource,
+                                                           _destination_resource,
+                                                           get_verification_for_resc(comm, _destination_resource),
+                                                           get_preserve_replicas_for_resc(comm, _source_resource),
+                                                           get_data_movement_parameters_for_resource(comm, _source_resource));

                 }; // job

@@ -658,47 +656,41 @@ namespace irods {

     } // schedule_storage_tiering_policy

-    void storage_tiering::queue_data_movement(
-        rcComm_t*          _comm,
-        const std::string& _plugin_instance_name,
-        const std::string& _group_name,
-        const std::string& _object_path,
-        const std::string& _user_name,
-        const std::string& _user_zone,
-        const std::string& _source_replica_number,
-        const std::string& _source_resource,
-        const std::string& _destination_resource,
-        const std::string& _verification_type,
-        const bool         _preserve_replicas,
-        const std::string& _data_movement_params) {
-        if(object_has_migration_metadata_flag(_comm, _user_name, _user_zone, _object_path)) {
-            return;
-        }
-
-        set_migration_metadata_flag_for_object(_comm, _user_name, _user_zone, _object_path);
-
-        nlohmann::json rule_obj =
-        {
-            {"policy_to_invoke", "irods_policy_enqueue_rule"}
-          , {"parameters",
-                {
-                    {"rule-engine-operation",     policy::data_movement}
-                  , {"rule-engine-instance-name", _plugin_instance_name}
-                  , {"group-name",                _group_name}
-                  , {"object-path",               _object_path}
-                  , {"user-name",                 _user_name}
-                  , {"user-zone",                 _user_zone}
-                  , {"source-replica-number",     _source_replica_number}
-                  , {"source-resource",           _source_resource}
-                  , {"destination-resource",      _destination_resource}
-                  , {"preserve-replicas",         _preserve_replicas}
-                  , {"verification-type",         _verification_type}
-                  , {"delay_conditions",          _data_movement_params}
-                }
-            }
-         };
-
-        execMyRuleInp_t exec_inp{};
+       void storage_tiering::queue_data_movement(rcComm_t* _comm,
+                                                 const std::string& _plugin_instance_name,
+                                                 const std::string& _group_name,
+                                                 const std::string& _object_path,
+                                                 const std::string& _user_name,
+                                                 const std::string& _user_zone,
+                                                 const std::string& _source_replica_number,
+                                                 const std::string& _source_resource,
+                                                 const std::string& _destination_resource,
+                                                 const std::string& _verification_type,
+                                                 const bool _preserve_replicas,
+                                                 const std::string& _data_movement_params)
+       {
+               if (object_has_migration_metadata_flag(_comm, _user_name, _user_zone, _object_path)) {
+                       return;
+               }
+
+               set_migration_metadata_flag_for_object(_comm, _user_name, _user_zone, _object_path);
+
+               nlohmann::json rule_obj = {{"policy_to_invoke", "irods_policy_enqueue_rule"},
+                                          {"parameters",
+                                           {{"rule-engine-operation", policy::data_movement},
+                                            {"rule-engine-instance-name", _plugin_instance_name},
+                                            {"group-name", _group_name},
+                                            {"object-path", _object_path},
+                                            {"user-name", _user_name},
+                                            {"user-zone", _user_zone},
+                                            {"source-replica-number", _source_replica_number},
+                                            {"source-resource", _source_resource},
+                                            {"destination-resource", _destination_resource},
+                                            {"preserve-replicas", _preserve_replicas},
+                                            {"verification-type", _verification_type},
+                                            {"delay_conditions", _data_movement_params}}}};
+
+               execMyRuleInp_t exec_inp{};
         rstrcpy(exec_inp.myRule, rule_obj.dump().c_str(), META_STR_LEN);
         msParamArray_t* out_arr{};
         addKeyVal(
@@ -724,9 +716,9 @@ namespace irods {
             _source_resource.c_str(),
             _destination_resource.c_str());

-    } // queue_data_movement
+       } // queue_data_movement

-    std::string storage_tiering::get_replica_number_for_resource(
+       std::string storage_tiering::get_replica_number_for_resource(
         rcComm_t*          _comm,
         const std::string& _object_path,
         const std::string& _resource_name) {
@@ -781,13 +773,12 @@ namespace irods {

     } // get_group_name_by_replica_number

-    void storage_tiering::migrate_object_to_minimum_restage_tier(
-        const std::string& _object_path,
-        const std::string& _user_name,
-        const std::string& _user_zone,
-        const std::string& _source_resource) {
-
-        try {
+       void storage_tiering::migrate_object_to_minimum_restage_tier(const std::string& _object_path,
+                                                                    const std::string& _user_name,
+                                                                    const std::string& _user_zone,
+                                                                    const std::string& _source_resource)
+       {
+               try {
             const auto source_replica_number = get_replica_number_for_resource(
                                                    comm_,
                                                    _object_path,
@@ -806,20 +797,19 @@ namespace irods {
                 return;
             }

-            queue_data_movement(
-                comm_,
-                config_.instance_name,
-                group_name,
-                _object_path,
-                _user_name,
-                _user_zone,
-                source_replica_number,
-                _source_resource,
-                low_tier_resource_name,
-                get_verification_for_resc(comm_, low_tier_resource_name),
-                false,
-                get_data_movement_parameters_for_resource(comm_, _source_resource));
-        }
+                       queue_data_movement(comm_,
+                                           config_.instance_name,
+                                           group_name,
+                                           _object_path,
+                                           _user_name,
+                                           _user_zone,
+                                           source_replica_number,
+                                           _source_resource,
+                                           low_tier_resource_name,
+                                           get_verification_for_resc(comm_, low_tier_resource_name),
+                                           false,
+                                           get_data_movement_parameters_for_resource(comm_, _source_resource));
+               }
         catch(const exception& _e) {
             rodsLog(
                 config_.data_transfer_log_level_value,
@@ -828,9 +818,9 @@ namespace irods {
                 _source_resource.c_str(),
                 _e.what());
         }
-    } // migrate_object_to_minimum_restage_tier
+       } // migrate_object_to_minimum_restage_tier

-    std::string storage_tiering::make_partial_list(
+       std::string storage_tiering::make_partial_list(
             resource_index_map::iterator _itr,
             resource_index_map::iterator _end) {
         ++_itr; // skip source resource
@@ -878,12 +868,12 @@ namespace irods {

     } // apply_policy_for_tier_group

-    void storage_tiering::set_migration_metadata_flag_for_object(
-        rcComm_t*          _comm,
-        const std::string& _user_name,
-        const std::string& _user_zone,
-        const std::string& _object_path) {
-        auto access_time = get_metadata_for_data_object(
+       void storage_tiering::set_migration_metadata_flag_for_object(rcComm_t* _comm,
+                                                                    const std::string& _user_name,
+                                                                    const std::string& _user_zone,
+                                                                    const std::string& _object_path)
+       {
+               auto access_time = get_metadata_for_data_object(
                                _comm,
                                config_.access_time_attribute,
                                _object_path);
@@ -896,23 +886,22 @@ namespace irods {
            const_cast<char*>(access_time.c_str()),
            const_cast<char*>(config_.migration_scheduled_flag.c_str())};

-        auto status = exec_as_user(_comm, _user_name, _user_zone, [&set_op](auto comm) -> int {
-                            return rcModAVUMetadata(comm, &set_op);
-                            });
-        if(status < 0) {
+               auto status = exec_as_user(
+                       _comm, _user_name, _user_zone, [&set_op](auto comm) -> int { return rcModAVUMetadata(comm, &set_op); });
+               if(status < 0) {
            THROW(
                status,
                boost::format("failed to set migration scheduled flag for [%s]")
                % _object_path);
         }
-    } // set_migration_metadata_flag_for_object
+       } // set_migration_metadata_flag_for_object

-    void storage_tiering::unset_migration_metadata_flag_for_object(
-        rcComm_t*          _comm,
-        const std::string& _user_name,
-        const std::string& _user_zone,
-        const std::string& _object_path) {
-        auto access_time = get_metadata_for_data_object(
+       void storage_tiering::unset_migration_metadata_flag_for_object(rcComm_t* _comm,
+                                                                      const std::string& _user_name,
+                                                                      const std::string& _user_zone,
+                                                                      const std::string& _object_path)
+       {
+               auto access_time = get_metadata_for_data_object(
                                _comm,
                                config_.access_time_attribute,
                                _object_path);
@@ -924,24 +913,23 @@ namespace irods {
            const_cast<char*>(access_time.c_str()),
            nullptr};

-        const auto status = exec_as_user(_comm, _user_name, _user_zone, [&set_op](auto comm) -> int {
-                           return rcModAVUMetadata(comm, &set_op);
-                           });
-        if(status < 0) {
+               const auto status = exec_as_user(
+                       _comm, _user_name, _user_zone, [&set_op](auto comm) -> int { return rcModAVUMetadata(comm, &set_op); });
+               if(status < 0) {
             THROW(
                 status,
                 boost::format("failed to unset migration scheduled flag for [%s]")
                 % _object_path);
         }

-    } // unset_migration_metadata_flag_for_object
+       } // unset_migration_metadata_flag_for_object

-    bool storage_tiering::object_has_migration_metadata_flag(
-        rcComm_t*          _comm,
-        const std::string& _user_name,
-        const std::string& _user_zone,
-        const std::string& _object_path) {
-        boost::filesystem::path p{_object_path};
+       bool storage_tiering::object_has_migration_metadata_flag(rcComm_t* _comm,
+                                                                const std::string& _user_name,
+                                                                const std::string& _user_zone,
+                                                                const std::string& _object_path)
+       {
+               boost::filesystem::path p{_object_path};
         std::string coll_name = p.parent_path().string();
         std::string data_name = p.filename().string();

@@ -953,26 +941,26 @@ namespace irods {
                 % data_name
                 % coll_name) };

-        const auto status = exec_as_user(_comm, _user_name, _user_zone, [& query_str](auto& _comm) -> int {
-                            query<rcComm_t> qobj{_comm, query_str, 1};
-                            return qobj.size();
-                           });
-
-        return status > 0;
-
-    } // object_has_migration_metadata_flag
-
-    void storage_tiering::apply_tier_group_metadata_to_object(
-        const std::string& _group_name,
-        const std::string& _object_path,
-        const std::string& _user_name,
-        const std::string& _user_zone,
-        const std::string& _source_replica_number,
-        const std::string& _source_resource,
-        const std::string& _destination_resource) {
-        try {
-            unset_migration_metadata_flag_for_object(comm_, _user_name, _user_zone, _object_path);
-        }
+               const auto status = exec_as_user(_comm, _user_name, _user_zone, [&query_str](auto& _comm) -> int {
+                       query<rcComm_t> qobj{_comm, query_str, 1};
+                       return qobj.size();
+               });
+
+               return status > 0;
+
+       } // object_has_migration_metadata_flag
+
+       void storage_tiering::apply_tier_group_metadata_to_object(const std::string& _group_name,
+                                                                 const std::string& _object_path,
+                                                                 const std::string& _user_name,
+                                                                 const std::string& _user_zone,
+                                                                 const std::string& _source_replica_number,
+                                                                 const std::string& _source_resource,
+                                                                 const std::string& _destination_resource)
+       {
+               try {
+                       unset_migration_metadata_flag_for_object(comm_, _user_name, _user_zone, _object_path);
+               }
         catch(const exception&) {
         }

@@ -1005,7 +993,7 @@ namespace irods {
                 _e.what());
         }

-    } // apply_tier_group_metadata_to_object
+       } // apply_tier_group_metadata_to_object

 }; // namespace irods

diff --git a/storage_tiering.hpp b/storage_tiering.hpp
index 768a85f..6f1af1a 100644
--- a/storage_tiering.hpp
+++ b/storage_tiering.hpp
@@ -33,46 +33,40 @@ namespace irods {
         void apply_policy_for_tier_group(
             const std::string& _group);

-        void migrate_object_to_minimum_restage_tier(
-                 const std::string& _object_path,
-                 const std::string& _user_name,
-                 const std::string& _user_zone,
-                 const std::string& _source_resource);
+               void migrate_object_to_minimum_restage_tier(const std::string& _object_path,
+                                                           const std::string& _user_name,
+                                                           const std::string& _user_zone,
+                                                           const std::string& _source_resource);

-        void schedule_storage_tiering_policy(
+               void schedule_storage_tiering_policy(
             const std::string& _json,
             const std::string& _params);

-        void apply_tier_group_metadata_to_object(
-            const std::string& _group_name,
-            const std::string& _object_path,
-            const std::string& _user_name,
-            const std::string& _user_zone,
-            const std::string& _source_replica_number,
-            const std::string& _source_resource,
-            const std::string& _destination_resource);
+               void apply_tier_group_metadata_to_object(const std::string& _group_name,
+                                                        const std::string& _object_path,
+                                                        const std::string& _user_name,
+                                                        const std::string& _user_zone,
+                                                        const std::string& _source_replica_number,
+                                                        const std::string& _source_resource,
+                                                        const std::string& _destination_resource);

-        private:
+         private:
+               void set_migration_metadata_flag_for_object(rcComm_t* _comm,
+                                                           const std::string& _user_name,
+                                                           const std::string& _user_zone,
+                                                           const std::string& _object_path);

-        void set_migration_metadata_flag_for_object(
-            rcComm_t*          _comm,
-            const std::string& _user_name,
-            const std::string& _user_zone,
-            const std::string& _object_path);
-
-        void unset_migration_metadata_flag_for_object(
-            rcComm_t*          _comm,
-            const std::string& _user_name,
-            const std::string& _user_zone,
-            const std::string& _object_path);
+               void unset_migration_metadata_flag_for_object(rcComm_t* _comm,
+                                                             const std::string& _user_name,
+                                                             const std::string& _user_zone,
+                                                             const std::string& _object_path);

-        bool object_has_migration_metadata_flag(
-            rcComm_t*          _comm,
-            const std::string& _user_name,
-            const std::string& _user_zone,
-            const std::string& _object_path);
+               bool object_has_migration_metadata_flag(rcComm_t* _comm,
+                                                       const std::string& _user_name,
+                                                       const std::string& _user_zone,
+                                                       const std::string& _object_path);

-        bool skip_object_in_lower_tier(
+               bool skip_object_in_lower_tier(
             rcComm_t*          _comm,
             const std::string& _object_path,
             const std::string& _partial_list);
@@ -153,21 +147,20 @@ namespace irods {
             rcComm_t*          _comm,
             const std::string& _resource_name);

-        void queue_data_movement(
-            rcComm_t*          _comm,
-            const std::string& _plugin_instance_name,
-            const std::string& _group_name,
-            const std::string& _object_path,
-            const std::string& _user_name,
-            const std::string& _user_zone,
-            const std::string& _source_replica_number,
-            const std::string& _source_resource,
-            const std::string& _destination_resource,
-            const std::string& _verification_type,
-            const bool         _preserve_replicas,
-            const std::string& _data_movement_params);
-
-        void migrate_violating_data_objects(
+               void queue_data_movement(rcComm_t* _comm,
+                                        const std::string& _plugin_instance_name,
+                                        const std::string& _group_name,
+                                        const std::string& _object_path,
+                                        const std::string& _user_name,
+                                        const std::string& _user_zone,
+                                        const std::string& _source_replica_number,
+                                        const std::string& _source_resource,
+                                        const std::string& _destination_resource,
+                                        const std::string& _verification_type,
+                                        const bool _preserve_replicas,
+                                        const std::string& _data_movement_params);
+
+               void migrate_violating_data_objects(
             rcComm_t*           _comm,
             const std::string&  _group_name,
             const std::string&  _partial_list,

What to do?

trel commented 8 months ago

Excellent - thanks!

I don't think a full formatting sweep has been done on this project yet... @korydraughn? I'd say leave the clang-formatting for a separate PR.

korydraughn commented 8 months ago

Right. Ignore clang-format for now.

alanking commented 8 months ago

With these changes, I saw one test failure: test_plugin_unified_storage_tiering.TestStorageTieringMultipleQueries.test_put_and_get

We just need to adjust the specific query used by that test so that it also selects the owner's zone (R_DATA_MAIN.data_owner_zone):

diff --git a/packaging/test_plugin_unified_storage_tiering.py b/packaging/test_plugin_unified_storage_tiering.py
index cb09014..0acb6c3 100644
--- a/packaging/test_plugin_unified_storage_tiering.py
+++ b/packaging/test_plugin_unified_storage_tiering.py
@@ -929,7 +929,7 @@ class TestStorageTieringMultipleQueries(ResourceBase, unittest.TestCase):
         super(TestStorageTieringMultipleQueries, self).setUp()
         with session.make_session_for_existing_admin() as admin_session:
             admin_session.assert_icommand('iqdel -a')
-            admin_session.assert_icommand('''iadmin asq "select distinct R_DATA_MAIN.data_name, R_COLL_MAIN.coll_name, R_DATA_MAIN.data_owner_name, R_DATA_MAIN.data_repl_num from R_DATA_MAIN, R_COLL_MAIN, R_RESC_MAIN, R_OBJT_METAMAP r_data_metamap, R_META_MAIN r_data_meta_main where R_RESC_MAIN.resc_name = 'ufs0' AND r_data_meta_main.meta_attr_name = 'archive_object' AND r_data_meta_main.meta_attr_value = 'yes' AND R_COLL_MAIN.coll_id = R_DATA_MAIN.coll_id AND R_RESC_MAIN.resc_id = R_DATA_MAIN.resc_id AND R_DATA_MAIN.data_id = r_data_metamap.object_id AND r_data_metamap.meta_id = r_data_meta_main.meta_id order by R_COLL_MAIN.coll_name, R_DATA_MAIN.data_name" archive_query''')
+            admin_session.assert_icommand('''iadmin asq "select distinct R_DATA_MAIN.data_name, R_COLL_MAIN.coll_name, R_DATA_MAIN.data_owner_name, R_DATA_MAIN.data_owner_zone, R_DATA_MAIN.data_repl_num from R_DATA_MAIN, R_COLL_MAIN, R_RESC_MAIN, R_OBJT_METAMAP r_data_metamap, R_META_MAIN r_data_meta_main where R_RESC_MAIN.resc_name = 'ufs0' AND r_data_meta_main.meta_attr_name = 'archive_object' AND r_data_meta_main.meta_attr_value = 'yes' AND R_COLL_MAIN.coll_id = R_DATA_MAIN.coll_id AND R_RESC_MAIN.resc_id = R_DATA_MAIN.resc_id AND R_DATA_MAIN.data_id = r_data_metamap.object_id AND r_data_metamap.meta_id = r_data_meta_main.meta_id order by R_COLL_MAIN.coll_name, R_DATA_MAIN.data_name" archive_query''')

             admin_session.assert_icommand('iadmin mkresc ufs0 unixfilesystem '+test.settings.HOSTNAME_1 +':/tmp/irods/ufs0', 'STDOUT_SINGLELINE', 'unixfilesystem')
             admin_session.assert_icommand('iadmin mkresc ufs1 unixfilesystem '+test.settings.HOSTNAME_1 +':/tmp/irods/ufs1', 'STDOUT_SINGLELINE', 'unixfilesystem')
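
To illustrate why the extra column matters, here is a minimal sketch, assuming the result rows are read positionally in the SELECT order of the updated query (data name, collection name, owner name, owner zone, replica number). The names `ViolatingObject`, `parse_row`, `logical_path`, and `qualified_owner` are hypothetical and are not part of the plugin or the test suite; the sample path, user, and zone come from the log output earlier in this PR, and the replica number is only illustrative.

```python
from typing import NamedTuple, Sequence


class ViolatingObject(NamedTuple):
    """One result row, in the SELECT order of the updated archive_query."""
    data_name: str
    coll_name: str
    owner_name: str
    owner_zone: str
    repl_num: str


def parse_row(row: Sequence[str]) -> ViolatingObject:
    """Map a positional row onto named fields (hypothetical helper)."""
    expected = len(ViolatingObject._fields)
    if len(row) != expected:
        # A query that omits data_owner_zone yields one column too few.
        raise ValueError(f"expected {expected} columns, got {len(row)}")
    return ViolatingObject(*row)


def logical_path(obj: ViolatingObject) -> str:
    """Join collection and data name into the full logical path."""
    return f"{obj.coll_name}/{obj.data_name}"


def qualified_owner(obj: ViolatingObject) -> str:
    """Render the owner in user#zone form."""
    return f"{obj.owner_name}#{obj.owner_zone}"


if __name__ == "__main__":
    new_row = ["test_20231121_01.txt", "/frank/home/robertv#igor", "robertv", "igor", "0"]
    obj = parse_row(new_row)
    print(logical_path(obj))
    print(qualified_owner(obj))

    # The old query shape: no owner zone column.
    old_row = ["test_20231121_01.txt", "/frank/home/robertv#igor", "robertv", "0"]
    try:
        parse_row(old_row)
    except ValueError as err:
        print(f"old query shape rejected: {err}")
```

With only four columns, a positional reader would either fail as above or silently treat the replica number as the zone, which is presumably why the test's specific query needed the same change as the code.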

alanking commented 8 months ago

Confirmed via manual testing that a data object owned by a remote user tiered out appropriately with no problems.

Once we fix up that one automated test, I think this should be good to go (barring any other review comments). I didn't see any other concerning things in the changes.

Good work, @cookie33!

trel commented 8 months ago

Excellent.

alanking commented 8 months ago

Authorship is indeed being correctly attributed despite what it looks like. :) Merging. Thanks, all.