irods / irods_capability_storage_tiering

BSD 3-Clause "New" or "Revised" License

error in irods logfile when retrieving a file which is in two storage tiers #232

Closed cookie33 closed 9 months ago

cookie33 commented 10 months ago

BUG

VERSIONS

iRODS 4.3.1

Expected BEHAVIOUR

Retrieving a file that has replicas in multiple tiers of a tiered storage group should not produce errors in the iRODS log files.

Observed behaviour

The action causes errors in the logfiles, but data is retrieved.

Replication steps

We have two resources:

rods$ ilsresc -l eudatPnfs
resource name: eudatPnfs
id: 10003
zone: igor
type: unixfilesystem
location: irodstest2.storage.surfsara.nl
vault: /data/eudatPnfs
free space:
free space time: : Never
status:
info:
comment:
create time: 01541156260: 2018-11-02.11:57:40
modify time: 01688043710: 2023-06-29.15:01:50
context:
parent:
parent context:

The metadata of the resources is as follows:

rods$ imeta ls -R eudatCache
AVUs defined for resource eudatCache:
attribute: irods::storage_tiering::group
value: eudat
units: 0

attribute: irods::storage_tiering::preserve_replicas
value: true
units:

attribute: irods::storage_tiering::time
value: 120
units:

attribute: irods::storage_tiering::verification
value: filesystem
units:

rods$ imeta ls -R eudatPnfs
AVUs defined for resource eudatPnfs:
attribute: irods::storage_tiering::group
value: eudat
units: 1

attribute: irods::storage_tiering::verification
value: filesystem
units:
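For readers less familiar with the plugin, the two listings can be summarised as a small data model. This is only a sketch: it assumes the units field of the `irods::storage_tiering::group` AVU is the position of the resource within the group (which matches the values 0 and 1 shown above), and the `Tier` class and `tiers_in_order` helper are hypothetical names made up for this illustration, not part of iRODS or the plugin.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    """One resource of a tiering group, as described by its AVUs (hypothetical model)."""
    resource: str
    group: str
    index: int                            # units of irods::storage_tiering::group (assumed: tier position)
    time_seconds: Optional[int] = None    # irods::storage_tiering::time, if set
    preserve_replicas: bool = False       # irods::storage_tiering::preserve_replicas
    verification: Optional[str] = None    # irods::storage_tiering::verification, if set


# Values copied from the two imeta listings in this report.
TIERS = [
    Tier("eudatPnfs", "eudat", 1, verification="filesystem"),
    Tier("eudatCache", "eudat", 0, time_seconds=120,
         preserve_replicas=True, verification="filesystem"),
]


def tiers_in_order(tiers, group):
    """Return the tiers belonging to one group, sorted by their index."""
    return sorted((t for t in tiers if t.group == group), key=lambda t: t.index)


if __name__ == "__main__":
    for tier in tiers_in_order(TIERS, "eudat"):
        print(tier.index, tier.resource, tier.time_seconds, tier.preserve_replicas)
```

Read this way, eudatCache is the first tier with a 120 second threshold and `preserve_replicas` set, so a tiered object is expected to keep a replica on both resources, which is the situation in which the log errors appear.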


We put a file on eudatCache. After a while, the file has a replica on both resources (no errors):

rods$ iput -f /var/log/irods/irods.log test_20231208_60.txt -R eudatCache

rods$ ils -l test_20231208_60.txt
  rods 0 eudatCache 1931831 2023-12-08.10:33 & test_20231208_60.txt
  rods 1 eudatPnfs 1931831 2023-12-08.10:36 & test_20231208_60.txt

rods$ imeta ls -d test_20231208_60.txt
AVUs defined for dataObj /igor/home/rods/test_20231208_60.txt:
attribute: irods::access_time
value: 1702028217
units:

attribute: irods::storage_tiering::group
value: eudat
units: 1


We now try to retrieve the file from the tiered storage:

rods$ date ; iget test_20231208_60.txt /tmp/test_retrieve.txt -f
Fri Dec 8 10:43:25 CET 2023

rods$ ls -l /tmp/test_retrieve.txt
-rw-r----- 1 rods rods 1931831 Dec 8 10:43 /tmp/test_retrieve.txt

No error.

However, the iRODS log file shows:

{"log_category":"legacy","log_level":"info","log_message":"Failed to restage data object [/igor/home/rods/test_20231208_60.txt] for resource [eudatCache] Exception: [iRODS Exception:\n file: /home/robertv/git/irods_capability_storage_tiering/storage_tiering.cpp\n function: std::string irods::storage_tiering::get_group_name_by_replica_number(rcComm_t , const std::string &, const std::string &, const std::string &)\n line: 777\n code: -808000 (CAT_NO_ROWS_FOUND)\n message:\n failed to fetch group name by resource and replica number\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# irods::storage_tiering::migrate_object_to_minimum_restage_tier(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 4# exec_rule(std::1::tuple<>&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::list<boost::any, std::1::allocator >&, irods::callback) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 5# std::1::function::func<irods::error ()(std::1::tuple<>&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::list<boost::any, std::1::allocator >&, irods::callback), std::1::allocator<irods::error (*)(std::1::tuple<>&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::list<boost::any, std::1::allocator >&, irods::callback)>, irods::error (std::1::tuple<>&, std::1::basic_string<char, 
std::__1::char_traits, std::1::allocator > const&, std::1::list<boost::any, std::1::allocator >&, irods::callback)>::operator()(std::1::tuple<>&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::list<boost::any, std::1::allocator >&, irods::callback&&) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 6# irods::error irods::pluggable_rule_engine<std::1::tuple<> >::exec_rule<std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut*, BytesBuf>(std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::tuple<>&, std::__1::basic_string<char, std::1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*&&, portalOprOut*&&, BytesBuf&&, irods::callback) in /lib/libirods_server.so.4.3.1\n 7# std::1::function::func<irods::error irods::rule_engine_context_manager<std::1::tuple<>, RuleExecInfo*, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut*, BytesBuf>(std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*&&, portalOprOut*&&, BytesBuf&&)::'lambda'(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp*&&, portalOprOut*&&, BytesBuf&&)::operator()(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*&&, portalOprOut&&, BytesBuf&&) const::'lambda'(std::1::basic_string<char, std::__1::char_traits, std::1::allocator 
> const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&), std::__1::allocator<irods::error irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::'lambda'(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::operator()(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&) const::'lambda'(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)>, irods::error (std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)>::operator()(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, 
DataObjInp&&, portalOprOut&&, BytesBuf&&) in /lib/libirods_server.so.4.3.1\n 8# irods::error irods::dynamic_operation_execution_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)1>::call<std::1::function<irods::error (std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::__1::basic_string<char, std::1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*&&, portalOprOut*&&, BytesBuf&&)>, std::__1::basic_string<char, std::1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::function<irods::error (std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::__1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)>, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut*&&, BytesBuf&&)::'lambda'()::operator()() const in /lib/libirods_server.so.4.3.1\n 9# irods::error irods::dynamic_operation_execution_manager<std::1::tuple<>, RuleExecInfo*, (irods::rule_execution_manager_pack)1>::call<std::1::function<irods::error (std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*&&, portalOprOut&&, BytesBuf&&)>, std::1::basic_string<char, std::__1::char_traits, 
std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::function<irods::error (std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::__1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)>, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&) in /lib/libirods_server.so.4.3.1\n10# irods::error irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::'lambda'(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::operator()(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&) const in /lib/libirods_server.so.4.3.1\n11# 
irods::error irods::control<irods::error irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::'lambda'(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&), irods::error irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::'lambda'(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&), std::1::tuple<>, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp, portalOprOut, BytesBuf>(std::1::list<irods::re_pack_inp<std::1::tuple<> >, std::1::allocator<irods::re_pack_inp<std::1::tuple<> > > >&, irods::error irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, 
BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::'lambda'(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&), irods::error irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::'lambda'(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&), std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&) in /lib/libirods_server.so.4.3.1\n12# irods::error irods::api_entry::invoke_policy_enforcement_point<DataObjInp, portalOprOut*, BytesBuf>(irods::rule_engine_context_manager<std::1::tuple<>, RuleExecInfo*, (irods::rule_execution_manager_pack)0>, irods::plugin_context&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator > const&, DataObjInp*, portalOprOut, BytesBuf) in /lib/libirods_server.so.4.3.1\n13# int irods::api_entry::call_handler<DataObjInp, portalOprOut*, BytesBuf>(RsComm, 
DataObjInp, portalOprOut*, BytesBuf) in /lib/libirods_server.so.4.3.1\n14# rsApiHandler(RsComm, int, BytesBuf, BytesBuf) in /lib/libirods_server.so.4.3.1\n15# readAndProcClientMsg(RsComm, int) in /lib/libirods_server.so.4.3.1\n16# agentMain(RsComm*) in /lib/libirods_server.so.4.3.1\n17# runIrodsAgentFactory(sockaddr_un) in /lib/libirods_server.so.4.3.1\n18# main::$_5::operator()() const in /usr/sbin/irodsServer\n19# main in /usr/sbin/irodsServer\n20# __libc_start_main in /lib64/libc.so.6\n21# _start in /usr/sbin/irodsServer\n\n]","request_api_name":"DATA_OBJ_GET_AN","request_api_number":608,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.239","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest2.storage.surfsara.nl","server_pid":23487,"server_timestamp":"2023-12-08T09:43:26.266Z","server_type":"agent","server_zone":"igor"}

{"log_category":"legacy","log_level":"info","log_message":"Failed to restage data object [/igor/home/rods/test_20231208_60.txt] for resource [eudatCache] Exception: [iRODS Exception:\n file: /irods_plugin_source/storage_tiering.cpp\n function: std::string irods::storage_tiering::get_group_name_by_replica_number(rcComm_t , const std::string &, const std::string &, const std::string &)\n line: 774\n code: -808000 (CAT_NO_ROWS_FOUND)\n message:\n failed to fetch group name by resource and replica number\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# irods::storage_tiering::migrate_object_to_minimum_restage_tier(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 4# exec_rule(std::1::tuple<>&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::list<boost::any, std::1::allocator >&, irods::callback) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 5# std::1::function::func<irods::error ()(std::1::tuple<>&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::list<boost::any, std::1::allocator >&, irods::callback), std::1::allocator<irods::error (*)(std::1::tuple<>&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::list<boost::any, std::1::allocator >&, irods::callback)>, irods::error (std::1::tuple<>&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::list<boost::any, std::1::allocator >&, 
irods::callback)>::operator()(std::1::tuple<>&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::list<boost::any, std::1::allocator >&, irods::callback&&) in /usr/lib/irods/plugins/rule_engines/libirods_rule_engine_plugin-unified_storage_tiering.so\n 6# irods::error irods::pluggable_rule_engine<std::1::tuple<> >::exec_rule<std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut*, BytesBuf>(std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::tuple<>&, std::__1::basic_string<char, std::1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*&&, portalOprOut*&&, BytesBuf&&, irods::callback) in /lib/libirods_server.so.4.3.1\n 7# std::1::function::func<irods::error irods::rule_engine_context_manager<std::1::tuple<>, RuleExecInfo*, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut*, BytesBuf>(std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*&&, portalOprOut*&&, BytesBuf&&)::'lambda'(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp*&&, portalOprOut*&&, BytesBuf&&)::operator()(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*&&, portalOprOut&&, BytesBuf&&) const::'lambda'(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, 
std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&), std::__1::allocator<irods::error irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::'lambda'(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::operator()(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&) const::'lambda'(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)>, irods::error (std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)>::operator()(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&) in /lib/libirods_server.so.4.3.1\n 8# irods::error 
irods::dynamic_operation_execution_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)1>::call<std::1::function<irods::error (std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::__1::basic_string<char, std::1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*&&, portalOprOut*&&, BytesBuf&&)>, std::__1::basic_string<char, std::1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::function<irods::error (std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::__1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)>, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut*&&, BytesBuf&&)::'lambda'()::operator()() const in /lib/libirods_server.so.4.3.1\n 9# irods::error irods::dynamic_operation_execution_manager<std::1::tuple<>, RuleExecInfo*, (irods::rule_execution_manager_pack)1>::call<std::1::function<irods::error (std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*&&, portalOprOut&&, BytesBuf&&)>, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, 
std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::function<irods::error (std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::__1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)>, std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&) in /lib/libirods_server.so.4.3.1\n10# irods::error irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::'lambda'(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::operator()(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&) const in /lib/libirods_server.so.4.3.1\n11# irods::error irods::control<irods::error irods::rule_engine_context_manager<std::__1::tuple<>, 
RuleExecInfo, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::'lambda'(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&), irods::error irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::'lambda'(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&), std::1::tuple<>, std::1::basic_string<char, std::__1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp, portalOprOut, BytesBuf>(std::1::list<irods::re_pack_inp<std::1::tuple<> >, std::1::allocator<irods::re_pack_inp<std::1::tuple<> > > >&, irods::error irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, 
std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::'lambda'(irods::re_pack_inp<std::1::tuple<> >&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&), irods::error irods::rule_engine_context_manager<std::__1::tuple<>, RuleExecInfo, (irods::rule_execution_manager_pack)0>::exec_rule<std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp*, portalOprOut, BytesBuf>(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&)::'lambda'(std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&), std::1::basic_string<char, std::__1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::__1::char_traits, std::1::allocator >&, irods::plugin_context&, DataObjInp&&, portalOprOut&&, BytesBuf&&) in /lib/libirods_server.so.4.3.1\n12# irods::error irods::api_entry::invoke_policy_enforcement_point<DataObjInp, portalOprOut*, BytesBuf>(irods::rule_engine_context_manager<std::1::tuple<>, RuleExecInfo*, (irods::rule_execution_manager_pack)0>, irods::plugin_context&, std::1::basic_string<char, std::1::char_traits, std::1::allocator > const&, std::1::basic_string<char, std::1::char_traits, std::__1::allocator > const&, DataObjInp*, portalOprOut, BytesBuf) in /lib/libirods_server.so.4.3.1\n13# int irods::api_entry::call_handler<DataObjInp, portalOprOut*, BytesBuf>(RsComm, DataObjInp, portalOprOut*, BytesBuf) in /lib/libirods_server.so.4.3.1\n14# rsApiHandler(RsComm, 
int, BytesBuf, BytesBuf) in /lib/libirods_server.so.4.3.1\n15# readAndProcClientMsg(RsComm, int) in /lib/libirods_server.so.4.3.1\n16# agentMain(RsComm*) in /lib/libirods_server.so.4.3.1\n17# runIrodsAgentFactory(sockaddr_un) in /lib/libirods_server.so.4.3.1\n18# main::$_5::operator()() const in /usr/sbin/irodsServer\n19# main in /usr/sbin/irodsServer\n20# __libc_start_main in /lib64/libc.so.6\n21# _start in /usr/sbin/irodsServer\n\n]","request_api_name":"DATA_OBJ_GET_AN","request_api_number":608,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.239","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest2.storage.surfsara.nl","server_pid":27705,"server_timestamp":"2023-12-08T09:53:49.312Z","server_type":"agent","server_zone":"igor"}
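The stack traces make these entries hard to scan. A minimal sketch of a filter that reduces each restage failure to its key fields could look like the following. It assumes the server writes one JSON object per line, as the entries above suggest, and it relies only on field names visible in those entries. The function name `summarize_restage_failure` is hypothetical, and the sample entry in the script is an abbreviated copy of the first log entry with the stack trace cut short.

```python
import json
import re

# First sentence of the plugin's restage failure message, as seen in the log.
RESTAGE_FAILURE = re.compile(
    r"Failed to restage data object \[(?P<path>[^\]]+)\] "
    r"for resource \[(?P<resource>[^\]]+)\]"
)
# Error line of the embedded exception, e.g. "code: -808000 (CAT_NO_ROWS_FOUND)".
ERROR_CODE = re.compile(r"code: (?P<code>-?\d+) \((?P<name>\w+)\)")


def summarize_restage_failure(line):
    """Return a small dict for a restage failure entry, or None for anything else."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(entry, dict):
        return None
    message = entry.get("log_message", "")
    failure = RESTAGE_FAILURE.search(message)
    if failure is None:
        return None
    # Ignore everything from the stack trace onward.
    head = message.split("stack trace:", 1)[0]
    code = ERROR_CODE.search(head)
    return {
        "timestamp": entry.get("server_timestamp"),
        "pid": entry.get("server_pid"),
        "api": entry.get("request_api_name"),
        "path": failure.group("path"),
        "resource": failure.group("resource"),
        "error": code.group("name") if code else None,
        "code": int(code.group("code")) if code else None,
    }


if __name__ == "__main__":
    # Abbreviated copy of the first entry above; the stack trace is cut short.
    sample = json.dumps({
        "log_category": "legacy",
        "log_level": "info",
        "log_message": (
            "Failed to restage data object [/igor/home/rods/test_20231208_60.txt] "
            "for resource [eudatCache] Exception: [iRODS Exception:\n"
            " code: -808000 (CAT_NO_ROWS_FOUND)\n message:\n"
            " failed to fetch group name by resource and replica number\n"
            "stack trace:\n--------------\n]"
        ),
        "request_api_name": "DATA_OBJ_GET_AN",
        "server_pid": 23487,
        "server_timestamp": "2023-12-08T09:43:26.266Z",
    })
    print(summarize_restage_failure(sample))
```

Run over both entries, this should show the same object, resource and `CAT_NO_ROWS_FOUND` code in each, with only the PID and timestamp differing.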


This has been tested with:
* latest official release of irods-rule-engine-plugin-unified-storage-tiering.x86_64
* unreleased version of irods-rule-engine-plugin-unified-storage-tiering.x86_64 with the changes for remote users/zones.

If we remove the replica that is stored on the cache resource, the retrieval works without an error:

rods$ ils -l test_20231208_60.txt
  rods 0 eudatCache 1931831 2023-12-08.10:33 & test_20231208_60.txt
  rods 1 eudatPnfs 1931831 2023-12-08.10:36 & test_20231208_60.txt

rods$ itrim -N1 -n 0 test_20231208_60.txt
Total size trimmed = 1.842 MB. Number of files trimmed = 1.

rods$ date ; iget test_20231208_60.txt /tmp/test_retrieve_3.txt -f
Fri Dec 8 10:58:57 CET 2023

The logfile shows:

{"log_category":"legacy","log_level":"info","log_message":"irods::storage_tiering migrating [/igor/home/rods/test_20231208_60.txt] from [eudatPnfs] to [eudatCache]","request_api_name":"DATA_OBJ_GET_AN","request_api_number":608,"request_api_version":"d","request_client_user":"rods","request_host":"145.100.3.239","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"irodstest2.storage.surfsara.nl","server_pid":29206,"server_timestamp":"2023-12-08T09:58:57.605Z","server_type":"agent","server_zone":"igor"}



What are we doing wrong?

alanking commented 10 months ago

I was trying to reproduce this issue and was not able to. Here are some other things to try and to think about...

What is the ils -l output after iget completes in the case where it was trimmed from the cache resource? Did it get restaged?

Can you list the metadata attached to that data object at various points of the process? e.g. imeta ls -d test_20231208_60.txt

What happens when you try to target a specific replica with iget? Use iget -n0 and iget -n1 and see what kind of results you get.

Does your policy implementation include targeting a specific resource over another for gets/reads? In other words, is there a preference for eudatCache over eudatPnfs, or vice versa, when using iget with no arguments as you have done above?

cookie33 commented 9 months ago

The real error is as follows:

failed to fetch group name by resource and replica number

It normally does a query in the function storage_tiering::get_group_name_by_replica_number:

boost::format("SELECT META_DATA_ATTR_VALUE WHERE DATA_NAME = '%s' AND COLL_NAME = '%s' AND META_DATA_ATTR_NAME = '%s' AND META_DATA_ATTR_UNITS = '%s'")

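Here the units value in the query is probably 0, the replica number of the copy in eudatCache.

But it needs to use 1, the units of the group AVU on the data object. The lookup also succeeds if the query is updated to: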

boost::format("SELECT META_DATA_ATTR_VALUE WHERE DATA_NAME = '%s' AND COLL_NAME = '%s' AND META_DATA_ATTR_NAME = '%s' AND META_DATA_ATTR_UNITS >= '%s'")

>= instead of =

or remove the META_DATA_ATTR_UNITS from the query.
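
To illustrate why the equality filter fails here, below is a minimal sketch in Python. It is a toy model of the lookup, not the plugin's code: the function names are made up for this example, and the AVU list is copied from the `imeta ls -d` output earlier in this report. It assumes the units passed to the query is the number of the replica being read.

```python
# Toy model of the group-name lookup; this is NOT the plugin's actual code.
GROUP_ATTR = "irods::storage_tiering::group"

# AVUs on the data object as (attribute, value, units), taken from the
# `imeta ls -d` output shown earlier in this report.
avus = [
    ("irods::access_time", "1702028217", ""),
    (GROUP_ATTR, "eudat", "1"),
]


def group_by_replica_number(avus, replica_number):
    """Mimic the current query: units must equal the replica number."""
    return [value for (attr, value, units) in avus
            if attr == GROUP_ATTR and units == str(replica_number)]


def group_without_units_filter(avus):
    """Mimic the proposed query: the units condition is dropped."""
    return [value for (attr, value, _units) in avus if attr == GROUP_ATTR]


print(group_by_replica_number(avus, 0))   # replica 0 (eudatCache): no match
print(group_by_replica_number(avus, 1))   # replica 1 (eudatPnfs): match
print(group_without_units_filter(avus))   # match regardless of replica
```

Under these assumptions, a read that resolves to replica 0 finds no group, which matches the "failed to fetch group name by resource and replica number" error, while dropping the units condition finds the group for either replica.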

cookie33 commented 9 months ago

Preferred resources set in our own rule files (evaluated before core.re):

# set the default resource to eudat
acSetRescSchemeForCreate {
        on ($objPath like "/igor/eudat/*") {
            msiSetDefaultResc("eudatCache","forced");
        }
}

acSetRescSchemeForRepl {
        on ($objPath like "/igor/eudat/*") {
            msiSetDefaultResc("eudatCache","forced");
        }
}

acSetRescSchemeForCreate {msiSetDefaultResc("eudatCache","preferred"); }
acSetRescSchemeForRepl {msiSetDefaultResc("eudatCache","preferred"); }

And in core.re it states:

# grep SetResc /etc/irods/core.re | grep -v ^#
acSetRescSchemeForCreate {msiSetDefaultResc("demoResc","null"); }
acSetRescSchemeForRepl {msiSetDefaultResc("demoResc","null"); }
acRescQuotaPolicy {msiSetRescQuotaPolicy("off"); }

alanking commented 9 months ago

Okay, I think this is hitting a couple of different issues which happen to be philosophical problems for the storage tiering plugin. Congratulations on hitting the jackpot! :p

  1. I think removing the META_DATA_ATTR_UNITS filter is the right answer. Even though the replica numbers never "go down", I think checking for an ordered-ness using >= might lead to problems and doesn't accomplish anything more than just removing the META_DATA_ATTR_UNITS filter. This exposes an important limitation in this plugin, however: Data objects will only be able to have replicas in a single tiering group. We will need to document this limitation in the README.

  2. This fix would support the notion that accessing preserved replicas should trigger a restage (not just the replica being "tracked" in the metadata).
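
One way to read the single-group limitation in point 1 is sketched below in Python. This is an illustrative model only: `resolve_group` and the group name `other_group` are hypothetical, and the real plugin may handle this case differently. The idea is that once the units condition is gone, every group AVU on the data object matches, so a data object carrying AVUs for two groups can no longer be resolved to one group.

```python
# Illustrative model only; names below are hypothetical, not plugin API.
GROUP_ATTR = "irods::storage_tiering::group"


def groups_for_object(avus):
    """Group lookup with no units filter: every group AVU matches."""
    return sorted({value for (attr, value, _units) in avus
                   if attr == GROUP_ATTR})


def resolve_group(avus):
    """Return the single tiering group, or raise if it is ambiguous."""
    groups = groups_for_object(avus)
    if len(groups) != 1:
        raise LookupError(f"expected exactly one tiering group, found {groups}")
    return groups[0]


# A data object with replicas in one group resolves cleanly.
one_group = [(GROUP_ATTR, "eudat", "1")]

# A data object with replicas in two groups ("other_group" is made up
# for this example) cannot be resolved without the units condition.
two_groups = [(GROUP_ATTR, "eudat", "1"), (GROUP_ATTR, "other_group", "3")]

print(resolve_group(one_group))
try:
    resolve_group(two_groups)
except LookupError as err:
    print("ambiguous:", err)
```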

alanking commented 9 months ago

Because the underlying issue here is a failure to get the group name for a restage, we can set up a test like this:

  1. Three tiers: 0, 1, and 2
  2. Preserve replicas on tier 1
  3. Put a data object into tier 0 and tier out to tier 2 (the "tracked" replica should now be the one in tier 2)
  4. Run iget on the replica in tier 1
  5. The restage to tier 0 should occur successfully

With the bug described here, I would expect step 5 to fail because it wouldn't be able to identify the group for the replica in tier 1. With the proposed fix (removing the META_DATA_ATTR_UNITS filter), the group could be identified regardless of which replica is being "tracked".
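
The test steps above can be modelled roughly as follows. This Python sketch is a simplified simulation, not a test against a real iRODS server. It assumes that replica numbers only increase, that the group AVU's units always holds the number of the newest ("tracked") replica, and that the group is called `example_group`; all class and function names are hypothetical.

```python
# Simplified simulation of the proposed test; not real iRODS behaviour.
class DataObject:
    def __init__(self):
        self.replicas = {}      # replica number -> tier index
        self.group_avu = None   # (group name, units) for the tracked replica
        self._next_number = 0

    def add_replica(self, tier, group):
        """Create a replica in the given tier and track it in the AVU."""
        number = self._next_number
        self._next_number += 1
        self.replicas[number] = tier
        self.group_avu = (group, str(number))
        return number


def tier_out(obj, src_replica, dst_tier, group, preserve):
    """Move data to the next tier; keep the source only if preserved."""
    new_replica = obj.add_replica(dst_tier, group)
    if not preserve:
        del obj.replicas[src_replica]
    return new_replica


def lookup_group(obj, replica_number, filter_on_units):
    """Return the group name, or None if the units filter excludes it."""
    group, units = obj.group_avu
    if filter_on_units and units != str(replica_number):
        return None
    return group


obj = DataObject()
r0 = obj.add_replica(0, "example_group")                     # put into tier 0
r1 = tier_out(obj, r0, 1, "example_group", preserve=False)   # tier 0 -> tier 1
r2 = tier_out(obj, r1, 2, "example_group", preserve=True)    # tier 1 -> tier 2

# Reading the preserved replica in tier 1 (step 4):
print(lookup_group(obj, r1, filter_on_units=True))    # current query
print(lookup_group(obj, r1, filter_on_units=False))   # proposed query
```

In this model the tracked replica is the one in tier 2, so the lookup for the tier 1 replica returns nothing with the units filter and returns the group without it, which is the difference step 5 is meant to detect.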