Open DOC-MEX opened 3 months ago
You have only one machine? irods-development-2024:1247
?
I wonder if you've found an inconsistency with core.py.template
and the default core.re
...
We'll try to reproduce.
I have test it successfully on another machine with Ubuntu 20, this is the first time I try on an Ubuntu 22.
I did change update core.py since this time I have used a different name for my resource, in core.re I have this:
acSetRescSchemeForCreate {msiSetDefaultResc("grassrootsResc","null"); }
So in core.py I also used that name on this function:
def acSetRescSchemeForCreate(rule_args, callback, rei):
with open('../../etc/irods/server_config.json') as f:
server_config_dict = json.load(f)
callback.msiSetDefaultResc(server_config_dict.get('default_resource_name', 'grassrootsResc'), 'null')
I also used a simpler core.py from the beginners tutorial which only contains this:
#core.py
import sys
import exifread
from genquery import *
def pep_api_data_obj_put_post(rule_args, callback, rei):
import os
data_obj_inp = rule_args[2]
obj_path = str(data_obj_inp.objPath)
resc_hier = str(data_obj_inp.condInput['resc_hier'])
query_condition_string = f'COLL_NAME = \'{os.path.dirname(obj_path)}\' and ' \
f'DATA_NAME = \'{os.path.basename(obj_path)}\' and ' \
f'DATA_RESC_HIER = \'{resc_hier}\''
phypath = list(Query(callback, 'DATA_PATH', query_condition_string))[0]
exiflist = []
with open(phypath, 'rb') as f:
tags = exifread.process_file(f, details=False)
for (k, v) in tags.items():
if k not in ('JPEGThumbnail', 'TIFFThumbnail', 'Filename', 'EXIF MakerNote'):
exifpair = '{0}={1}'.format(k, v)
exiflist.append(exifpair)
exifstring = '%'.join(exiflist)
callback.add_metadata_to_objpath(exifstring, obj_path, '-d')
callback.writeLine('serverLog', 'PYTHON - pep_api_data_obj_put_post() complete')
But It also triggers the same error.
I was able to stand up 4.3.1 with PREP and core.py.template as core.py without issue.
You can look at $ iadmin get_delay_server_info
and $ iadmin h set_delay_server
to see if those are somehow related.
"Agent process [1967232] exited with status [11]."
Please confirm that your core.py file is based on the /etc/irods/core.py.template
shipped with the plugin package OR that you have the irods_rule_engine_plugin-cpp_default_policy
REP configured.
So, you will either need an item that looks like this in your configured rule engine plugins:
{
"instance_name": "irods_rule_engine_plugin-cpp_default_policy-instance",
"plugin_name": "irods_rule_engine_plugin-cpp_default_policy",
"plugin_specific_configuration": {}
}
...or, you will need to ensure that every required static PEP is implemented in core.py
.
Some background: The irods_rule_engine_plugin-cpp_default_policy
REP (or, "cpp backstop" as we call it) makes sure that all of the static PEPs used internally in the server are implemented. If some static PEP is not implemented in any of the configured rulebases, you can see a signal 11.
Yes, I am testing with the core.py.template that came with the installation of the plugin (everything I have installed has been using apt-get) as my core.py without any changes. The moment I add the stanza to the server_config file
{
"instance_name": "irods_rule_engine_plugin-python-instance",
"plugin_name": "irods_rule_engine_plugin-python",
"plugin_specific_configuration": {}
}
But that immediately triggers an exit with status 11 and SYS_HEADER_READ_LEN_ERR error.
This is the output of _get_delay_serverinfo and _iadmin h set_delayserver and the error logs.
ienv
irods_version - 4.3.1
irods_session_environment_file - /home/olverad/.irods/irods_environment.json.2050314
irods_environment_file - /home/olverad/.irods/irods_environment.json
irods_host - irods-development-2024
irods_port - 1247
irods_zone_name - zone_C
irods_user_name - test
olverad@irods-development-2024:/etc/irods$ iadmin get_delay_server_info
{
"leader": "irods-development-2024",
"successor": ""
}
olverad@irods-development-2024:/etc/irods$ iadmin h set_delay_server
set_delay_server HOSTNAME
Set the delay server for the local zone in R_GRID_CONFIGURATION.
The hostname entered will be saved as the 'successor'.
Each iRODS server will periodically check the catalog to determine
if it should promote itself to be the delay server for the local zone.
This mechanism allows for graceful delay server migration without downtime.
Error logs:
{"log_category":"legacy","log_level":"error","log_message":"_rcConnect: connectToRhost error, server on irods-development-2024:1247 is probably down status = -**4000 SYS_HEADER_READ_LEN_ERR**","server_host":"irods-development-2024","server_pid":2051529,"server_timestamp":"2024-04-11T07:29:11.191Z","server_type":"server","server_zone":"zone_C"}
{"log_category":"agent_factory","log_level":"error","log_message":"Agent process [2051627] **exited with status [11].**","server_host":"irods-development-2024","server_pid":2051530,"server_timestamp":"2024-04-11T07:29:11.211Z","server_type":"agent_factory","server_zone":"zone_C"}
{"log_category":"server","log_level":"error","log_message":"Caught exception in migrate_delay_server(): iRODS Exception:\n file: /irods_source/lib/core/src/client_connection.cpp\n function: void irods::experimental::client_connection::only_connect(const std::string &, const int, const irods::experimental::fully_qualified_username &)\n line: 188\n code: -305000 (USER_SOCK_CONNECT_ERR)\n message:\n Connect error\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# std::__1::__function::__func<initServerMain(RsComm*, bool, bool)::$_13, std::__1::allocator<initServerMain(RsComm*, bool, bool)::$_13>, void ()>::operator()() at rodsServer.cpp:?\n 4# irods::experimental::cron::cron_task::operator()() in /usr/sbin/irodsServer\n 5# irods::experimental::cron::cron::run() in /usr/sbin/irodsServer\n 6# void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, main::$_10> >(void*) at rodsServer.cpp:?\n 7# 0x00007FF50E2B8AC3 in /lib/x86_64-linux-gnu/libc.so.6\n 8# 0x00007FF50E34A850 in /lib/x86_64-linux-gnu/libc.so.6\n\n","server_host":"irods-development-2024","server_pid":2051529,"server_timestamp":"2024-04-11T07:29:11.269Z","server_type":"server","server_zone":"zone_C"}
I have reinstalled irods on an Ubuntu 20 machine. This time I have used all the default values (zone and resource names, etc) But now I am having the same problem when trying to enable the python rule engine! The moment I add it the the server_config file, I get this errors:
{"log_category":"legacy","log_level":"error","log_message":"_rcConnect: connectToRhost error, server on grassroots-frontend-test:1247 is probably down status = -4000 SYS_HEADER_READ_LEN_ERR","server_host":"grassroots-frontend-test","server_pid":3262840,"server_timestamp":"2024-04-12T15:45:57.460Z","server_type":"server","server_zone":"tempZone"}
{"log_category":"agent_factory","log_level":"error","log_message":"Agent process [3263253] exited with status [11].","server_host":"grassroots-frontend-test","server_pid":3262841,"server_timestamp":"2024-04-12T15:45:57.469Z","server_type":"agent_factory","server_zone":"tempZone"}
{"log_category":"server","log_level":"error","log_message":"Caught exception in migrate_delay_server(): iRODS Exception:\n file: /irods_source/lib/core/src/client_connection.cpp\n function: void irods::experimental::client_connection::only_connect(const std::string &, const int, const irods::experimental::fully_qualified_username &)\n line: 188\n code: -305000 (USER_SOCK_CONNECT_ERR)\n message:\n Connect error\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# std::1::function::func<initServerMain(RsComm, bool, bool)::$_13, std::__1::allocator<initServerMain(RsComm, bool, bool)::$_13>, void ()>::operator()() at rodsServer.cpp:?\n 4# irods::experimental::cron::cron_task::operator()() in /usr/sbin/irodsServer\n 5# irods::experimental::cron::cron::run() in /usr/sbin/irodsServer\n 6# void* std::1::thread_proxy<std::1::tuple<std::1::unique_ptr<std::1::thread_struct, std::1::default_delete
and this when I try to ils command:
remote addresses: 127.0.1.1 ERROR: [-] /irods_source/lib/core/src/sockComm.cpp:863:int connectToRhost(rcComm_t *, int, int) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [failed to call 'read header'] [-] /irods_source/lib/core/src/sockComm.cpp:611:irods::error readVersion(irods::network_object_ptr, version_t *) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [failed to call 'read header'] [-] /irods_source/lib/core/src/sockComm.cpp:229:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t , struct timeval ) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [failed to call 'read header'] [-] /irods_source/plugins/network/src/tcp.cpp:190:irods::error tcp_read_msg_header(irods::plugin_context &, void , struct timeval *) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [only read [0] of [4]]
ERROR: _rcConnect: connectToRhost error, server on grassroots-frontend-test:1247 is probably down status = -4000 SYS_HEADER_READ_LEN_ERR
And when I remove it from the server_config file, everything is back to normal:
sudo su irods -c "/var/lib/irods/irodsctl status" irodsServer : Process 3262840 Process 3262841 irodsDelayServer : Process 3262858 grassroots-frontend-test:~$ ienv irods_version - 4.3.1 irods_session_environment_file - /home/olverad/.irods/irods_environment.json.3262346 irods_environment_file - /home/olverad/.irods/irods_environment.json irods_host - grassroots-frontend-test irods_port - 1247 irods_zone_name - tempZone irods_user_name - rods
grassroots-frontend-test:~$ ils /tempZone/home/rods:
I have no idea where to start looking for a possible problem, I thought it was because I was using Ubuntu 22, but now it also fails in Ubuntu 20.
@trel could you confirm what are the basics steps just for enabling the python rule engine on a fresh IRODS installation?
That should be all, correct? I have reinstalled irods 4.3.1 in an Ubuntu 20 machine, with all the default values, and as you can see it works perfectly fine.
sudo su irods -c "/var/lib/irods/irodsctl status" irodsServer : Process 2736 Process 2737 irodsDelayServer : Process 2754
olverac@opendata-20:~$ hostname opendata-20
olverac@opendata-20:~$ iadmin get_delay_server_info { "leader": "opendata-20", "successor": "" }
olverac@opendata-20:~$ ienv irods_version - 4.3.1 irods_session_environment_file - /home/olverac/.irods/irods_environment.json.17899 irods_environment_file - /home/olverac/.irods/irods_environment.json irods_host - opendata-20 irods_port - 1247 irods_zone_name - tempZone irods_user_name - rods
However, the moment I add the stanza to the config file: rule_engines": [ { "instance_name": "irods_rule_engine_plugin-python-instance", "plugin_name": "irods_rule_engine_plugin-python", "plugin_specific_configuration": {} }, {
It triggers this error:
{"log_category":"legacy","log_level":"error","log_message":"_rcConnect: connectToRhost error, server on opendata-20:1247 is probably down status = -4000 SYS_HEADER_READ_LEN_ERR","server_host":"opendata-20","server_pid":2736,"server_timestamp":"2024-04-17T13:27:18.618Z","server_type":"server","server_zone":"tempZone"}
{"log_category":"agent_factory","log_level":"error","log_message":"Agent process [18273] exited with status [11].","server_host":"opendata-20","server_pid":2737,"server_timestamp":"2024-04-17T13:27:18.638Z","server_type":"agent_factory","server_zone":"tempZone"}
{"log_category":"server","log_level":"error","log_message":"Caught exception in migrate_delay_server(): iRODS Exception:\n file: /irods_source/lib/core/src/client_connection.cpp\n function: void irods::experimental::client_connection::only_connect(const std::string &, const int, const irods::experimental::fully_qualified_username &)\n line: 188\n code: -305000 (USER_SOCK_CONNECT_ERR)\n message:\n Connect error\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# 0x000000000044C167 in /usr/sbin/irodsServer\n 4# irods::experimental::cron::cron_task::operator()() in /usr/sbin/irodsServer\n 5# irods::experimental::cron::cron::run() in /usr/sbin/irodsServer\n 6# 0x000000000044AF21 in /usr/sbin/irodsServer\n 7# 0x00007F582755C609 in /lib/x86_64-linux-gnu/libpthread.so.0\n 8# clone in /lib/x86_64-linux-gnu/libc.so.6\n\n","server_host":"opendata-20","server_pid":2736,"server_timestamp":"2024-04-17T13:27:18.647Z","server_type":"server","server_zone":"tempZone"}
What other configuration might I be missing?
Using the irods_testing_environment, I was able to stand up a new iRODS 4.3.1 server with the python REP and an empty core.py file in an Ubuntu 22.04 container without issue. The logs are quiet.
Please post the following.
rule_engines
stanza of server_config.json/etc/irods/server_config.json
/var/lib/irods/.irods/irods_environment.json
hostname
in the terminalIf the config files contain sensitive information, please replace it or mask it.
Hi Kory here are the requested files:
server_config.json irods_environment.json error_logs.json
the value returned by hostname in the terminal is opendata-20
Another question, why I am able to use irods if I get this output:
sudo systemctl status irods ● irods.service - LSB: Manages the iRODS service Loaded: loaded (/etc/init.d/irods; generated) Active: failed (Result: exit-code) since Thu 2024-04-18 11:57:11 UTC; 48s ago Docs: man:systemd-sysv-generator(8) Process: 103984 ExecStart=/etc/init.d/irods start (code=exited, status=1/FAILURE)
Apr 18 11:57:10 opendata-20 systemd[1]: Starting LSB: Manages the iRODS service... Apr 18 11:57:10 opendata-20 su[103998]: (to irods) root on none Apr 18 11:57:10 opendata-20 su[103998]: pam_unix(su-l:session): session opened for user irods by (uid> Apr 18 11:57:11 opendata-20 su[103998]: pam_unix(su-l:session): session closed for user irods Apr 18 11:57:11 opendata-20 systemd[1]: irods.service: Control process exited, code=exited, status=1/> Apr 18 11:57:11 opendata-20 systemd[1]: irods.service: Failed with result 'exit-code'. Apr 18 11:57:11 opendata-20 systemd[1]: Failed to start LSB: Manages the iRODS service.
but this one seems OK sudo su irods -c "/var/lib/irods/irodsctl status" irodsServer : Process 2736 Process 2737 irodsDelayServer : Process 2754
but I cannot restart the irods service: sudo systemctl start irods Job for irods.service failed because the control process exited with error code.
iRODS doesn't ship an official systemd unit file yet. I recommend using the irodsctl script directly. You can't go wrong or get confused using that.
The log file only contains the error that you reported in this issue. Can you include the activity beginning with the initial launch of the server upto the failures (with and without the python REP)? I want to see everything leading up to (and including) the errors.
Here are the logs when I stop irodsctl, then start it and when I add the python rule engine stanza
The irods_environment.json file you provided is missing several pieces of information. Is that the file used by the iRODS service account (i.e. /var/lib/irods/.irods/irods_environment.json)?
Also, please list the contents of the /etc/irods directory and /etc/irods/core.py.
I don't see the python REP listed in the server_config.json file you posted. Please post the server_config.json which causes the error.
Have you tried standing up things on a different computer? VM? Container?
sorry I gave you the wrong irods_environment.json, this is the one!
sudo cat .irods/irods_environment.json { "irods_client_server_negotiation": "request_server_negotiation", "irods_client_server_policy": "CS_NEG_REFUSE", "irods_connection_pool_refresh_time_in_seconds": 300, "irods_cwd": "/tempZone/home/rods", "irods_default_hash_scheme": "SHA256", "irods_default_number_of_transfer_threads": 4, "irods_default_resource": "demoResc", "irods_encryption_algorithm": "AES-256-CBC", "irods_encryption_key_size": 32, "irods_encryption_num_hash_rounds": 16, "irods_encryption_salt_size": 8, "irods_home": "/tempZone/home/rods", "irods_host": "opendata-20", "irods_match_hash_policy": "compatible", "irods_maximum_size_for_single_buffer_in_megabytes": 32, "irods_port": 1247, "irods_server_control_plane_encryption_algorithm": "AES-256-CBC", "irods_server_control_plane_encryption_num_hash_rounds": 16, "irods_server_control_plane_key": "32_byte_server_control_plane_key", "irods_server_control_plane_port": 1248, "irods_transfer_buffer_size_for_parallel_transfer_in_megabytes": 4, "irods_user_name": "rods", "irods_zone_name": "tempZone", "schema_name": "service_account_environment", "schema_version": "v4" }
The stanza I add in server_config.json that triggers the error is this one
"rule_engines": [ { "instance_name": "irods_rule_engine_plugin-python-instance", "plugin_name": "irods_rule_engine_plugin-python", "plugin_specific_configuration": {} }, {
the content of /etc/irods : /etc/irods$ ls core.dvm core.py core.re server_config.json session_vars.py core.fnm core.py.template genquery.py service_account.config
and this is the core.py file I am using:
import datetime
import json
def writeStringToCharArray(s, char_array):
for i in range(0, len(s)):
char_array[i] = s[i]
def pythonRuleEnginePluginTest(rule_args, callback, rei):
with open('/tmp/from_core_py.txt', 'a') as f:
f.write(str(datetime.datetime.now()))
f.write('\n')
c = 0
for arg in rule_args:
f.write('\t')
f.write(str(c))
f.write(' : ')
f.write(str(arg))
f.write('\n')
c = c +1
callback.writeLine('serverLog', 'Printed to server log from python rule engine')
def acPreConnect(rule_args, callback, rei):
rule_args[0] = 'CS_NEG_REFUSE'
def acCreateUser(rule_args, callback, rei):
ret = callback.msiCreateUser()
if not ret['status']:
callback.msiRollback()
return ret
ret = acCreateDefaultCollections(rule_args, callback, rei)
if not ret['status']:
callback.msiRollback()
return ret
ret = callback.msiAddUserToGroup('public')
if not ret['status']:
callback.msiRollback()
return ret
callback.msiCommit()
def acCreateDefaultCollections(rule_args, callback, rei):
ret = acCreateUserZoneCollections(rule_args, callback, rei)
return ret
def acCreateUserZoneCollections(rule_args, callback, rei):
rodsZoneProxy = str(rei.uoip.rodsZone)
otherUserName = str(rei.uoio.userName)
homeStr = '/'.join(['', rodsZoneProxy, 'home'])
trashStr = '/'.join(['', rodsZoneProxy, 'trash', 'home'])
homeDict = [homeStr, otherUserName]
trashDict = [trashStr, otherUserName]
ret = acCreateCollByAdmin(homeDict, callback, rei)
ret = acCreateCollByAdmin(trashDict, callback, rei)
return ret
def acCreateCollByAdmin(rule_args, callback, rei):
ret = callback.msiCreateCollByAdmin(rule_args[0], rule_args[1])
return ret
def acDeleteUser(rule_args, callback, rei):
ret = callback.acDeleteDefaultCollections()
if not ret['status']:
callback.msiRollback()
return ret
ret = callback.msiDeleteUser()
if not ret['status']:
callback.msiRollback()
return ret
callback.msiCommit()
def acDeleteDefaultCollections(rule_args, callback, rei):
return acDeleteUserZoneCollections(rule_args, callback, rei)
def acDeleteUserZoneCollections(rule_args, callback, rei):
rodsZoneProxy = str(rei.uoip.rodsZone)
otherUserName = str(rei.uoio.userName)
homeStr = '/'.join(['', rodsZoneProxy, 'home'])
trashStr = '/'.join(['', rodsZoneProxy, 'trash', 'home'])
homeDict = [homeStr, otherUserName]
trashDict = [trashStr, otherUserName]
ret = acDeleteCollByAdminIfPresent(homeDict, callback, rei)
ret = acDeleteCollByAdminIfPresent(trashDict, callback, rei)
return ret
def acDeleteCollByAdminIfPresent(rule_args, callback, rei):
ret = callback.msiDeleteCollByAdmin(rule_args[0], rule_args[1])
if not ret['status']:
if not ret['code'] == -808000:
callback.failmsg(ret['code'], 'error deleting collection')
return ret
def acDeleteCollByAdmin(rule_args, callback, rei):
callback.msiDeleteCollByAdmin(rule_args[0], rule_args[1])
def acRenameLocalZone(rule_args, callback, rei):
coll_name = '/' + rule_args[0]
ret = callback.msiRenameCollection(coll_name, rule_args[1])
if not ret['status']:
callback.msiRollback()
return ret
ret = callback.msiRenameLocalZone(rule_args[0], rule_args[1])
if not ret['status']:
callback.msiRollback()
return ret
callback.msiCommit()
def acGetUserByDN(rule_args, callback, rei):
pass
def acAclPolicy(rule_args, callback, rei):
callback.msiAclPolicy('STRICT')
def acTicketPolicy(rule_args, callback, rei):
pass
def acCheckPasswordStrength(rule_args, callback, rei):
pass
def acSetRescSchemeForCreate(rule_args, callback, rei):
with open('../../etc/irods/server_config.json') as f:
server_config_dict = json.load(f)
callback.msiSetDefaultResc(server_config_dict.get('default_resource_name', 'demoResc'), 'null')
def acSetRescSchemeForRepl(rule_args, callback, rei):
with open('../../etc/irods/server_config.json') as f:
server_config_dict = json.load(f)
callback.msiSetDefaultResc(server_config_dict.get('default_resource_name', 'demoResc'), 'null')
def acPreprocForDataObjOpen(rule_args, callback, rei):
pass
def acSetMultiReplPerResc(rule_args, callback, rei):
pass
def acPostProcForPut(rule_args, callback, rei):
#pythonRuleEnginePluginTest(rule_args, callback, rei)
pass
def acPostProcForCopy(rule_args, callback, rei):
pass
def acPostProcForFilePathReg(rule_args, callback, rei):
pass
def acPostProcForCreate(rule_args, callback, rei):
pass
def acPostProcForOpen(rule_args, callback, rei):
pass
def acPostProcForPhymv(rule_args, callback, rei):
pass
def acPostProcForRepl(rule_args, callback, rei):
pass
def acSetNumThreads(rule_args, callback, rei):
callback.msiSetNumThreads('default', 'default', 'default')
def acDataDeletePolicy(rule_args, callback, rei):
pass
def acPostProcForDelete(rule_args, callback, rei):
pass
def acSetChkFilePathPerm(rule_args, callback, rei):
callback.msiSetChkFilePathPerm('disallowPathReg')
def acTrashPolicy(rule_args, callback, rei):
pass
def acSetPublicUserPolicy(rule_args, callback, rei):
pass
def acChkHostAccessControl(rule_args, callback, rei):
pass
def acSetVaultPathPolicy(rule_args, callback, rei):
callback.msiSetGraftPathScheme('no', '1')
def acSetReServerNumProc(rule_args, callback, rei):
callback.msiSetReServerNumProc('default')
def acPreprocForCollCreate(rule_args, callback, rei):
pass
def acPostProcForCollCreate(rule_args, callback, rei):
pass
def acPreprocForRmColl(rule_args, callback, rei):
pass
def acPostProcForRmColl(rule_args, callback, rei):
pass
def acPreProcForModifyUser(rule_args, callback, rei):
pass
def acPostProcForModifyUser(rule_args, callback, rei):
pass
def acPreProcForModifyAVUMetadata(rule_args, callback, rei):
pass
def acPostProcForModifyAVUMetadata(rule_args, callback, rei):
pass
def acPreProcForCreateUser(rule_args, callback, rei):
pass
def acPostProcForCreateUser(rule_args, callback, rei):
pass
def acPreProcForDeleteUser(rule_args, callback, rei):
pass
def acPostProcForDeleteUser(rule_args, callback, rei):
pass
def acPreProcForCreateResource(rule_args, callback, rei):
pass
def acPostProcForCreateResource(rule_args, callback, rei):
pass
def acPreProcForCreateToken(rule_args, callback, rei):
pass
def acPostProcForCreateToken(rule_args, callback, rei):
pass
def acPreProcForModifyUserGroup(rule_args, callback, rei):
pass
def acPostProcForModifyUserGroup(rule_args, callback, rei):
pass
def acPreProcForDeleteResource(rule_args, callback, rei):
pass
def acPostProcForDeleteResource(rule_args, callback, rei):
pass
def acPreProcForDeleteToken(rule_args, callback, rei):
pass
def acPostProcForDeleteToken(rule_args, callback, rei):
pass
def acPreProcForModifyResource(rule_args, callback, rei):
pass
def acPostProcForModifyResource(rule_args, callback, rei):
pass
def acPreProcForModifyCollMeta(rule_args, callback, rei):
pass
def acPostProcForModifyCollMeta(rule_args, callback, rei):
pass
def acPreProcForModifyDataObjMeta(rule_args, callback, rei):
pass
def acPostProcForModifyDataObjMeta(rule_args, callback, rei):
pass
def acPreProcForModifyAccessControl(rule_args, callback, rei):
pass
def acPostProcForModifyAccessControl(rule_args, callback, rei):
pass
def acPreProcForObjRename(rule_args, callback, rei):
pass
def acPostProcForObjRename(rule_args, callback, rei):
pass
def acPreProcForGenQuery(rule_args, callback, rei):
pass
def acPostProcForGenQuery(rule_args, callback, rei):
pass
def acRescQuotaPolicy(rule_args, callback, rei):
callback.msiSetRescQuotaPolicy('off')
def acBulkPutPostProcPolicy(rule_args, callback, rei):
callback.msiSetBulkPutPostProcPolicy('off')
def acPostProcForTarFileReg(rule_args, callback, rei):
pass
def acPostProcForDataObjWrite(rule_args, callback, rei):
pass
def acPostProcForDataObjRead(rule_args, callback, rei):
pass
def acPreProcForExecCmd(rule_args, callback, rei):
pass
def acPreProcForServerPortal(rule_args, callback, rei):
pass
def acPostProcForServerPortal(rule_args, callback, rei):
pass
def acPostProcForDataCopyReceived(rule_args, callback, rei):
pass
I have tried on 2 VMs with Ubuntu 20 and Ubuntu 22 and I get the same error. However, on previous installation a few months ago, it works with no problem. The problem seems to occur only on new installations.
Ok. Nothing stands out to me in your server_config.json or irods_environment.json file upon initial inspection.
Where did you add the Python REP in the rule engines stanza? Is it first in the list? Are the removing the iRODS rule language REP and default cpp REP or leaving them in? Have you seen the issue with an empty core.py?
Yes, I add the python REP first on the list, like this:
"rule_engines": [
{
"instance_name": "irods_rule_engine_plugin-python-instance",
"plugin_name": "irods_rule_engine_plugin-python",
"plugin_specific_configuration": {}
},
{
"instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
"plugin_name": "irods_rule_engine_plugin-irods_rule_language",
"plugin_specific_configuration": {
"re_data_variable_mapping_set": [
"core"
],
"re_function_name_mapping_set": [
"core"
],
"re_rulebase_set": [
"core",
"training"
],
"regexes_for_supported_peps": [
"ac[^ ]*",
"msi[^ ]*",
"[^ ]*pep_[^ ]*_(pre|post|except|finally)"
]
},
"shared_memory_instance": "irods_rule_language_rule_engine"
},
{
"instance_name": "irods_rule_engine_plugin-cpp_default_policy-instance",
"plugin_name": "irods_rule_engine_plugin-cpp_default_policy",
"plugin_specific_configuration": {}
}
]
},
then, to stop to the error messages and bring back irods service I only remove the Python REP and leave of the rest rule_engines as they were by default.
I get same error messages if I use an empty core.py
{"log_category":"delay_server","log_level":"error","log_message":"iRODS Exception:\n file: /irods_source/lib/core/src/client_connection.cpp\n function: void irods::experimental::client_connection::only_connect(const std::string &, const int, const irods::experimental::fully_qualified_username &)\n line: 188\n code: -305000 (USER_SOCK_CONNECT_ERR)\n message:\n Connect error\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# main in irodsDelayServer\n 4# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6\n 5# _start in irodsDelayServer\n\n","server_host":"opendata-20","server_pid":106891,"server_timestamp":"2024-04-18T14:59:54.952Z","server_type":"delay_server","server_zone":"tempZone"}
{"log_category":"agent_factory","log_level":"error","log_message":"Agent process [116483] exited with status [11].","server_host":"opendata-20","server_pid":106874,"server_timestamp":"2024-04-18T14:59:54.968Z","server_type":"agent_factory","server_zone":"tempZone"}
{"log_category":"server","log_level":"critical","log_message":" 0# stacktrace_signal_handler in /lib/libirods_server.so.4.3.1\n 1# 0x00007FFA8721D420 in /lib/x86_64-linux-gnu/libpthread.so.0\n 2# std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) in /opt/irods-externals/clang-runtime13.0.0-0/lib/libc++.so.1\n 3# 0x00007FFA82C75B41\n 4# 0x00007FFA82C67764\n 5# 0x00007FFA822DE64C\n 6# 0x00007FFA82CB9080\n 7# irods::pluggable_rule_engine<std::__1::tuple<> >::start_operation(std::__1::tuple<>&) in /lib/libirods_server.so.4.3.1\n 8# runIrodsAgentFactory(sockaddr_un) in /lib/libirods_server.so.4.3.1\n 9# 0x00000000004429C6 in /usr/sbin/irodsServer\n10# main in /usr/sbin/irodsServer\n11# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6\n12# _start in /usr/sbin/irodsServer\n","server_host":"opendata-20","server_pid":106873,"server_timestamp":"2024-04-18T14:59:57.742Z","server_type":"server","server_zone":"tempZone","stacktrace_agent_pid":"116483","stacktrace_timestamp_epoch_milliseconds":"938","stacktrace_timestamp_epoch_seconds":"1713452394","stacktrace_timestamp_utc":"2024-04-18T14:59:54.938Z"}
Here are a few other things we can check...
Ensure that the iRODS Linux service account is able to read the core.py
file. I noticed that you said you use sudo cp
when copying the template file. Here's the output on an instance of iRODS 4.3.1 on an Ubuntu 22 container that I'm using (I think I did a sudo cp
as well, evidenced by the owner of core.py
):
$ ls -l /etc/irods
total 116
drwxrwxr-x 2 irods irods 4096 Feb 28 17:35 __pycache__
-rw-r--r-- 1 irods irods 5026 Feb 28 17:33 core.dvm
-rw-r--r-- 1 irods irods 831 Feb 28 17:33 core.fnm
-rw-r--r-- 1 root root 29 Feb 28 17:35 core.py
-rw-r--r-- 1 irods irods 8606 Oct 5 2023 core.py.template
-rw-r--r-- 1 irods irods 33386 Feb 28 17:33 core.re
-rw-r--r-- 1 irods irods 17656 Oct 5 2023 genquery.py
-rw------- 1 irods irods 4736 Feb 28 17:35 server_config.json
-rw-r--r-- 1 irods irods 64 Feb 28 17:33 service_account.config
-rw-r--r-- 1 irods irods 4876 Nov 29 2018 session_vars.py
Another thing I just noticed is that your listing of /etc/irods
was missing genquery.py
, which is supposed to be packaged with Python REP:
https://github.com/irods/irods_rule_engine_plugin_python/blob/ad836782d6bac3eec4276aa6c649c12de42dc3b9/CMakeLists.txt#L258-L264
This made me wonder the packages are being installed correctly. Can you confirm that the Python REP shared objects exist and have the correct permissions? Here's an example from an installation I have on an Ubuntu 22 container running iRODS 4.3.1...
$ ls -l /usr/lib/irods/plugins/rule_engines/
total 16052
-rw-r--r-- 1 root root 1137192 Oct 20 19:59 libirods_rule_engine_plugin-cpp_default_policy.so
-rw-r--r-- 1 root root 1590200 Oct 20 19:59 libirods_rule_engine_plugin-irods_rule_language.so
-rw-r--r-- 1 root root 528560 Oct 20 19:59 libirods_rule_engine_plugin-passthrough.so
-rw-r--r-- 1 root root 13169144 Oct 20 20:58 libirods_rule_engine_plugin-python.so
If all of this seems okay on your end, then we may have to start digging into how these VMs are being set up. You said it used to work a few months ago, but now it doesn't for new installations. We will need to think about what may have changed between now and then.
ls -l /usr/lib/irods/plugins/rule_engines/
total 16052
-rw-r--r-- 1 root root 1137192 Oct 20 19:59 libirods_rule_engine_plugin-cpp_default_policy.so
-rw-r--r-- 1 root root 1590200 Oct 20 19:59 libirods_rule_engine_plugin-irods_rule_language.so
-rw-r--r-- 1 root root 528560 Oct 20 19:59 libirods_rule_engine_plugin-passthrough.so
-rw-r--r-- 1 root root 13169144 Oct 20 20:58 libirods_rule_engine_plugin-python.so
And actually the file genquery.py is not missing, it was added when I installed the python rule engine. However, in /etc/irods there were some files that belong to root. I have changed all ownership to the irods user, this is what I have now:
$ ls -tr -all
total 112
-rw-r--r-- 1 irods irods 4876 Nov 29 2018 session_vars.py
-rw-r--r-- 1 irods irods 17656 Oct 5 2023 genquery.py
-rw-r--r-- 1 irods irods 8606 Oct 5 2023 core.py.template
-rw-r--r-- 1 irods irods 64 Apr 17 09:42 service_account.config
-rw-r--r-- 1 irods irods 831 Apr 17 09:43 core.fnm
-rw-r--r-- 1 irods irods 5026 Apr 17 09:43 core.dvm
-rw-r--r-- 1 irods irods 154 Apr 18 16:03 example.re
-rw-r--r-- 1 irods irods 33490 Apr 18 16:22 core.re
-rw-r--r-- 1 irods irods 0 Apr 19 07:55 core.py
drwxr-xr-x 106 root root 4096 Apr 19 08:07 ..
-rw------- 1 irods irods 4792 Apr 19 08:15 server_config.json
drwxr-xr-x 2 irods irods 4096 Apr 19 08:19 .
I have checked VM with the previous installation where the python rule engine works and I can see that all files in /etc/irods/ belong to the irods user as well.
/etc/irods$ ls -tr -all
total 144
-rw-r--r-- 1 irods irods 4876 Nov 29 2018 session_vars.py
-rw-r--r-- 1 irods irods 64 Sep 6 2023 service_account.config
-rw-r--r-- 1 irods irods 831 Sep 6 2023 core.fnm
-rw-r--r-- 1 irods irods 5026 Sep 6 2023 core.dvm
-rw-r--r-- 1 irods irods 33378 Sep 6 2023 core.re
-rw-r--r-- 1 irods irods 243 Sep 18 2023 training.re
-rw-r--r-- 1 irods irods 154 Sep 18 2023 example.re
-rw-r--r-- 1 irods irods 12325 Sep 25 2023 core.py
-rw-r--r-- 1 irods irods 17656 Oct 5 2023 genquery.py
-rw-r--r-- 1 irods irods 8606 Oct 5 2023 core.py.template
drwxrwxr-x 2 irods irods 4096 Feb 7 15:55 __pycache__
-rw------- 1 irods irods 5081 Mar 19 14:30 server_config.json
drwxr-xr-x 122 root root 12288 Apr 17 06:28 ..
drwxr-xr-x 3 irods irods 4096 Apr 19 08:18 .
Unfortunately, I get the same errors when I add python stanza, even when I use an empty core.py
{"log_category":"legacy","log_level":"error","log_message":"_rcConnect: connectToRhost error, server on opendata-20:1247 is probably down status = -4000 SYS_HEADER_READ_LEN_ERR","server_host":"opendata-20","server_pid":183861,"server_timestamp":"2024-04-19T08:15:03.337Z","server_type":"server","server_zone":"tempZone"}
{"log_category":"agent_factory","log_level":"error","log_message":"Agent process [184391] exited with status [11].","server_host":"opendata-20","server_pid":183862,"server_timestamp":"2024-04-19T08:15:03.352Z","server_type":"agent_factory","server_zone":"tempZone"}
I wonder what other permission and folders should I check? or a simple test to know that the python rule engine was installed correctly and has everything it needs, maybe it is something related to the the python interpreter?
Oh, my apologies. Now I see genquery.py in your output from before. Sorry :)
Try removing the training
rulebase from the iRODS rule language REP. Here's what your rule_engines
configuration should look like verbatim:
"rule_engines": [
{
"instance_name": "irods_rule_engine_plugin-python-instance",
"plugin_name": "irods_rule_engine_plugin-python",
"plugin_specific_configuration": {
}
},
{
"instance_name": "irods_rule_engine_plugin-irods_rule_language-instance",
"plugin_name": "irods_rule_engine_plugin-irods_rule_language",
"plugin_specific_configuration": {
"re_data_variable_mapping_set": [
"core"
],
"re_function_name_mapping_set": [
"core"
],
"re_rulebase_set": [
"core"
],
"regexes_for_supported_peps": [
"ac[^ ]*",
"msi[^ ]*",
"[^ ]*pep_[^ ]*_(pre|post|except|finally)"
]
},
"shared_memory_instance": "irods_rule_language_rule_engine"
},
{
"instance_name": "irods_rule_engine_plugin-cpp_default_policy-instance",
"plugin_name": "irods_rule_engine_plugin-cpp_default_policy",
"plugin_specific_configuration": {}
}
]
That is truly the default, vanilla, not-particularly-useful configuration. From there, we can try adding some functionality if that holds stable.
An additional thing to try so that we can feel out the edges of the problem: Remove /etc/irods/core.py
while leaving the Python REP configured as shown above (again, verbatim) and try to run ils
. You should get the following error:
$ ils
remote addresses: 127.0.0.1 ERROR: connectToRhost: error returned from host localhost status = -38000 SYS_AGENT_INIT_ERR
ERROR: _rcConnect: connectToRhost error, server on localhost:1247 is probably down status = -38000 SYS_AGENT_INIT_ERR
We can also try to increase the log_level
of agent
, server
, and rule_engine
to debug
or even trace
and see if the server log has any additional information before the crash occurs. I don't expect much, though.
These are the logs that I get when I increase the level to "debug" and enabled the python rule engine without having the /etc/irods/core.py
{"log_category":"agent","log_level":"debug","log_message":"Unpacking byte buffer based on packing instruction [GridConfigurationInp_PI]","request_api_name":"","request_api_number":20009,"request_api_version":"d","request_client_user":"rods","request_host":"127.0.0.1","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"opendata-20","server_pid":79543,"server_timestamp":"2024-04-25T09:18:39.353Z","server_type":"agent","server_zone":"earlhamZone"}
{"log_category":"agent","log_level":"debug","log_message":"[close_all_l1_descriptors:949] Closing all L1 descriptors ...","request_api_name":"","request_api_number":20009,"request_api_version":"d","request_client_user":"rods","request_host":"127.0.0.1","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"opendata-20","server_pid":79543,"server_timestamp":"2024-04-25T09:18:39.354Z","server_type":"agent","server_zone":"earlhamZone"}
{"log_category":"agent","log_level":"debug","log_message":"Agent [79543] exiting with status = 0","request_api_name":"","request_api_number":20009,"request_api_version":"d","request_client_user":"rods","request_host":"127.0.0.1","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"opendata-20","server_pid":79543,"server_timestamp":"2024-04-25T09:18:39.355Z","server_type":"agent","server_zone":"earlhamZone"}
{"dynamic_pep_rule_regex":"[^ ]*pep_[^ ]*_(pre|post)","instance_name":"irods_rule_engine_plugin-python-instance","log_category":"rule_engine","log_level":"debug","log_message":"No regexes found in server_config for Python RE - using default regexes","rule_engine_plugin":"python","server_host":"opendata-20","server_pid":79547,"server_timestamp":"2024-04-25T09:18:44.472Z","server_type":"agent","server_zone":"earlhamZone","static_pep_rule_regex":"ac[^ ]*"}
{"log_category":"legacy","log_level":"error","log_message":"_rcConnect: connectToRhost error, server on opendata-20:1247 is probably down status = -4000 SYS_HEADER_READ_LEN_ERR","server_host":"opendata-20","server_pid":4162,"server_timestamp":"2024-04-25T09:18:44.479Z","server_type":"server","server_zone":"earlhamZone"}
{"log_category":"agent_factory","log_level":"error","log_message":"Agent process [79547] exited with status [11].","server_host":"opendata-20","server_pid":4163,"server_timestamp":"2024-04-25T09:18:44.500Z","server_type":"agent_factory","server_zone":"earlhamZone"}
{"log_category":"server","log_level":"error","log_message":"Caught exception in migrate_delay_server(): iRODS Exception:\n file: /irods_source/lib/core/src/client_connection.cpp\n function: void irods::experimental::client_connection::only_connect(const std::string &, const int, const irods::experimental::fully_qualified_username &)\n line: 188\n code: -305000 (USER_SOCK_CONNECT_ERR)\n message:\n Connect error\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# std::__1::__function::__func<initServerMain(RsComm*, bool, bool)::$_13, std::__1::allocator<initServerMain(RsComm*, bool, bool)::$_13>, void ()>::operator()() at rodsServer.cpp:?\n 4# irods::experimental::cron::cron_task::operator()() in /usr/sbin/irodsServer\n 5# irods::experimental::cron::cron::run() in /usr/sbin/irodsServer\n 6# void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, main::$_10> >(void*) at rodsServer.cpp:?\n 7# 0x00007F1A2F87A609 in /lib/x86_64-linux-gnu/libpthread.so.0\n 8# clone in /lib/x86_64-linux-gnu/libc.so.6\n\n","server_host":"opendata-20","server_pid":4162,"server_timestamp":"2024-04-25T09:18:44.543Z","server_type":"server","server_zone":"earlhamZone"}
{"dynamic_pep_rule_regex":"[^ ]*pep_[^ ]*_(pre|post)","instance_name":"irods_rule_engine_plugin-python-instance","log_category":"rule_engine","log_level":"debug","log_message":"No regexes found in server_config for Python RE - using default regexes","rule_engine_plugin":"python","server_host":"opendata-20","server_pid":79561,"server_timestamp":"2024-04-25T09:18:49.649Z","server_type":"agent","server_zone":"earlhamZone","static_pep_rule_regex":"ac[^ ]*"}
{"log_category":"legacy","log_level":"error","log_message":"_rcConnect: connectToRhost error, server on opendata-20:1247 is probably down status = -4000 SYS_HEADER_READ_LEN_ERR","server_host":"opendata-20","server_pid":4162,"server_timestamp":"2024-04-25T09:18:49.655Z","server_type":"server","server_zone":"earlhamZone"}
{"log_category":"agent_factory","log_level":"error","log_message":"Agent process [79561] exited with status [11].","server_host":"opendata-20","server_pid":4163,"server_timestamp":"2024-04-25T09:18:49.684Z","server_type":"agent_factory","server_zone":"earlhamZone"}
{"log_category":"server","log_level":"error","log_message":"Caught exception in migrate_delay_server(): iRODS Exception:\n file: /irods_source/lib/core/src/client_connection.cpp\n function: void irods::experimental::client_connection::only_connect(const std::string &, const int, const irods::experimental::fully_qualified_username &)\n line: 188\n code: -305000 (USER_SOCK_CONNECT_ERR)\n message:\n Connect error\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# std::__1::__function::__func<initServerMain(RsComm*, bool, bool)::$_13, std::__1::allocator<initServerMain(RsComm*, bool, bool)::$_13>, void ()>::operator()() at rodsServer.cpp:?\n 4# irods::experimental::cron::cron_task::operator()() in /usr/sbin/irodsServer\n 5# irods::experimental::cron::cron::run() in /usr/sbin/irodsServer\n 6# void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, main::$_10> >(void*) at rodsServer.cpp:?\n 7# 0x00007F1A2F87A609 in /lib/x86_64-linux-gnu/libpthread.so.0\n 8# clone in /lib/x86_64-linux-gnu/libc.so.6\n\n","server_host":"opendata-20","server_pid":4162,"server_timestamp":"2024-04-25T09:18:49.724Z","server_type":"server","server_zone":"earlhamZone"}
{"log_category":"server","log_level":"critical","log_message":" 0# stacktrace_signal_handler in /lib/libirods_server.so.4.3.1\n 1# 0x00007F1A2F886420 in /lib/x86_64-linux-gnu/libpthread.so.0\n 2# std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) in /opt/irods-externals/clang-runtime13.0.0-0/lib/libc++.so.1\n 3# 0x00007F1A2B2DEB41\n 4# 0x00007F1A2B2D0764\n 5# 0x00007F1A2A94764C\n 6# 0x00007F1A2B322080\n 7# irods::pluggable_rule_engine<std::__1::tuple<> >::start_operation(std::__1::tuple<>&) in /lib/libirods_server.so.4.3.1\n 8# runIrodsAgentFactory(sockaddr_un) in /lib/libirods_server.so.4.3.1\n 9# main::$_5::operator()() const at rodsServer.cpp:?\n10# main in /usr/sbin/irodsServer\n11# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6\n12# _start in /usr/sbin/irodsServer\n","server_host":"opendata-20","server_pid":4162,"server_timestamp":"2024-04-25T09:18:51.475Z","server_type":"server","server_zone":"earlhamZone","stacktrace_agent_pid":"79561","stacktrace_timestamp_epoch_milliseconds":"654","stacktrace_timestamp_epoch_seconds":"1714036729","stacktrace_timestamp_utc":"2024-04-25T09:18:49.654Z"}
{"log_category":"server","log_level":"critical","log_message":" 0# stacktrace_signal_handler in /lib/libirods_server.so.4.3.1\n 1# 0x00007F1A2F886420 in /lib/x86_64-linux-gnu/libpthread.so.0\n 2# std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) in /opt/irods-externals/clang-runtime13.0.0-0/lib/libc++.so.1\n 3# 0x00007F1A2B2DEB41\n 4# 0x00007F1A2B2D0764\n 5# 0x00007F1A2A94764C\n 6# 0x00007F1A2B322080\n 7# irods::pluggable_rule_engine<std::__1::tuple<> >::start_operation(std::__1::tuple<>&) in /lib/libirods_server.so.4.3.1\n 8# runIrodsAgentFactory(sockaddr_un) in /lib/libirods_server.so.4.3.1\n 9# main::$_5::operator()() const at rodsServer.cpp:?\n10# main in /usr/sbin/irodsServer\n11# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6\n12# _start in /usr/sbin/irodsServer\n","server_host":"opendata-20","server_pid":4162,"server_timestamp":"2024-04-25T09:18:51.585Z","server_type":"server","server_zone":"earlhamZone","stacktrace_agent_pid":"79547","stacktrace_timestamp_epoch_milliseconds":"477","stacktrace_timestamp_epoch_seconds":"1714036724","stacktrace_timestamp_utc":"2024-04-25T09:18:44.477Z"}
{"log_category":"agent","log_level":"debug","log_message":"Unpacking byte buffer based on packing instruction [BinBytesBuf_PI]","request_api_name":"","request_api_number":110000,"request_api_version":"d","request_client_user":"rods","request_host":"127.0.0.1","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"opendata-20","server_pid":79613,"server_timestamp":"2024-04-25T09:18:54.926Z","server_type":"agent","server_zone":"earlhamZone"}
{"log_category":"agent","log_level":"debug","log_message":"Unpacking byte buffer based on packing instruction [GridConfigurationInp_PI]","request_api_name":"","request_api_number":20009,"request_api_version":"d","request_client_user":"rods","request_host":"127.0.0.1","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"opendata-20","server_pid":79613,"server_timestamp":"2024-04-25T09:18:54.953Z","server_type":"agent","server_zone":"earlhamZone"}
{"log_category":"agent","log_level":"debug","log_message":"[close_all_l1_descriptors:949] Closing all L1 descriptors ...","request_api_name":"","request_api_number":20009,"request_api_version":"d","request_client_user":"rods","request_host":"127.0.0.1","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"opendata-20","server_pid":79613,"server_timestamp":"2024-04-25T09:18:54.954Z","server_type":"agent","server_zone":"earlhamZone"}
Changing the log level of agent to "debug" keeps printing these logs:
{"log_category":"agent","log_level":"debug","log_message":"Unpacking byte buffer based on packing instruction [BinBytesBuf_PI]","request_api_name":"","request_api_number":110000,"request_api_version":"d","request_client_user":"rods","request_host":"127.0.0.1","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"opendata-20","server_pid":79613,"server_timestamp":"2024-04-25T09:18:54.926Z","server_type":"agent","server_zone":"earlhamZone"}
{"log_category":"agent","log_level":"debug","log_message":"Unpacking byte buffer based on packing instruction [GridConfigurationInp_PI]","request_api_name":"","request_api_number":20009,"request_api_version":"d","request_client_user":"rods","request_host":"127.0.0.1","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"opendata-20","server_pid":79613,"server_timestamp":"2024-04-25T09:18:54.953Z","server_type":"agent","server_zone":"earlhamZone"}
{"log_category":"agent","log_level":"debug","log_message":"[close_all_l1_descriptors:949] Closing all L1 descriptors ...","request_api_name":"","request_api_number":20009,"request_api_version":"d","request_client_user":"rods","request_host":"127.0.0.1","request_proxy_user":"rods","request_release_version":"rods4.3.1","server_host":"opendata-20","server_pid":79613,"server_timestamp":"2024-04-25T09:18:54.954Z","server_type":"agent","server_zone":"earlhamZone"}
So I keep it to agent in info level.
These are the logs when I copy the core.py.template to core.py. As you can see, the errors are still the same
{"dynamic_pep_rule_regex":"[^ ]*pep_[^ ]*_(pre|post)","instance_name":"irods_rule_engine_plugin-python-instance","log_category":"rule_engine","log_level":"debug","log_message":"No regexes found in server_config for Python RE - using default regexes","rule_engine_plugin":"python","server_host":"opendata-20","server_pid":83698,"server_timestamp":"2024-04-25T10:21:18.224Z","server_type":"agent","server_zone":"earlhamZone","static_pep_rule_regex":"ac[^ ]*"}
{"log_category":"agent_factory","log_level":"error","log_message":"Agent process [83698] exited with status [11].","server_host":"opendata-20","server_pid":4163,"server_timestamp":"2024-04-25T10:21:18.246Z","server_type":"agent_factory","server_zone":"earlhamZone"}
{"log_category":"delay_server","log_level":"error","log_message":"iRODS Exception:\n file: /irods_source/lib/core/src/client_connection.cpp\n function: void irods::experimental::client_connection::only_connect(const std::string &, const int, const irods::experimental::fully_qualified_username &)\n line: 188\n code: -305000 (USER_SOCK_CONNECT_ERR)\n message:\n Connect error\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# main in irodsDelayServer\n 4# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6\n 5# _start in irodsDelayServer\n\n","server_host":"opendata-20","server_pid":4181,"server_timestamp":"2024-04-25T10:21:18.271Z","server_type":"delay_server","server_zone":"earlhamZone"}
{"dynamic_pep_rule_regex":"[^ ]*pep_[^ ]*_(pre|post)","instance_name":"irods_rule_engine_plugin-python-instance","log_category":"rule_engine","log_level":"debug","log_message":"No regexes found in server_config for Python RE - using default regexes","rule_engine_plugin":"python","server_host":"opendata-20","server_pid":83706,"server_timestamp":"2024-04-25T10:21:18.961Z","server_type":"agent","server_zone":"earlhamZone","static_pep_rule_regex":"ac[^ ]*"}
{"log_category":"legacy","log_level":"error","log_message":"_rcConnect: connectToRhost error, server on opendata-20:1247 is probably down status = -4000 SYS_HEADER_READ_LEN_ERR","server_host":"opendata-20","server_pid":4162,"server_timestamp":"2024-04-25T10:21:18.968Z","server_type":"server","server_zone":"earlhamZone"}
{"log_category":"agent_factory","log_level":"error","log_message":"Agent process [83706] exited with status [11].","server_host":"opendata-20","server_pid":4163,"server_timestamp":"2024-04-25T10:21:18.990Z","server_type":"agent_factory","server_zone":"earlhamZone"}
{"log_category":"server","log_level":"error","log_message":"Caught exception in migrate_delay_server(): iRODS Exception:\n file: /irods_source/lib/core/src/client_connection.cpp\n function: void irods::experimental::client_connection::only_connect(const std::string &, const int, const irods::experimental::fully_qualified_username &)\n line: 188\n code: -305000 (USER_SOCK_CONNECT_ERR)\n message:\n Connect error\nstack trace:\n--------------\n 0# irods::stacktrace::dump() const in /lib/libirods_common.so.4.3.1\n 1# irods::exception::assemble_full_display_what() const in /lib/libirods_common.so.4.3.1\n 2# irods::exception::what() const in /lib/libirods_common.so.4.3.1\n 3# std::__1::__function::__func<initServerMain(RsComm*, bool, bool)::$_13, std::__1::allocator<initServerMain(RsComm*, bool, bool)::$_13>, void ()>::operator()() at rodsServer.cpp:?\n 4# irods::experimental::cron::cron_task::operator()() in /usr/sbin/irodsServer\n 5# irods::experimental::cron::cron::run() in /usr/sbin/irodsServer\n 6# void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, main::$_10> >(void*) at rodsServer.cpp:?\n 7# 0x00007F1A2F87A609 in /lib/x86_64-linux-gnu/libpthread.so.0\n 8# clone in /lib/x86_64-linux-gnu/libc.so.6\n\n","server_host":"opendata-20","server_pid":4162,"server_timestamp":"2024-04-25T10:21:19.032Z","server_type":"server","server_zone":"earlhamZone"}
{"log_category":"server","log_level":"critical","log_message":" 0# stacktrace_signal_handler in /lib/libirods_server.so.4.3.1\n 1# 0x00007F1A2F886420 in /lib/x86_64-linux-gnu/libpthread.so.0\n 2# std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) in /opt/irods-externals/clang-runtime13.0.0-0/lib/libc++.so.1\n 3# 0x00007F1A2B2DEB41\n 4# 0x00007F1A2B2D0764\n 5# 0x00007F1A2A94764C\n 6# 0x00007F1A2B322080\n 7# irods::pluggable_rule_engine<std::__1::tuple<> >::start_operation(std::__1::tuple<>&) in /lib/libirods_server.so.4.3.1\n 8# runIrodsAgentFactory(sockaddr_un) in /lib/libirods_server.so.4.3.1\n 9# main::$_5::operator()() const at rodsServer.cpp:?\n10# main in /usr/sbin/irodsServer\n11# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6\n12# _start in /usr/sbin/irodsServer\n","server_host":"opendata-20","server_pid":4162,"server_timestamp":"2024-04-25T10:21:20.392Z","server_type":"server","server_zone":"earlhamZone","stacktrace_agent_pid":"83698","stacktrace_timestamp_epoch_milliseconds":"231","stacktrace_timestamp_epoch_seconds":"1714040478","stacktrace_timestamp_utc":"2024-04-25T10:21:18.231Z"}
{"log_category":"server","log_level":"critical","log_message":" 0# stacktrace_signal_handler in /lib/libirods_server.so.4.3.1\n 1# 0x00007F1A2F886420 in /lib/x86_64-linux-gnu/libpthread.so.0\n 2# std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::basic_string(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) in /opt/irods-externals/clang-runtime13.0.0-0/lib/libc++.so.1\n 3# 0x00007F1A2B2DEB41\n 4# 0x00007F1A2B2D0764\n 5# 0x00007F1A2A94764C\n 6# 0x00007F1A2B322080\n 7# irods::pluggable_rule_engine<std::__1::tuple<> >::start_operation(std::__1::tuple<>&) in /lib/libirods_server.so.4.3.1\n 8# runIrodsAgentFactory(sockaddr_un) in /lib/libirods_server.so.4.3.1\n 9# main::$_5::operator()() const at rodsServer.cpp:?\n10# main in /usr/sbin/irodsServer\n11# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6\n12# _start in /usr/sbin/irodsServer\n","server_host":"opendata-20","server_pid":4162,"server_timestamp":"2024-04-25T10:21:20.535Z","server_type":"server","server_zone":"earlhamZone","stacktrace_agent_pid":"83706","stacktrace_timestamp_epoch_milliseconds":"966","stacktrace_timestamp_epoch_seconds":"1714040478","stacktrace_timestamp_utc":"2024-04-25T10:21:18.966Z"}
I'm trying to enable the python rule engine on a fresh installation of Irods 4.3.1 in an Ubuntu 22 machine. I am using the core.py.templete as my core.py, the moment I enable the python engine in the server_config.json file. I get these errors. They stop as soon as I remove the python rule engine stanza from the config file.
any idea what could the problem? I installed the python rule engine using apt-get. Could it be a compatibility issue related to the newest IRODS version and the the python rule engine?