Open mikkonie opened 1 year ago
First issue when upgrading on an already installed iCAT:
Error encountered running irods_control start:
Traceback (most recent call last):
File "/var/lib/irods/scripts/irods/json_validation.py", line 60, in validate_dict
jsonschema.validate(config_dict, schema, resolver=jsonschema.RefResolver(schema_uri, schema))
File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 541, in validate
cls(schema, *args, **kwargs).validate(instance)
File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 130, in validate
raise error
jsonschema.exceptions.ValidationError: {'catalog_schema_version': 1, 'commit_id': '0000000000000000000000000000000000000000', 'configuration_schema_version': 2, 'irods_version': '4.1.0', 'schema_name': 'VERSION', 'schema_version': 'v2'} is valid under each of {'type': 'object', 'properties': {'catalog_schema_version': {'type': 'integer'}, 'commit_id': {'type': 'string', 'pattern': '^[0-9a-f]{40}$'}, 'configuration_schema_version': {'type': 'integer'}, 'installation_time': {'type': 'string', 'format': 'date-time'}, 'irods_version': {'type': 'string'}, 'previous_version': {'$ref': '#/properties/previous_version/oneOf/1'}}, 'required': ['catalog_schema_version', 'commit_id', 'configuration_schema_version', 'irods_version']}, {'$ref': '#'}
Failed validating 'oneOf' in schema['properties']['previous_version']:
{'oneOf': [{'$ref': '#'},
{'properties': {'catalog_schema_version': {'type': 'integer'},
'commit_id': {'pattern': '^[0-9a-f]{40}$',
'type': 'string'},
'configuration_schema_version': {'type': 'integer'},
'installation_time': {'format': 'date-time',
'type': 'string'},
'irods_version': {'type': 'string'},
'previous_version': {'$ref': '#/properties/previous_version/oneOf/1'}},
'required': ['catalog_schema_version',
'commit_id',
'configuration_schema_version',
'irods_version'],
'type': 'object'}]}
On instance['previous_version']:
{'catalog_schema_version': 1,
'commit_id': '0000000000000000000000000000000000000000',
'configuration_schema_version': 2,
'irods_version': '4.1.0',
'schema_name': 'VERSION',
'schema_version': 'v2'}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/lib/irods/scripts/irods_control.py", line 124, in main
operations_dict[operation]()
File "/var/lib/irods/scripts/irods_control.py", line 70, in <lambda>
operations_dict['start'] = lambda: irods_controller.start(write_to_stdout=options.write_to_stdout, test_mode=options.test_mode)
File "/var/lib/irods/scripts/irods/controller.py", line 94, in start
self.config.validate_configuration()
File "/var/lib/irods/scripts/irods/configuration.py", line 286, in validate_configuration
config_file['path'])
File "/var/lib/irods/scripts/irods/json_validation.py", line 79, in validate_dict
sys.exc_info()[2])
File "/var/lib/irods/scripts/irods/six.py", line 671, in reraise
raise value.with_traceback(tb)
File "/var/lib/irods/scripts/irods/json_validation.py", line 60, in validate_dict
jsonschema.validate(config_dict, schema, resolver=jsonschema.RefResolver(schema_uri, schema))
File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 541, in validate
cls(schema, *args, **kwargs).validate(instance)
File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 130, in validate
raise error
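For context on the message above: "is valid under each of" is what a oneOf check reports when more than one subschema accepts the instance, since oneOf passes only if exactly one matches. Here the previous_version entry appears to satisfy both the inline object schema and the {'$ref': '#'} branch. A minimal pure-Python model of that rule follows. It is not the jsonschema library itself, and the helper names accepts and one_of are made up for illustration.

```python
# Keys listed under 'required' in the schema from the traceback.
REQUIRED = ["catalog_schema_version", "commit_id",
            "configuration_schema_version", "irods_version"]


def accepts(instance):
    """Simplified stand-in for one subschema: an object with the required keys."""
    return isinstance(instance, dict) and all(key in instance for key in REQUIRED)


def one_of(instance, subschemas):
    """oneOf semantics: exactly one subschema may accept the instance."""
    return sum(1 for check in subschemas if check(instance)) == 1


previous_version = {
    "catalog_schema_version": 1,
    "commit_id": "0" * 40,
    "configuration_schema_version": 2,
    "irods_version": "4.1.0",
    "schema_name": "VERSION",
    "schema_version": "v2",
}

# Both branches describe the same shape in this model, so both accept.
both_branches = [accepts, accepts]
print(one_of(previous_version, both_branches))  # False: two matches, not one
print(one_of(previous_version, [accepts]))      # True: exactly one match
```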
The aforementioned crash also breaks the existing server configuration in a way which prevents downgrading. This is very bad.
Even after fixing all issues, we should back up all server configs before attempting this upgrade in production.
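A rough sketch of what such a backup step could look like, assuming the configs live in a single directory such as /etc/irods. The function name backup_config and the archive naming are illustrative assumptions, not something this image provides.

```python
import tarfile
import time
from pathlib import Path


def backup_config(config_dir, backup_dir):
    """Archive config_dir into a timestamped .tar.gz under backup_dir."""
    config_dir = Path(config_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = backup_dir / "{}-{}.tar.gz".format(config_dir.name, stamp)
    with tarfile.open(str(archive), "w:gz") as tar:
        # Store entries relative to the directory name, not the absolute path.
        tar.add(str(config_dir), arcname=config_dir.name)
    return archive
```

Something like backup_config("/etc/irods", "/some/backup/location") would be run before the upgrade. Both paths are placeholders, and the archive has to end up outside the container for it to be of any use when a downgrade is needed.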
Clean install also fails. Apparently this will need a lot of work.
irods-test_1 | Traceback (most recent call last):
irods-test_1 | File "/var/lib/irods/scripts/setup_irods.py", line 58, in <module>
irods-test_1 | import irods.lib
irods-test_1 | File "/var/lib/irods/scripts/irods/lib.py", line 15, in <module>
irods-test_1 | import distro
irods-test_1 | ImportError: No module named distro
irods_1 | Perform iRODS setup
irods_1 | Traceback (most recent call last):
irods_1 | File "/var/lib/irods/scripts/setup_irods.py", line 58, in <module>
irods_1 | import irods.lib
irods_1 | File "/var/lib/irods/scripts/irods/lib.py", line 15, in <module>
irods_1 | import distro
irods_1 | ImportError: No module named distro
irods-test_1 | Password:
postgres_1 | 2023-01-31 12:02:36.256 UTC [91] ERROR: database "ICAT_TEST" already exists
postgres_1 | 2023-01-31 12:02:36.256 UTC [91] STATEMENT: CREATE DATABASE "ICAT_TEST";
irods-test_1 | createdb: database creation failed: ERROR: database "ICAT_TEST" already exists
irods_1 | Password:
postgres_1 | 2023-01-31 12:02:36.267 UTC [92] ERROR: database "ICAT" already exists
postgres_1 | 2023-01-31 12:02:36.267 UTC [92] STATEMENT: CREATE DATABASE "ICAT";
irods_1 | createdb: database creation failed: ERROR: database "ICAT" already exists
sodar-docker-compose-dev_irods-test_1 exited with code 1
sodar-docker-compose-dev_irods_1 exited with code 1
Got past the prior crash, here are some new ones.
Edit: The 1st one was fixed.
irods_1 | rsyslogd: imklog: cannot open kernel log (/proc/kmsg): Operation not permitted.
irods_1 | rsyslogd: activation of module imklog failed [v8.32.0 try http://www.rsyslog.com/e/2145 ]
irods_1 | ...done.
This one persists at the time of writing:
irods_1 | Traceback (most recent call last):
irods_1 | File "/var/lib/irods/scripts/setup_irods.py", line 529, in <module>
irods_1 | sys.exit(main())
irods_1 | File "/var/lib/irods/scripts/setup_irods.py", line 517, in main
irods_1 | test_mode=options.test_mode)
irods_1 | File "/var/lib/irods/scripts/setup_irods.py", line 110, in setup_server
irods_1 | default_resource_name = json_configuration_dict['default_resource_name']
irods_1 | KeyError: 'default_resource_name'
Looks like the unattended config file template needs to be updated. Will be looking into the original.
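To catch this kind of mismatch before setup_irods.py runs, the unattended file could be checked for the keys setup reads. A minimal sketch: default_resource_name is the only key confirmed by the traceback above, and the helper name missing_keys and any other keys passed in are assumptions.

```python
import json


def missing_keys(config_path, required_keys):
    """Return the required top-level keys absent from a JSON config file."""
    with open(config_path) as handle:
        config = json.load(handle)
    return [key for key in required_keys if key not in config]
```

Calling missing_keys(path_to_unattended_file, ["default_resource_name"]) returns an empty list when the key is present. This only looks at top-level keys and does not replace validation against the actual iRODS schema.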
Unattended configuration file updated to match the current schema. This leads to the following error:
irods_1 | Error encountered running setup_irods:
irods_1 | Traceback (most recent call last):
irods_1 | File "/var/lib/irods/scripts/setup_irods.py", line 517, in main
irods_1 | test_mode=options.test_mode)
irods_1 | File "/var/lib/irods/scripts/setup_irods.py", line 150, in setup_server
irods_1 | test_put(irods_config)
irods_1 | File "/var/lib/irods/scripts/setup_irods.py", line 180, in test_put
irods_1 | raise IrodsError('Post-install test failed. Please check your configuration.')
irods_1 | irods.exceptions.IrodsError: Post-install test failed. Please check your configuration.
Additional info in setup_log.txt about the aforementioned crash. Looks like a PAM plugin issue. Oh great, I'm sure this will not be a pain to fix.
+---------------------------+
| Running Post-Install Test |
+---------------------------+
2023-01-31T15:36:43.765Z - DEBUG - execute.py: 52 - Calling ['/usr/sbin/irodsTestPutGet'] with options:
{'shell': False, 'stderr': -1, 'stdout': -1}
2023-01-31T15:36:44.046Z - DEBUG - execute.py: 37 - Command /usr/sbin/irodsTestPutGet returned with code -6.
stderr:
Error occurred while authenticating user [rods] [PLUGIN_ERROR_MISSING_SHARED_OBJECT: [-] /irods_source/lib/core/include/irods/irods_load_plugin.hpp:157:irods::error irods::load_plugin(PluginType *&, const std::string &, const std::string &, const std::string &, const Ts &...) [PluginType = irods::experimental::auth::authentication_base, Ts = <char [14]>] : status [PLUGIN_ERROR_MISSING_SHARED_OBJECT] errno [] -- message [shared library does not exist [/usr/lib/irods/plugins/auth/libirods_auth_plugin-pam_client.so]]
] [ec=-1827000] failed with error -1827000 PLUGIN_ERROR_MISSING_SHARED_OBJECT
libc++abi: terminating with uncaught exception of type std::runtime_error: client login error
2023-01-31T15:36:44.046Z - ERROR - setup_irods.py: 519 - Error encountered running setup_irods:
Traceback (most recent call last):
File "/var/lib/irods/scripts/setup_irods.py", line 517, in main
test_mode=options.test_mode)
File "/var/lib/irods/scripts/setup_irods.py", line 150, in setup_server
test_put(irods_config)
File "/var/lib/irods/scripts/setup_irods.py", line 180, in test_put
raise IrodsError('Post-install test failed. Please check your configuration.')
irods.exceptions.IrodsError: Post-install test failed. Please check your configuration.
2023-01-31T15:36:44.047Z - INFO - setup_irods.py: 520 - Exiting...
Just a note, the previous PAM error was fixed with the help of iRODS support. The syntax for PAM auth in configurations has changed. Instead of PAM it now expects pam_password.
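For existing configs, the rename could be applied programmatically. A minimal sketch, assuming the scheme is stored under the irods_authentication_scheme key of a JSON environment file. The key name is an assumption here, and migrate_auth_scheme is a made-up helper.

```python
def migrate_auth_scheme(environment):
    """Return a copy of the environment dict with a legacy PAM scheme renamed."""
    updated = dict(environment)
    scheme = updated.get("irods_authentication_scheme")  # assumed key name
    if isinstance(scheme, str) and scheme.lower() == "pam":
        updated["irods_authentication_scheme"] = "pam_password"
    return updated
```

Other scheme values are left untouched, so running this on an already migrated or non-PAM config is a no-op.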
The blocker right now is the 4.3 API or Python client used by SODAR not working correctly with the iRODS server. Will look into that when I have time. May also consider waiting for 4.3.1 to come out.
Server currently works with a clean install. SODAR auth via the custom PAM module is no longer working. I need to look into what has changed in the iRODS auth and attempt to update my custom module accordingly.
Currently the containers can be destroyed by a problem with version.json, which is apparently written by setup and isn't included in the volumes. Only rebuilding the entire image fixes this. I'm trying to figure out what causes this.
This happens both in iRODS start and setup, so clearing the volumes and re-initializing everything will not help.
Error encountered running irods_control start:
Traceback (most recent call last):
File "/var/lib/irods/scripts/irods/json_validation.py", line 60, in validate_dict
jsonschema.validate(config_dict, schema, resolver=jsonschema.RefResolver(schema_uri, schema))
File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 541, in validate
cls(schema, *args, **kwargs).validate(instance)
File "/usr/lib/python3/dist-packages/jsonschema/validators.py", line 130, in validate
raise error
jsonschema.exceptions.ValidationError: {'catalog_schema_version': 1, 'commit_id': '0000000000000000000000000000000000000000', 'configuration_schema_version': 2, 'irods_version': '4.1.0', 'schema_name': 'VERSION', 'schema_version': 'v2'} is valid under each of {'type': 'object', 'properties': {'catalog_schema_version': {'type': 'integer'}, 'commit_id': {'type': 'string', 'pattern': '^[0-9a-f]{40}$'}, 'configuration_schema_version': {'type': 'integer'}, 'installation_time': {'type': 'string', 'format': 'date-time'}, 'irods_version': {'type': 'string'}, 'previous_version': {'$ref': '#/properties/previous_version/oneOf/1'}}, 'required': ['catalog_schema_version', 'commit_id', 'configuration_schema_version', 'irods_version']}, {'$ref': '#'}
Failed validating 'oneOf' in schema['properties']['previous_version']:
{'oneOf': [{'$ref': '#'},
{'properties': {'catalog_schema_version': {'type': 'integer'},
'commit_id': {'pattern': '^[0-9a-f]{40}$',
'type': 'string'},
'configuration_schema_version': {'type': 'integer'},
'installation_time': {'format': 'date-time',
'type': 'string'},
'irods_version': {'type': 'string'},
'previous_version': {'$ref': '#/properties/previous_version/oneOf/1'}},
'required': ['catalog_schema_version',
'commit_id',
'configuration_schema_version',
'irods_version'],
'type': 'object'}]}
On instance['previous_version']:
{'catalog_schema_version': 1,
'commit_id': '0000000000000000000000000000000000000000',
'configuration_schema_version': 2,
'irods_version': '4.1.0',
'schema_name': 'VERSION',
'schema_version': 'v2'}
Update: This error occurs (at least) when we recreate the image on an already provisioned environment. It seems we need to add some more directories to persistent storage via config/volumes. It's possible this same problem also exists in the 4.2 branch, but in any case we should be able to handle an image update on a provisioned server.
Fixed the problem with version.json: we just have to copy it to /etc/irods after provisioning and copy it back to /var/lib/irods if running on a provisioned server.
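The copy logic can be sketched roughly as follows. This is an illustration rather than the actual entrypoint code: it assumes the iRODS home directory is /var/lib/irods as seen in the tracebacks, and it treats the presence of a saved copy in the persisted config directory as the sign of a provisioned server, which is my own heuristic. The function name sync_version_file is hypothetical.

```python
import shutil
from pathlib import Path


def sync_version_file(irods_home, config_dir, filename="version.json"):
    """Keep version.json available across image rebuilds.

    Provisioned server (saved copy exists): copy it back into irods_home.
    Fresh provisioning (only the live file exists): save it to config_dir.
    Returns "restored", "saved", or "missing".
    """
    live = Path(irods_home) / filename
    saved = Path(config_dir) / filename
    if saved.exists():
        shutil.copy2(str(saved), str(live))
        return "restored"
    if live.exists():
        shutil.copy2(str(live), str(saved))
        return "saved"
    return "missing"
```

In the container this would be called as sync_version_file("/var/lib/irods", "/etc/irods") once after setup and again on every start, with /etc/irods being on a persistent volume.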
iRODS 4.3 uses rsyslog for logging. Hence syslog logging needs to be set up. One example of how to do this is here.
Starting to look into this again to hopefully finalize this image soon and work towards getting it deployed with SODAR.
While I was on sick leave, iRODS v4.3.2 was released. First thing is to upgrade to that and see if previously working things are still OK.
As I kind of expected, upgrading the target iRODS version from 4.3.1 to 4.3.2 does not work on the fly. The server stays up for a short while and performs actions successfully, but then it dies. Same thing after restart.
I need to get logging up and try to see what could be causing this. 4.3.1 was working just fine for me locally.
This may have something to do with the python-irodsclient version in use; maybe a bad request breaks the server. But this is simply a hunch. Upgrading to a newer version has its issues as well, see bihealth/sodar-server#1955.
Back at it again. It seems installing iRODS itself has changed at some point.
irods-runtime must explicitly be installed now as a dependency for irods-server and irods-dev?
irods-rule-engine-plugin-python requires irods-runtime=4.3.3
After fixing build issues, iRODS startup fails when running the container:
irods-1 | Start iRODS
irods-1 | Test iinit
irods-1 | /irods_login.sh: line 3: iinit: command not found
irods-1 | iinit failed
Problem with irods-icommands setup I guess? Again, this didn't happen just a while ago with 4.3.1.
Update: Fixed by explicitly adding irods-icommands to the dependencies to be installed.
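A build-time sanity check could surface a missing package like this earlier than container startup. A small sketch using only the standard library. The helper name missing_commands is made up, and the command list is up to the caller; iinit is the one named in the log above.

```python
import shutil


def missing_commands(commands):
    """Return the commands that cannot be found on PATH."""
    return [name for name in commands if shutil.which(name) is None]
```

For example, missing_commands(["iinit"]) returning a non-empty list at the end of the image build would indicate that the icommands are not installed.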
Looking into the custom PAM module issue. /var/log/auth.log says the following:
Oct 1 09:25:59 irods /usr/local/lib/pam-sodar/pam_sodar.py[1030]: Traceback (most recent call last):
Oct 1 09:25:59 irods /usr/local/lib/pam-sodar/pam_sodar.py[1030]: File "/usr/local/lib/pam-sodar/pam_sodar.py", line 8, in <module>
Oct 1 09:25:59 irods /usr/local/lib/pam-sodar/pam_sodar.py[1030]: import requests
Oct 1 09:25:59 irods /usr/local/lib/pam-sodar/pam_sodar.py[1030]: ImportError: No module named requests
Oct 1 09:25:59 irods irodsPamAuthCheck[1030]: pam_unix(irods:auth): check pass; user unknown
Oct 1 09:25:59 irods irodsPamAuthCheck[1030]: pam_unix(irods:auth): authentication failure; logname= uid=1000 euid=0 tty= ruser= rhost=
Seems simple enough. However, adding pip3 install requests in the Dockerfile does not help. I guess pam_python runs its own (Python 2?) libraries or something? However, this did work in the 4.2 version of this image. Looking into it.
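One way to test the interpreter hunch would be to log, from inside the PAM script itself, which Python is running and whether it can see the module. The snippet below is a diagnostic sketch written to run under both Python 2 and 3. Whether pam_python really uses a separate interpreter is still only a guess, and import_report is a hypothetical helper.

```python
import sys


def import_report(module_names):
    """Map each module name to True/False depending on whether it imports here."""
    report = {}
    for name in module_names:
        try:
            __import__(name)
            report[name] = True
        except ImportError:
            report[name] = False
    return report


# Shows which interpreter is executing this code and what it can import.
print(sys.version)
print(sys.executable)
print(import_report(["requests"]))
```

If the reported version differs from the one pip3 installs into, that would explain why adding the package in the Dockerfile has no effect on the PAM module.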
Custom PAM auth issues fixed, albeit with an ugly hack. I will add a separate issue for making it prettier.
There is a lot of internal demand for this so got to look into it.
Spec
Tasks
build.sh
Resources