ckan / ckanext-harvest

Remote harvesting extension for CKAN

Error on creating new harvest source #541

Open gatiszeiris opened 9 months ago

gatiszeiris commented 9 months ago

Hi!

We have CKAN 2.10.1

Installed the harvest extension:
pip install -e git+https://github.com/ckan/ckanext-harvest.git#egg=ckanext-harvest
(pyenv) $ cd /usr/lib/ckan/default/src/ckanext-harvest/
(pyenv) $ pip install -r requirements.txt
(pyenv) $ ckan --config=/etc/ckan/default/ckan.ini db upgrade -p harvest

Enabled in ckan.ini:
ckan.plugins = csvuploadvalid activity scheming_datasets csvmetadata datapusher datastore dcat harvest ckan_harvester dcat_rdf_harvester csw_harvester datatables_view recline_view image_view text_view barchart piechart customstats

ckan.harvest.mq.type = redis
ckan.harvest.log_scope = 0

Redis is installed and available with no password.

Trying to add new source:

ckan --config=/etc/ckan/default/ckan.ini harvester source create geolatvija-test https://geo.gov.com/geonetwork/opendata/eng/csw csw zmni true zmni MANUAL

or via Web Interface

I always get the same list of errors:

sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (builtins.RecursionError) maximum recursion depth exceeded while calling a Python object
[SQL: INSERT INTO harvest_log (id, content, level, created) VALUES (%(id)s, %(content)s, %(level)s, %(created)s)]
[parameters: [{'content': 'Harvest source not found for dataset 8bbf3c49-0fac-4575-a0bb-d3e52b6996f9', 'level': 'ERROR'}]]
(Background on this error at: https://sqlalche.me/e/14/7s2a)
2023-11-14 16:29:53,874 ERROR [ckanext.harvest.plugin] Harvest source not found for dataset 8bbf3c49-0fac-4575-a0bb-d3e52b6996f9
2023-11-14 16:29:54,463 ERROR [ckan.lib.search] This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (builtins.RecursionError) maximum recursion depth exceeded while calling a Python object
[SQL: INSERT INTO harvest_log (id, content, level, created) VALUES (%(id)s, %(content)s, %(level)s, %(created)s)]
[parameters: [{'content': 'Harvest source not found for dataset 8bbf3c49-0fac-4575-a0bb-d3e52b6996f9', 'level': 'ERROR'}]]
(Background on this error at: https://sqlalche.me/e/14/7s2a)
Traceback (most recent call last):
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/search/__init__.py", line 143, in dispatch_by_operation
    index.insert_dict(entity)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/search/index.py", line 79, in insert_dict
    return self.update_dict(data)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/search/index.py", line 106, in update_dict
    self.index_package(pkg_dict, defer_commit)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/search/index.py", line 123, in index_package
    validated_pkg_dict, _errors = lib_plugins.plugin_validate(
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/plugins.py", line 331, in plugin_validate
    return validate(data_dict, schema, context)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/navl/dictization_functions.py", line 305, in validate
    flat_data, errors = _validate(flattened, schema, validators_context)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/navl/dictization_functions.py", line 356, in _validate
    convert(converter, key, converted_data, errors, context)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/navl/dictization_functions.py", line 262, in convert
    value = converter(params)
  File "/usr/lib/ckan/default/src/ckan/ckan/logic/validators.py", line 195, in package_id_exists
    result = session.query(model.Package).get(value)
  File "<string>", line 2, in get
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/util/deprecations.py", line 402, in warned
    return fn(*args, **kwargs)
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/query.py", line 947, in get
    return self._get_impl(ident, loading.load_on_pk_identity)
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/query.py", line 951, in _get_impl
    return self.session._get_impl(
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 2912, in _get_impl
    return db_load_fn(
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/loading.py", line 530, in load_on_pk_identity
    session.execute(
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 1711, in execute
    conn = self._connection_for_bind(bind)
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 1552, in _connection_for_bind
    return self._transaction._connection_for_bind(
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 721, in _connection_for_bind
    self._assert_active()
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 601, in _assert_active
    raise sa_exc.PendingRollbackError(
sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (builtins.RecursionError) maximum recursion depth exceeded while calling a Python object
[SQL: INSERT INTO harvest_log (id, content, level, created) VALUES (%(id)s, %(content)s, %(level)s, %(created)s)]
[parameters: [{'content': 'Harvest source not found for dataset 8bbf3c49-0fac-4575-a0bb-d3e52b6996f9', 'level': 'ERROR'}]]
(Background on this error at: https://sqlalche.me/e/14/7s2a)
2023-11-14 16:29:54,464 ERROR [ckan.model.modification] This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (builtins.RecursionError) maximum recursion depth exceeded while calling a Python object
[SQL: INSERT INTO harvest_log (id, content, level, created) VALUES (%(id)s, %(content)s, %(level)s, %(created)s)]
[parameters: [{'content': 'Harvest source not found for dataset 8bbf3c49-0fac-4575-a0bb-d3e52b6996f9', 'level': 'ERROR'}]]
(Background on this error at: https://sqlalche.me/e/14/7s2a)
Traceback (most recent call last):
  File "/usr/lib/ckan/default/src/ckan/ckan/model/modification.py", line 71, in notify
    observer.notify(entity, operation)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/search/__init__.py", line 165, in notify
    dispatch_by_operation(
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/search/__init__.py", line 143, in dispatch_by_operation
    index.insert_dict(entity)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/search/index.py", line 79, in insert_dict
    return self.update_dict(data)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/search/index.py", line 106, in update_dict
    self.index_package(pkg_dict, defer_commit)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/search/index.py", line 123, in index_package
    validated_pkg_dict, _errors = lib_plugins.plugin_validate(
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/plugins.py", line 331, in plugin_validate
    return validate(data_dict, schema, context)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/navl/dictization_functions.py", line 305, in validate
    flat_data, errors = _validate(flattened, schema, validators_context)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/navl/dictization_functions.py", line 356, in _validate
    convert(converter, key, converted_data, errors, context)
  File "/usr/lib/ckan/default/src/ckan/ckan/lib/navl/dictization_functions.py", line 262, in convert
    value = converter(params)
  File "/usr/lib/ckan/default/src/ckan/ckan/logic/validators.py", line 195, in package_id_exists
    result = session.query(model.Package).get(value)
  File "<string>", line 2, in get
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/util/deprecations.py", line 402, in warned
    return fn(*args, **kwargs)
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/query.py", line 947, in get
    return self._get_impl(ident, loading.load_on_pk_identity)
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/query.py", line 951, in _get_impl
    return self.session._get_impl(
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 2912, in _get_impl
    return db_load_fn(
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/loading.py", line 530, in load_on_pk_identity
    session.execute(
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 1711, in execute
    conn = self._connection_for_bind(bind)
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 1552, in _connection_for_bind
    return self._transaction._connection_for_bind(
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 721, in _connection_for_bind
    self._assert_active()
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 601, in _assert_active
    raise sa_exc.PendingRollbackError(
sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (builtins.RecursionError) maximum recursion depth exceeded while calling a Python object
[SQL: INSERT INTO harvest_log (id, content, level, created) VALUES (%(id)s, %(content)s, %(level)s, %(created)s)]
[parameters: [{'content': 'Harvest source not found for dataset 8bbf3c49-0fac-4575-a0bb-d3e52b6996f9', 'level': 'ERROR'}]]
(Background on this error at: https://sqlalche.me/e/14/7s2a)
2023-11-14 16:29:53,790 INFO [ckanext.harvest.plugin] Creating harvest source: {'extras': {'active': True}, 'frequency': 'MANUAL', 'name': 'geolatvija-test', 'owner_org': '537dea7f-c55d-4607-8838-260bea1c7f1e', 'source_type': 'csw', 'title': 'zmni', 'type': 'harvest', 'url': 'https://geolatvija-test.vraa.gov.lv/geonetwork/opendata/eng/csw', 'extras': [{'key': 'frequency', 'value': 'MANUAL'}, {'key': 'source_type', 'value': 'csw'}], 'creator_user_id': '452a87b4-897b-4b62-8d8a-eae98964ce55', 'id': '8bbf3c49-0fac-4575-a0bb-d3e52b6996f9'}
2023-11-14 16:29:54,465 INFO [ckanext.harvest.plugin] Harvest source created: 8bbf3c49-0fac-4575-a0bb-d3e52b6996f9
Traceback (most recent call last):
  File "/usr/lib/ckan/default/bin/ckan", line 8, in <module>
    sys.exit(ckan())
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/cli.py", line 51, in create
    result = utils.create_harvest_source(
  File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/utils.py", line 139, in create_harvest_source
    source = tk.get_action("harvest_source_create")(context, data_dict)
  File "/usr/lib/ckan/default/src/ckan/ckan/logic/__init__.py", line 551, in wrapped
    result = _action(context, data_dict, **kw)
  File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/logic/action/create.py", line 72, in harvest_source_create
    source = toolkit.get_action('package_create')(context, data_dict)
  File "/usr/lib/ckan/default/src/ckan/ckan/logic/__init__.py", line 551, in wrapped
    result = _action(context, data_dict, **kw)
  File "/usr/lib/ckan/default/src/ckan/ckan/logic/action/create.py", line 243, in package_create
    model.repo.commit()
  File "<string>", line 2, in commit
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 1451, in commit
    self._transaction.commit(_to_root=self.future)
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 827, in commit
    self._assert_active(prepared_ok=True)
  File "/usr/lib/ckan/default/lib64/python3.8/site-packages/sqlalchemy/orm/session.py", line 601, in _assert_active
    raise sa_exc.PendingRollbackError(
sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (builtins.RecursionError) maximum recursion depth exceeded while calling a Python object
[SQL: INSERT INTO harvest_log (id, content, level, created) VALUES (%(id)s, %(content)s, %(level)s, %(created)s)]
[parameters: [{'content': 'Harvest source not found for dataset 8bbf3c49-0fac-4575-a0bb-d3e52b6996f9', 'level': 'ERROR'}]]
(Background on this error at: https://sqlalche.me/e/14/7s2a)

frafra commented 9 months ago

Please, do not spam by mentioning random people. This is becoming a common practice on GitHub for some reason, and it is getting really annoying. Because of that, I refuse to give help this time. Respect should be reciprocal.

gatiszeiris commented 9 months ago

Sorry. I have taken the mentions out.

vanquan223 commented 3 months ago

After I set ckan.harvest.log_scope = 1, the harvest source was created successfully. I noticed that the log line log.info('Creating harvest source: %r', data_dict) in _create_harvest_source_object(context, data_dict) was triggering after_dataset_show() before the harvest source creation was complete, which led to your error. Hope this helps!
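A rough way to picture the loop described in this comment, as a toy model only: none of the names below come from CKAN or ckanext-harvest, and the assumption that saving a log row re-runs a "show" hook is just one reading of the traceback, not something confirmed in this thread. The sketch shows how a logger whose own write path logs again recurses until Python raises RecursionError, and how a simple re-entrancy flag would stop it.

```python
class HarvestLogTable:
    """Hypothetical stand-in for a database-backed log; not real CKAN code."""

    def __init__(self, known_sources):
        self.known_sources = set(known_sources)
        self.rows = []
        self._writing = False  # re-entrancy flag, used only by the guarded variant

    def after_show(self, dataset_id, log):
        # Mimics a "show" hook that logs an error while the source does not exist yet.
        if dataset_id not in self.known_sources:
            log("ERROR", f"Harvest source not found for dataset {dataset_id}", dataset_id)

    def log_unguarded(self, level, message, dataset_id):
        # Assumption: saving a log row ends up re-running the show hook,
        # which logs again, which saves again, and so on.
        self.after_show(dataset_id, self.log_unguarded)
        self.rows.append((level, message))

    def log_guarded(self, level, message, dataset_id):
        # Drop any log call made while another log call is still in progress.
        if self._writing:
            return
        self._writing = True
        try:
            self.after_show(dataset_id, self.log_guarded)
            self.rows.append((level, message))
        finally:
            self._writing = False


def try_create(log, dataset_id):
    """Return 'ok' if logging completes, else the name of the exception raised."""
    try:
        log("INFO", "Creating harvest source", dataset_id)
    except RecursionError as exc:
        return type(exc).__name__
    return "ok"
```

With an empty set of known sources, `try_create(table.log_unguarded, ...)` ends in a RecursionError and writes no rows, while the guarded variant writes the single INFO row and silently drops the nested ERROR message. That trade-off (losing the nested message) is a property of this toy guard, not a proposed patch for the extension.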

gatiszeiris commented 3 months ago

@vanquan223 Thanks for your input. I use the same workaround for this issue: ckan.harvest.log_scope = 1.
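For anyone applying the same workaround, here is a small sketch for checking which log_scope value an ini file actually carries. It assumes the setting lives in the `[app:main]` section, as in a standard CKAN ini file; the helper name `harvest_log_scope` and the sample text are made up for illustration. Interpolation is switched off so that `%(here)s`-style values in a real file are not expanded by the parser.

```python
import configparser

# Hypothetical sample; a real ckan.ini contains many more settings.
SAMPLE_INI = """
[app:main]
ckan.harvest.mq.type = redis
ckan.harvest.log_scope = 1
"""


def harvest_log_scope(ini_text):
    """Return ckan.harvest.log_scope as an int, or None if it is not set."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.getint("app:main", "ckan.harvest.log_scope", fallback=None)


if __name__ == "__main__":
    scope = harvest_log_scope(SAMPLE_INI)
    if scope == 0:
        print("log_scope is 0, the value reported in this issue")
    else:
        print(f"log_scope is {scope}")
```

To check a real file, read its contents with `open(path).read()` and pass the text in. The default strict parsing will reject a file with duplicate keys, so `strict=False` may be needed in that case.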