ckan / ckanext-harvest

Remote harvesting extension for CKAN
130 stars 203 forks source link

[6787] CKAN 2.10 support (Part 2) #496

Closed TomeCirun closed 2 years ago

TomeCirun commented 2 years ago

#6787

TomeCirun commented 2 years ago

Hey @amercader, at this moment only 75 tests are failing :))) we need to see actually how many will fail after you merge #461 as I can see most of them are related to TypeError: str cannot be used as validator because it is not a user-defined function. I will try to fix the ones that will remain and make this PR ready to merge.

Thanks

TomeCirun commented 2 years ago

hey @amercader , something weird happens, i got: image when submitting a new harvest source.

What has been changed recently ?

amercader commented 2 years ago

@TomeCirun this means that we are trying to index a list or object field eg dataset['key'] = ['a', 'b'] or dataset['key'] = {'a', 'b'} instead of a string (eg a JSON dump like dataset["key"] = '["a", "b"]'). I got something similar in ckanext-scheming and it was caused by changes in how the validators transformed the data. Let me have a look and I'll come back to you

TomeCirun commented 2 years ago

@amercader how weird, it works with TomeCirun/ckan + TomeCirun/ckanext-harvest forks

image

i made another PR but the tests failed again... i think it has to do with this solr image

image
amercader commented 2 years ago

@TomeCirun yes, this will fail against Solr 8 (ckan/ckan-solr:2.10 image) and pass against Solr 6 (ckan/ckan-solr:2.9)

I pinned it down to the dataset_dict['status'] field, which is a virtual field with add overriding an action so it skips validation. Try adding this change to the before_dataset_index() method:

diff --git a/ckanext/harvest/plugin/__init__.py b/ckanext/harvest/plugin/__init__.py
index 673b301..0194b6f 100644
--- a/ckanext/harvest/plugin/__init__.py
+++ b/ckanext/harvest/plugin/__init__.py
@@ -169,6 +169,12 @@ class Harvest(MixinPlugin, p.SingletonPlugin, DefaultDatasetForm, DefaultTransla
             pkg_dict["data_dict"] = json.dumps(data_dict)
             pkg_dict["validated_data_dict"] = json.dumps(validated_data_dict)

+        if isinstance(pkg_dict.get('status'), dict):
+            try:
+                pkg_dict['status'] = json.dumps(pkg_dict['status'])
+            except ValueError:
+                pkg_dict.pop('status', None)
+
         return pkg_dict

     def after_dataset_show(self, context, data_dict):
TomeCirun commented 2 years ago

Yep, this is far better... ✋ (high five) 😃

Thanks, @amercader, I will try to fix the remaining ones by the end of the day (only 3 left)

btw, can you tell me how to run the tests locally?

amercader commented 2 years ago

btw, can you tell me how to run the tests locally?

Sure you just need to run pytest in your virtualenv. Make sure to have the dev-requirements.txt from ckan core and ckanext-harvest installed and then you can run:

pytest --ckan-ini=test.ini -v -s --disable-warnings ckanext/harvest/tests/
avdata99 commented 2 years ago

@TomeCirun I hope you can take a look here https://github.com/salsadigitalauorg/ckanext-harvest/pull/1