ckan / ckanext-harvest

Remote harvesting extension for CKAN
130 stars 203 forks source link

Fixes unicode error in Python 2 #508

Closed seitenbau-govdata closed 1 year ago

seitenbau-govdata commented 2 years ago

Fixes https://github.com/ckan/ckanext-harvest/issues/502

seitenbau-govdata commented 2 years ago

@amercader I haven't found any workable solution to create a test for /dataset?tags=abcčd, so maybe it would be ok without a test?

BTW: The tests for CKAN 2.10 are broken on the master as well. So I think it has nothing to do with the changes in the pull request.

amercader commented 2 years ago

@seitenbau-govdata this should do the trick:

diff --git a/ckanext/harvest/tests/test_blueprint.py b/ckanext/harvest/tests/test_blueprint.py
index dbbca7d..a6bef99 100644
--- a/ckanext/harvest/tests/test_blueprint.py
+++ b/ckanext/harvest/tests/test_blueprint.py
@@ -1,3 +1,4 @@
+# coding: utf-8
 import six
 import pytest

@@ -126,3 +127,15 @@ class TestBlueprint():
         response = app.get(url, extra_environ=env)

         _assert_in_body(job['id'], response)
+
+    def test_search_dataset_special_characters(self, app):
+
+        value = u"abcčd"
+
+        dataset = factories.Dataset(tags=[{"name": value}])
+
+        url = u"/dataset?tags=" + value
+
+        response = app.get(url)
+
+        _assert_in_body(dataset["title"], response)

Note I didn't use url_for because there was another bug in CKAN core with unicode handling but went too down the rabbit hole.

The 2.10 failures should be fixed now

seitenbau-govdata commented 2 years ago

@amercader I have tried the quite similar test.

def test_package_search_interface_before_search_page_is_rendered(self, app):

    dataset1 = factories.Dataset(title=u'Fïrst dataset')

    response = app.get(u'/dataset?title=Fïrst')

    _assert_in_body(dataset1['title'], response)

But it hasn't worked with CKAN < 2.9 because of a unicode error in app.get(u'/dataset?title=Fïrst').

And I was getting into the rabbit whole with using url_for. 😃 I get it to work with a custom url_for implementation (e.g. as in ckanext-dcat) that switches to the old style if it is CKAN < 2.9. But then the test didn't fail without the fix as I would expect. Probably because there is some other unicode encoding handling in the url_for core code.

seitenbau-govdata commented 2 years ago

@amercader This is the unicode error in the test I got for CKAN < 2.9. It looks like the test app cannot handle the unicode value for the parameter when calling app.get(url).

amercader commented 2 years ago

Let's remove the test then and not waste more time, the change is straightforward enough

seitenbau-govdata commented 2 years ago

@amercader I have removed the test now.