ckan / ckanext-archiver

Archive CKAN resources
MIT License
21 stars 46 forks source link

Preparing CKAN 2.10 support #83

Closed avdata99 closed 2 years ago

avdata99 commented 2 years ago

Related to CKAN#6787

This PR is getting a SOLR error only for CKAN 2.10 (all other CKAN version works fine)

If I drop the archiver key from package before index it (something like del(package['archiver']) here or commenting this line)

There is no problem with the archiver key for resources, only for the main package.

The error is

ERROR [ckanext.archiver.tasks] 
  Error occurred during archiving package: 
  Solr returned an error: 
    Solr responded with an error (HTTP 400): 
      [Reason: Error:[doc=4a7789fdbe0947073fe362f407d22316]  
       Unknown operation for the an atomic update: reason]

Where reason is a field inside the archiver key in the pkg_dict If I remove the reason key from the dict, I get the same error in other archiver field.

The problem is only for dataset, not for resources

Is there any consideration regarding the update to SOLR 8?

Solution

Stop using the archiver key at the root level on packages (changed here)

amercader commented 2 years ago

@avdata99 This Solr error means that we are trying to index a list or object field eg dataset['key'] = ['a', 'b'] or dataset['key'] = {'a', 'b'} instead of a string (eg a JSON dump like dataset["key"] = '["a", "b"]'). Solr 8 is less forgiving than Solr 6 so that's why we are getting this error now. It normally is a symptom of an underlying cause. Having found it on other extensions while upgrading to 2.10 this can be caused, for instance by:

I think removing the archiver key entirely is a bit drastic as this is exposed via the API and users might relying on it. I would keep it as is on after_dataset_search() but pop it from the dict on before_dataset_index() so it doesn't mess with the indexing

avdata99 commented 2 years ago

@amercader I drop the archiver key before indexing here Also, added a test for the API endpoint to ensure we have the archiver key here

The tests are still green