Open jbrown-xentity opened 3 years ago
The Failing test PR (#434 ) is a failing test related to this issue. I would expect this test to pass.
Still not sure if the right approach is to edit the after_show
function, or to let before_view
actually run. Not sure why before_run
is not running on CKAN >2.1, but probably some missing context...
I made updates to the test so that it would pass for our suggested fix; remove the dataset after_show functionality of removal and adding data, and let the "for_view" flag call the correct function and insert the data if it really is only for viewing. Would love some feedback on this; still think there is missing context on why the before_view function has been ignored since CKAN2.2...
Sorry @jbrown-xentity, this is probably way more confusing that it should be. Let's step back a second. I believe that the goal here is to have harvest_object_id
, harvest_source_id
, and harvest_source_title
indexed (ie in Solr). This will include the fields wherever the dataset dict is requested with package_show
(API, templates etc).
To achieve this we would only need to add them to the dict in the before_index()
hook so they get indexed by Solr, and we could remove this logic from after_show
, before_show
and before_view
.
Would you mind trying this simpler approach and see if it works as expected?
The ckanext-harvest extension has an
after_show
function that first removes and then inserts the harvest valuesharvest_object_id
,harvest_source_id
, andharvest_source_title
. This is also implemented in thebefore_view
function (only adding the harvest info). In implementation, this information never comes through because of this if statement. Thiscontext.get('validate')
always returns None, and thisif
fails.I can find no documentation on this
validate
key insidecontext
, but also cannot find any usage/implementation of it in CKAN other than in this code base. Is it possible this has been deprecated? Should this be changed to reflect similar logic to thebefore_show
?I know that CKAN suggests to call package_show before calling package_update, but the base harvester does not implement this, and many of data.gov's harvester extensions do not either...
I have tested that changing the if statement to be
if harvest_object and not context.get('validate', False):
inserts the information correctly. Would this be the right fix? Should I submit a PR?