scientist-softserv / adventist_knapsack

Apache License 2.0

After split, Show pages' items list includes the filesets of child works #817

Open ShanaLMoore opened 4 days ago

ShanaLMoore commented 4 days ago

Summary

NOTE: This issue is caused by, or at least related to, a Hyrax bug (see the related issue under Notes).

Per Katharine,

I tested with my own 6 page PDF.

https://adl.s2.adventistdigitallibrary.org/concern/generic_works/538a6bcf-e2b7-429e-b15a-f19530a9a7d9?locale=en

BUGS: The splitting worked, but twice the expected number of child works shows in the items list. On production we usually see only the child works with page numbers attached. On staging, however, there is also a string of JPEGs in the items list.

All child works are showing in the UV, so the work is duplicated in the UV.

Acceptance Criteria

Screenshots or Video

STAGING

Image

PRODUCTION

Image

Testing Instructions

To be filled out by dev

Notes

related issue: https://github.com/samvera/hyrax/issues/6767

ShanaLMoore commented 3 days ago

Thoughts on an override to the indexers? Will it have negative consequences?

# hyku_indexing.rb and maybe pcdm indexer?
member_ids = object.member_ids.map(&:to_s)
solr_doc['member_ids_ssim'] = filter_member_ids(member_ids)
# Select from the unfiltered ids; the filtered list no longer contains any FileSets.
solr_doc['file_set_ids_ssim'] = filter_fileset_ids(member_ids)

private

# Returns the ids of members that are not FileSets.
def filter_member_ids(member_ids)
  member_ids.reject { |id| file_set?(id) }
end

# Returns the ids of members that are FileSets.
def filter_fileset_ids(member_ids)
  member_ids.select { |id| file_set?(id) }
end

def file_set?(id)
  Hyrax.query_service.find_by(id: Valkyrie::ID.new(id)).is_a?(Hyrax::FileSet)
end
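
Both filters above look up every member, so each id is fetched twice per index run. A minimal sketch of splitting the ids in one pass with `partition` follows. It is not the Hyrax or Hyku implementation: `FakeFileSet`, `FakeWork`, `MEMBERS`, and the `lookup:` callable are hypothetical stand-ins (for `Hyrax::FileSet`, a child work, and `Hyrax.query_service.find_by`) so that it runs outside an application.

```ruby
# Hypothetical stand-ins so the sketch runs outside a Hyrax app.
FakeFileSet = Struct.new(:id)
FakeWork = Struct.new(:id)

# Splits member ids into [file_set_ids, other_ids] with one lookup per id.
# `lookup` is an assumed callable that returns the member object for an id.
def partition_member_ids(member_ids, lookup:, file_set_class: FakeFileSet)
  member_ids.partition { |id| lookup.call(id).is_a?(file_set_class) }
end

# Hypothetical members of a parent work: two child works and one file set.
MEMBERS = {
  'work-1' => FakeWork.new('work-1'),
  'fs-1' => FakeFileSet.new('fs-1'),
  'work-2' => FakeWork.new('work-2')
}.freeze

file_set_ids, work_ids = partition_member_ids(
  MEMBERS.keys,
  lookup: ->(id) { MEMBERS.fetch(id) }
)
```

In an indexer, the two resulting arrays would presumably feed `file_set_ids_ssim` and `member_ids_ssim` respectively, with the real query service passed as the lookup.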

update 1: this didn't fully solve the problem. I still see the child filesets in the items section, but I don't see them in the UV. I think it's because #list_of_item_ids_to_display also needs to filter them out for the view.

![Image](https://github.com/user-attachments/assets/4e91fab8-1b85-48fd-bcbf-2f31ae80a9f7)

update 2:

# hyrax work_show_presenter.rb, ugly... but it works?

# Rejects Hyrax::FileSets unless the original file is a PDF.

def authorized_item_ids(filter_unreadable: Flipflop.hide_private_items?)
  @member_item_list_ids ||= begin
    ids = filter_unreadable ? ordered_ids.select { |id| current_ability.can?(:read, id) } : ordered_ids
    ids.reject do |id|
      member = Hyrax.query_service.find_by(id: id)
      # Drop FileSets whose original file is missing or is not a PDF; keep everything else.
      member.is_a?(Hyrax::FileSet) && !member.original_file&.pdf?
    end
  end
end

=> 🎉

Image
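
The rule from update 2 can be exercised on its own, outside the presenter. The sketch below assumes hypothetical stand-ins (`StubFile`, `StubFileSet`, `StubWork`, and the helper name `hidden_from_items?`) in place of the real Hyrax classes. It only illustrates which members the filter keeps: child works and PDF file sets stay, while other file sets, including ones with no original file, are dropped.

```ruby
# Hypothetical stand-ins for the file metadata, Hyrax::FileSet, and a child work.
StubFile = Struct.new(:mime_type) do
  def pdf?
    mime_type == 'application/pdf'
  end
end
StubFileSet = Struct.new(:id, :original_file)
StubWork = Struct.new(:id)

# True when a member should be left out of the items list:
# it is a file set and its original file is missing or not a PDF.
def hidden_from_items?(member, file_set_class: StubFileSet)
  member.is_a?(file_set_class) && !member.original_file&.pdf?
end

members = [
  StubFileSet.new('pdf-fs', StubFile.new('application/pdf')),
  StubFileSet.new('jpeg-fs', StubFile.new('image/jpeg')),
  StubFileSet.new('empty-fs', nil),
  StubWork.new('page-1-work')
]

visible_ids = members.reject { |member| hidden_from_items?(member) }.map(&:id)
```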

ShanaLMoore commented 20 hours ago

Per LaRita, koppie and dassie work as expected.

laritakr commented 19 hours ago

Appears to be due to IiifPrint's indexing that includes descendants for OCR purposes.