TAMULib / DSpace

(Official) The DSpace digital asset management system that powers your Institutional Repository
https://wiki.duraspace.org/display/DSDOC4x/Introduction
BSD 3-Clause "New" or "Revised" License

[DSpace 7 Upgrade] Reimplement indexing of only text bitstreams that are not restricted #204

Open wwelling opened 8 months ago

wwelling commented 8 months ago

This is required because full-text search results otherwise reveal partial text content of restricted items.

Code is at https://github.com/TAMULib/DSpace/blob/tamu-dspace-6.3/dspace/modules/additions/src/main/java/org/dspace/discovery/FullTextContentStreams.java#L52.

  1. Copy the latest Java file from tamu-dspace-7_x dspace-api to the modules additions directory at the matching path dspace/modules/additions/src/main/java/org/dspace/discovery/FullTextContentStreams.java.
  2. Apply the TAMU customizations to the Java file.
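
A minimal sketch of the filtering idea behind the customization, for reference while porting. It is not the actual FullTextContentStreams code. `TextBitstream`, `isUnrestricted`, and `indexableNames` are hypothetical stand-ins, and treating "readable by the `Anonymous` group" as the definition of unrestricted is an assumption:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class UnrestrictedTextFilter {

    // Hypothetical stand-in for an extracted-text bitstream; NOT the DSpace Bitstream class.
    static final class TextBitstream {
        final String name;
        final Set<String> readGroups; // names of groups that hold a READ policy

        TextBitstream(String name, String... readGroups) {
            this.name = name;
            this.readGroups = new HashSet<>(Arrays.asList(readGroups));
        }
    }

    // Assumption: the group that represents anonymous (public) users.
    static final String ANONYMOUS = "Anonymous";

    // A bitstream counts as unrestricted only if anonymous users may read it.
    static boolean isUnrestricted(TextBitstream bitstream) {
        return bitstream.readGroups.contains(ANONYMOUS);
    }

    // Keep only the bitstreams whose text is safe to hand to the full-text index.
    static List<String> indexableNames(List<TextBitstream> bitstreams) {
        return bitstreams.stream()
                .filter(UnrestrictedTextFilter::isUnrestricted)
                .map(b -> b.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<TextBitstream> candidates = Arrays.asList(
                new TextBitstream("thesis.pdf.txt", "Anonymous"),
                new TextBitstream("embargoed.pdf.txt", "Administrator"),
                new TextBitstream("no-policy.pdf.txt"));
        System.out.println(indexableNames(candidates)); // prints [thesis.pdf.txt]
    }
}
```

In this sketch a bitstream with no READ policy at all is skipped as well, so restricted text is excluded by default.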

wwelling commented 7 months ago

Blocked by the PDF restriction for text extraction. The ImageMagick XML configuration requires an update to bypass restricted PDF files.