dulibrarytech / digitaldu-frontend

Digital Collections DU front end
Apache License 2.0

in page content search #81

Open kimpham54 opened 5 years ago

kimpham54 commented 5 years ago

In addition to a sitewide search, have an in-page/in-viewer search on content:
audio/video - search transcripts
image - OCR/HTR and search transcripts
pdf - search PDF

jrynhart commented 2 years ago

Many pdf files contain no text because they are scanned images, so a textual search cannot be implemented for them.
Transcript text data must be present to enable a search of the pdf objects.
Transcript data must also be present to enable a search of image objects' text.
A/V: the Kaltura transcript viewer contains a search box. This is currently enabled/available.

Required to implement "in page search" for images and pdf: transcript text (indexed)
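
For illustration only, here is a rough sketch of what an in-page search could look like once transcript text exists for an object. It assumes the transcript is available as one string per page; the `Hit` class, the `search_transcript` function, and the sample text are all hypothetical and not part of this repository. A real implementation would more likely query the indexed transcript field than scan strings in memory.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Hit:
    page: int      # 1-based page number within the object
    start: int     # character offset of the match in that page's transcript
    snippet: str   # short context around the match, for display in the viewer


def search_transcript(pages: list[str], query: str, context: int = 30) -> list[Hit]:
    """Case-insensitive literal search over per-page transcript text (hypothetical helper)."""
    query = query.strip()
    if not query:
        return []
    # Escape the query so user input is matched literally, not as a regex.
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    hits: list[Hit] = []
    for page_number, text in enumerate(pages, start=1):
        for match in pattern.finditer(text):
            lo = max(0, match.start() - context)
            hi = min(len(text), match.end() + context)
            hits.append(Hit(page_number, match.start(), text[lo:hi]))
    return hits


# Made-up transcript text for a two-page scanned pdf.
pages = [
    "Minutes of the Board of Trustees.",
    "The board approved the library budget.",
]
hits = search_transcript(pages, "board")
for hit in hits:
    print(hit.page, hit.start, hit.snippet)
```

The page number and offset in each hit are what a viewer would need to jump to and highlight a match. Objects with no transcript simply produce no hits, which matches the constraint above: without transcript text there is nothing to search.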