TaTo30 / vue-pdf

PDF component for Vue 3
https://tato30.github.io/vue-pdf/
MIT License

Can't find and highlight words hyphenated across lines #125

Closed nataliia-obraztsova closed 4 months ago

nataliia-obraztsova commented 4 months ago

Problem description

Words that are hyphenated across lines are not matched by searchQuery. The searchQuery function doesn't remove the trailing hyphen when textItem.hasEOL is set. The subsequent replacement of "\n" with a space then also inserts a space in the middle of the word.

Example 1: the phrase up to the hyphen is matched

const highlightText = ref('Dynamic languages such as JavaScript are more difficult to com')

Example 2: the phrase containing the full word, without a hyphen, is no longer matched

const highlightText = ref('Dynamic languages such as JavaScript are more difficult to compile')

Example 3: the phrase containing the word with a hyphen and a space is matched again

const highlightText = ref('Dynamic languages such as JavaScript are more difficult to com- pile')

Expected behavior

The string 'Dynamic languages such as JavaScript are more difficult to compile' is highlighted.

Possible solution

A possible solution would be to remove line-break hyphens in the middle of words when joining lines together, and to refrain from adding a space between that line and the next one.
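
The idea above could be sketched roughly as follows. This is only an illustration, not the library's actual searchQuery code: the TextItemLike interface and the joinTextItems function are hypothetical names, and the sketch assumes each text item exposes str and hasEOL in the way the description above implies.

```typescript
// Hypothetical minimal shape of a text item (assumption: only `str` and
// `hasEOL` are needed for joining lines).
interface TextItemLike {
  str: string
  hasEOL: boolean
}

// Joins text items into one searchable string.
// A line that ends in "<non-space char>-" is treated as a word hyphenated
// across lines: the hyphen is dropped and no space is added.
function joinTextItems(items: TextItemLike[]): string {
  let text = ''
  for (const item of items) {
    if (item.hasEOL && /[^\s-]-$/.test(item.str)) {
      // Hyphenated line break: remove the trailing hyphen, add no space
      text += item.str.slice(0, -1)
    } else if (item.hasEOL) {
      // Ordinary line break: separate the lines with a single space
      text += item.str + ' '
    } else {
      text += item.str
    }
  }
  return text.trimEnd()
}

// Usage with made-up items mirroring the example phrase
const joined = joinTextItems([
  { str: 'are more difficult to com-', hasEOL: true },
  { str: 'pile than statically typed ones', hasEOL: false },
])
console.log(joined.includes('difficult to compile'))
```

One caveat with this approach: a genuine compound that happens to break at its hyphen (for example "well-" at the end of a line followed by "known") would also lose its hyphen, so a search for 'well-known' would then need the same normalization applied to the query, or a fallback match against the unmodified text.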

Code to reproduce

<template>
    <VuePDF :pdf="pdf" text-layer :highlight-text="highlightText" :highlight-options="highlightOptions"/>
</template>

<script setup lang="ts">
import { VuePDF, usePDF } from '@tato30/vue-pdf'
import { ref } from 'vue'
import '@tato30/vue-pdf/style.css'

const pdfLink = 'https://raw.githubusercontent.com/mozilla/pdf.js/ba2edeae/web/compressed.tracemonkey-pldi-09.pdf'
const { pdf } = usePDF(pdfLink)

const highlightText = ref('Dynamic languages such as JavaScript are more difficult to com- pile')
const highlightOptions = ref({
  completeWords: false,
  ignoreCase: true,
})
</script>

<style scoped>
</style>

Configuration: