infotexture / dita-bootstrap.lunr

DITA Open Toolkit plug-in that extends HTML output from the DITA Bootstrap plug-in with a Lunr.js search box
https://infotexture.github.io/dita-bootstrap/search-box.html
Apache License 2.0

How to remove the Feature.REGEX_BACKTRACKING_LIMIT #11

Closed: tboenig closed this issue 1 year ago

tboenig commented 1 year ago

Hello,

I would very much like to use the lunr solution. But I get the following error message:

[INFO] Building project .github/dita-ot/html.xml
     [xslt] Regex backtracking limit exceeded processing ^[....]*([....]+?)[....]*$\Z. Simplify the regular expression, or set Feature.REGEX_BACKTRACKING_LIMIT to -1 to remove this limit.
     [xslt] Failed to transform document: Failed to transform document: Regex backtracking limit exceeded processing ^[....]*([....]+?)[....]*$\Z. Simplify the regular expression, or set Feature.REGEX_BACKTRACKING_LIMIT to -1 to remove this limit.

The search index is also not built; see: https://github.com/tboenig/t_guidelines/blob/gh-pages/html/js/lunr-client.js

Here is my project/repo: https://github.com/tboenig/t_guidelines

I have already tried several things. My assumptions so far:

  1. There is a problem of quantity (regex...).
  2. A parameter is not set correctly somewhere.

Thank you.

infotexture commented 1 year ago

Danke Matthias, glad to hear you found this and are interested in using it.

We haven't published an official release yet as we're still polishing a few rough edges, but user feedback is the best way to improve something like this, so thanks for taking the time to report this issue.

I haven't seen this error so far, so I'm wondering if there's something peculiar to your project that triggers this. 🤔

tboenig commented 1 year ago

Thank you very much for the feedback. I went looking for peculiarities in my project and may have found one:

<chapter href="trans/structur_gt.dita">
        <topicref href="structur_gt.ditamap" format="ditamap"/>
</chapter>

I include another ditamap in the ditamap. To the best of my understanding, this is DITA-conformant. I have now removed this reference. The warning "...Feature.REGEX_BACKTRACKING_LIMIT to -1..." is no longer reported, but my chapter is missing. 🙁

The search index is still not built.

With <publication transtype="html5">, processing a ditamap that references another ditamap does not abort either.

Thanks again.

jason-fox commented 1 year ago

It is always good to get another test case document set. The problematic regex in this case was the following line:

       <xsl:element name="text">
-        <xsl:value-of  select="replace(replace($TEXT , '^\s*(.+?)\s*$', '$1'), '^ .*$', '')" />
+        <xsl:value-of select="normalize-space($TEXT)"/>
       </xsl:element>

There were a couple of extra issues regarding stemming and stop words, but I've included a fix for those as well.
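
For readers less familiar with the two XPath calls in that diff, here is a rough Python sketch of what each side does. It is an approximation under stated assumptions: it mirrors only the inner replace() call, it uses Python's re module rather than the regex engine used by the XSLT processor, and so it does not reproduce the backtracking limit error itself. The function names are made up for this illustration. The lazy group (.+?) followed by \s*$ makes the engine re-test the tail of the string at each position, which can add up on long text, whereas normalize-space needs no regex at all. Note that normalize-space also collapses runs of whitespace inside the text, which the regex version does not.

```python
import re

# Pattern taken from the removed line: trim surrounding whitespace
# by capturing the middle of the string with a lazy group.
TRIM_PATTERN = re.compile(r"^\s*(.+?)\s*$")


def trim_with_regex(text: str) -> str:
    """Approximate the removed inner replace() call (hypothetical helper)."""
    return TRIM_PATTERN.sub(r"\1", text)


def normalize_space(text: str) -> str:
    """Approximate XPath normalize-space() (hypothetical helper).

    Collapses runs of space, tab, CR and LF into one space,
    then strips the leading and trailing space.
    """
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


if __name__ == "__main__":
    sample = "  Hallo   Welt  "
    print(repr(trim_with_regex(sample)))   # inner whitespace is kept
    print(repr(normalize_space(sample)))   # inner whitespace is collapsed
```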

jason-fox commented 1 year ago

If you want to run the patch before it lands on develop, amend your action as follows:

        plugins: |
          https://github.com/jason-fox/fox.jason.extend.css/archive/master.zip
          https://github.com/infotexture/dita-bootstrap/archive/master.zip
+         https://github.com/jason-fox/dita-bootstrap.lunr/archive/feature/offline.zip
          https://github.com/jason-fox/fox.jason.prismjs/archive/master.zip
          fox.jason.favicon

tboenig commented 1 year ago

Well, I have changed the action workflow.

I received the following error message:

Error: Plug-in net.infotexture.dita-bootstrap.lunr uses an undefined extension point bootstrap.process.pre

Thanks also for your help.

jason-fox commented 1 year ago

You'll need https://github.com/infotexture/dita-bootstrap/archive/develop.zip then.

tboenig commented 1 year ago

Phew, the action workflow is running again.

jason-fox commented 1 year ago

What do you get if you run locally:

./dita -f html5-bootstrap -i ../../t_guidelines/de/ocrd_ocrd.ditamap --default.language=de --args.hdr=../plugins/dita-bootstrap.lunr/includes/bs-navbar-lunr.hdr.xml

jason-fox commented 1 year ago

If I manually copy over preview.json and search_index.json, it works properly (https://jason-fox.github.io/t_guidelines/trans/). This would seem to indicate a fault in the GitHub Action.
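
A quick way to tell whether a given build produced the two files is to look for them in the output folder. The sketch below is only an illustration: the helper name is made up, the output folder is whatever you pass in, and it assumes both files are meant to contain non-empty JSON, which the thread does not state explicitly.

```python
import json
from pathlib import Path

# File names mentioned in this thread.
INDEX_FILES = ("preview.json", "search_index.json")


def missing_index_files(output_dir: str) -> list:
    """Return the index file names that are absent, unreadable or empty.

    Hypothetical helper: searches output_dir recursively, because the
    exact location of the files inside the output is not assumed here.
    """
    root = Path(output_dir)
    missing = []
    for name in INDEX_FILES:
        usable = False
        for path in root.rglob(name):
            try:
                data = json.loads(path.read_text(encoding="utf-8"))
            except (OSError, ValueError):
                continue  # unreadable or not valid JSON
            if data:  # assumption: a usable file is non-empty JSON
                usable = True
                break
        if not usable:
            missing.append(name)
    return missing
```

Calling missing_index_files on the build output folder returns an empty list when both files look usable, and otherwise the names that still need attention.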

jason-fox commented 1 year ago

I think that the underlying issue with the index may be reduced to a combination of running the dita-ot/dita-ot-action and creating a document that is not in English. For German to work, I think that the de_DE locale needs to be generated within the action:

locale-gen de_DE.UTF-8
export LANG="de_DE.UTF-8"
export LANGUAGE="de_DE:de"
export LC_ALL="de_DE.UTF-8"

dita -f html5-bootstrap -i ...etc

The following GitHub Action is building your index successfully in German: https://github.com/jason-fox/t_guidelines/blob/testing/.github/workflows/dita.yml. It is using the version found in PR #10.
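
To check whether a locale such as de_DE.UTF-8 actually exists in the environment where the build runs, something like the following Python sketch could be used. It only asks the C library whether the locale name is accepted. It does not prove that a missing locale is the cause of the empty index, which remains a guess in this thread. The function name is made up for this illustration.

```python
import locale


def locale_available(name: str) -> bool:
    """Return True if the C library accepts `name` for LC_CTYPE.

    Hypothetical helper; the previous setting is restored afterwards.
    """
    previous = locale.setlocale(locale.LC_CTYPE)  # query only, no change
    try:
        locale.setlocale(locale.LC_CTYPE, name)
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_CTYPE, previous)
    return True


if __name__ == "__main__":
    # The result depends on which locales were generated on this machine.
    print("de_DE.UTF-8 available:", locale_available("de_DE.UTF-8"))
```

If this reports False inside the action's container, generating the locale before running dita, as described above, would be the next thing to try.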