cozydev-pink / protosearch

prototype search library in pure scala
https://cozydev-pink.github.io/protosearch/
Apache License 2.0

Upgrade to laika 1.0.0 #140

Status: Closed (valencik closed 9 months ago)

valencik commented 10 months ago

This PR upgrades to laika 1.0.0.

The two important pieces are:

- Plaintext runs on every document.
- IndexFormat renders a tree of documents that have been formatted with Plaintext.

This follows the approach outlined in https://github.com/cozydev-pink/protosearch/issues/102#issuecomment-1793738954
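
As a rough illustration of how the two pieces relate, here is a minimal sketch in plain Scala. It does not use the Laika or protosearch APIs: `Doc`, `Block`, `Rendered`, `plaintext`, `renderAll`, and `buildIndex` are all hypothetical stand-ins. It also assumes the second phase produces a toy inverted index, which is the eventual goal rather than what this PR implements.

```scala
// Hypothetical stand-ins only; none of these are Laika's or protosearch's real types.
object TwoPhaseSketch {

  sealed trait Block
  final case class Heading(text: String) extends Block
  final case class Paragraph(text: String) extends Block

  // A parsed document: a path plus its block-level content.
  final case class Doc(path: String, blocks: List[Block])

  // Phase 1: a per-document formatter, playing the role of Plaintext.
  def plaintext(doc: Doc): String =
    doc.blocks.map {
      case Heading(t)   => t
      case Paragraph(t) => t
    }.mkString("\n")

  // The interim result: a document that has already been rendered to plain text.
  final case class Rendered(path: String, text: String)

  // Run the per-document formatter over every document.
  def renderAll(docs: List[Doc]): List[Rendered] =
    docs.map(d => Rendered(d.path, plaintext(d)))

  // Phase 2: a post-processor over all rendered documents, playing the role of IndexFormat.
  // Assumption: it builds a toy inverted index from lowercase term to document paths.
  def buildIndex(rendered: List[Rendered]): Map[String, Set[String]] =
    rendered
      .flatMap { r =>
        r.text.toLowerCase.split("\\W+").toList.filter(_.nonEmpty).map(term => term -> r.path)
      }
      .groupBy(_._1)
      .map { case (term, pairs) => term -> pairs.map(_._2).toSet }
}
```

The point of the split is that the second phase never sees the document AST, only text that the first phase has already produced for each document.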

The DocsDirectory file is upgraded, but this really is not the intended way to use things going forward.

jenshalm commented 10 months ago

Just quickly want to confirm that the post processor skeleton looks okay in my eyes.

valencik commented 9 months ago

To test this, I published the laikaIO project as an sbt plugin (adding sbtPlugin := true to its settings and running laikaIOJVM/publishLocal on Scala 2.12.18) and then created a custom task in the docs project like this:

Some preamble in the build.sbt:

    import cats.effect.unsafe.implicits.global
    import pink.cozydev.protosearch.analysis.DocsDirectory
    import laika.io.model.FilePath

    lazy val indexTask = taskKey[String]("Generates the index output")

And then the task:

    indexTask := {
      val userConfig = laikaConfig.value
      // Reuse the laikaAST target directory as the output location
      val targetDir = (laikaAST / target).value
      // Parse the configured Laika inputs into a document tree
      val parser = laika.sbt.Settings.parser.value
      val tree = parser.use(_.fromInput(laikaInputs.value.delegate).parse).unsafeRunSync()
      // Render every document in the tree with the Plaintext format
      DocsDirectory.plaintextRenderer.use(
        _.from(tree)
         .toDirectory(FilePath.fromJavaFile(targetDir))(userConfig.encoding)
         .render
      ).unsafeRunSync()
      println(s"rendered to ${targetDir}")
      root.toString()
    }

This rendered the Plaintext formatted docs to /home/andrew/src/github.com/cozydev/protosearch/site/target/docs/ast and they looked as expected (which is actually a rather silly format currently, but improving that is not the goal here).

So I am happy to say that I think we are on a good path here. Up next, in the immediate future, I'd like to turn this task into a quick sbt plugin, with the goal of making it much easier to use in another project. Specifically, I'd like to be able to modify the http4s build with just a new plugin dependency and maybe one or two lines, and then have it generate plaintext files like above.
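
A minimal sketch of what wrapping the task in an sbt AutoPlugin might look like. The package name, the plugin name `ProtosearchIndexPlugin`, and the key `protosearchRenderPlaintext` are hypothetical. The task body reuses the calls from the custom task above unchanged, except that it returns the target directory; it assumes those calls are accessible from plugin code in the same way they are from build.sbt.

```scala
// Sketch only: the package, plugin name, and key name are hypothetical.
package protosearchsketch

import cats.effect.unsafe.implicits.global
import laika.io.model.FilePath
import laika.sbt.LaikaPlugin
import laika.sbt.LaikaPlugin.autoImport._
import pink.cozydev.protosearch.analysis.DocsDirectory
import sbt._
import sbt.Keys._

object ProtosearchIndexPlugin extends AutoPlugin {

  // Only meaningful on projects that already use the Laika sbt plugin.
  override def requires: Plugins = LaikaPlugin

  // Not auto-triggered: a build opts in with enablePlugins(ProtosearchIndexPlugin).
  override def trigger: PluginTrigger = noTrigger

  object autoImport {
    val protosearchRenderPlaintext =
      taskKey[File]("Renders the Laika inputs with the Plaintext format")
  }
  import autoImport._

  override def projectSettings: Seq[Setting[_]] = Seq(
    protosearchRenderPlaintext := {
      val userConfig = laikaConfig.value
      val targetDir = (laikaAST / target).value
      // Same parse-then-render steps as the custom task in build.sbt.
      val parser = laika.sbt.Settings.parser.value
      val tree = parser.use(_.fromInput(laikaInputs.value.delegate).parse).unsafeRunSync()
      DocsDirectory.plaintextRenderer
        .use(
          _.from(tree)
            .toDirectory(FilePath.fromJavaFile(targetDir))(userConfig.encoding)
            .render
        )
        .unsafeRunSync()
      // Return the output directory so other tasks can depend on it.
      targetDir
    }
  )
}
```

With something like this, a consuming build would need the plugin dependency plus `enablePlugins(ProtosearchIndexPlugin)`, which is roughly the "one or two lines" aimed for here.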

And then I can get back to actually making the plaintext formatting reasonable, and the IndexFormat actually building an index.

valencik commented 9 months ago

Here's the commit of the hacked build testing this out: https://github.com/cozydev-pink/protosearch/commit/c84d1410f9fb70d86f7c2a9e787eebcc39ac5003

valencik commented 9 months ago

It turns out I didn't need to publish laikaIO as an sbt plugin; we can use it as a regular dependency, as in https://github.com/cozydev-pink/protosearch/compare/laika-v1...laika-v1-snapshot?expand=1

Thank you @armanbilge for suggesting this :)

valencik commented 9 months ago

Merging this and continuing the work in https://github.com/cozydev-pink/protosearch/pull/148