Mc-Zen / tidy

A documentation generator for Typst in Typst.
https://typst.app/universe/package/tidy
MIT License

Parsing fails for specific order of functions #10

Closed. jneug closed this issue 10 months ago.

jneug commented 10 months ago

I ran into a case where tidy fails with an error while parsing a specific module.

The module is called qrutil.typ and looks like this:


// aliases
#let mod = calc.rem
#let mod2(x) = calc.rem(x, 2)
#let mod3(x) = calc.rem(x, 3)
#let mod255(x) = calc.rem(x, 255)
#let mod256(x) = calc.rem(x, 256)
#let mod285(x) = calc.rem(x, 285)

/// >>> qrutil.check-version(1)
/// >>> qrutil.check-version(30)
/// >>> qrutil.check-version(40)
/// >>> not qrutil.check-version(0)
/// >>> not qrutil.check-version(41)
#let check-version(version) = version >= 1 and version <= 40

/// >>> qrutil.check-ecl("l")
/// >>> qrutil.check-ecl("h")
/// >>> qrutil.check-ecl("m")
/// >>> qrutil.check-ecl("q")
/// >>> not qrutil.check-ecl("a")
/// >>> not qrutil.check-ecl("Q")
#let check-ecl(ecl) = ecl in ("l", "m", "q", "h")

/// >>> qrutil.size(1) == 21
/// >>> qrutil.size(2) == 25
/// >>> qrutil.size(33) == 149
/// >>> qrutil.size(40) == 177
#let size(version) = { return 21 + (version - 1)*4 }

// =================================
//  Encoding
// =================================
/// >>> qrutil.best-mode("0123") == 0
/// >>> qrutil.best-mode("0000") == 0
/// >>> qrutil.best-mode("1") == 0
/// >>> qrutil.best-mode("A") == 1
/// >>> qrutil.best-mode("ABCD") == 1
/// >>> qrutil.best-mode("ABCD:XYZ$") == 1
/// >>> qrutil.best-mode("a") == 2
/// >>> qrutil.best-mode("abcxyz") == 2
/// >>> qrutil.best-mode("ABCD:XYZ!") == 2
/// >>> qrutil.best-mode("@€") == none
#let best-mode(data) = {
  let nums = regex(`^\d*$`.text)
  let alphnum = regex(`^[\dA-Z $%*+\-./:]*$`.text)
  let byte = regex(`^[\x00-\xff]*$`.text)
  if data.match(nums) != none {
    return 0
  }
  if data.match(alphnum) != none {
    return 1
  }
  if data.match(byte) != none {
    return 2
  }
  return none
}

/// foo
#let mode-bits(mode) = {
  return (
    (false, false, false, true),
    (false, false, true, false),
    (false, true, false, false),
    (true, false, false, false)
  ).at(mode)
}

The minimal code for the manual (manual.typ):

#import "@local/tidy:0.1.0"

#import "qrutil.typ"

#let doc = tidy.parse-module(
  read("qrutil.typ"),
  name: "qrutil",
  scope: (
    qrutil: qrutil
  )
)
#tidy.show-module(doc)

The error I get:

error: string index 1303 is not a character boundary
    ┌─ @local/tidy:0.1.0/src/tidy-parse.typ:219:7
    │
219 │     if string.at(i) == char { count += 1}
    │        ^^^^^^^^^^^^

help: error occurred in this call of function `count-occurences`
    ┌─ @local/tidy:0.1.0/src/tidy-parse.typ:254:26
    │
254 │   let first-line-number = count-occurences(source-code, "\n", end: match.start) + 1
    │                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

help: error occurred in this call of function `parse-function-docstring`
   ┌─ @local/tidy:0.1.0/src/tidy.typ:57:23
   │
57 │     function-docs.push(tidy-parse.parse-function-docstring(content, match, parse-info))
   │                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

help: error occurred in this call of function `parse-module`
   ┌─ /manual.typ:5:11
   │  
 5 │   #let doc = tidy.parse-module(
   │ ╭────────────^
 6 │ │   read("qrutil.typ"),
 7 │ │   name: "qrutil",
 8 │ │   scope: (
 9 │ │     qrutil: qrutil
10 │ │   )
11 │ │ )
   │ ╰─^

jneug commented 10 months ago

It seems count-occurences fails because there is a non-ASCII symbol in the source code (the € in the last best-mode test). If any docstring occurs after that comment, the error is thrown.

(I'm using the latest development version from the repository.)

Mc-Zen commented 10 months ago

Ah, thanks. The issue is with the string encoding. Typst indexes strings by byte, and with Unicode some characters take up more than one byte. The function that counts the line breaks ignores that, so it can land in the middle of a character. Should be an easy fix.
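
For illustration, here is a minimal sketch of one way to count line breaks without indexing into the middle of a multi-byte character. It assumes `end` is a byte offset that lies on a character boundary, which holds for the `start` of a match returned by `str.match`. The helper name `count-occurrences` and its body are hypothetical and are not necessarily the fix that went into tidy.

```typst
// Hypothetical helper, not tidy's actual implementation.
// `end` is a byte offset. Offsets returned by `str.match` always fall on a
// character boundary, so slicing up to them is safe.
#let count-occurrences(string, pattern, end: none) = {
  let stop = if end == none { string.len() } else { end }
  // Count matches in the prefix instead of calling `string.at(i)` per byte.
  string.slice(0, stop).matches(pattern).len()
}

// "€" takes three bytes in UTF-8, so byte offsets and character counts diverge.
#let source = "// €\n/// foo\n#let f() = none"
#let m = source.match("#let")
#assert.eq("€".len(), 3)
#assert.eq(count-occurrences(source, "\n", end: m.start), 2)
```

Because the prefix is sliced once and searched as a whole, no single-byte lookup such as `string.at(i)` ever happens, so a multi-byte character earlier in the file cannot trigger the "not a character boundary" error.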