rasendubi / uniorg

An accurate Org-mode parser for JavaScript/TypeScript
https://oleksii.shmalko.com/uniorg
GNU General Public License v3.0

feat: org-cite support #33

Closed rasendubi closed 1 year ago

rasendubi commented 2 years ago

I started working on org-cite support before the war began. I don't remember where I left off, but here are my stashed changes.

codecov-commenter commented 2 years ago

Codecov Report

Merging #33 (9bafef6) into master (4b59881) will decrease coverage by 0.35%. The diff coverage is 92.50%.

:exclamation: Current head 9bafef6 differs from pull request most recent head 67420e7. Consider uploading reports for the commit 67420e7 to get more accurate results

@@            Coverage Diff             @@
##           master      #33      +/-   ##
==========================================
- Coverage   96.07%   95.72%   -0.35%     
==========================================
  Files          15       15              
  Lines        1657     1686      +29     
  Branches      554      539      -15     
==========================================
+ Hits         1592     1614      +22     
- Misses         64       71       +7     
  Partials        1        1              
| Impacted Files | Coverage Δ |
| --- | --- |
| packages/uniorg-parse/src/parser.ts | 94.91% <91.42%> (-0.36%) :arrow_down: |
| packages/uniorg-parse/src/reader.ts | 96.34% <100.00%> (+0.23%) :arrow_up: |
| packages/uniorg-parse/src/utils.ts | 100.00% <100.00%> (ø) |
| packages/uniorg-stringify/src/stringify.ts | 95.18% <0.00%> (-0.66%) :arrow_down: |
| packages/uniorg-rehype/src/org-to-hast.ts | 97.58% <0.00%> (-0.27%) :arrow_down: |
| packages/uniorg-parse/src/parse-options.ts | 100.00% <0.00%> (ø) |


rasendubi commented 2 years ago

Ok, so the status is: everything is parsed except for prefixes and suffixes.

The reason is that prefixes and suffixes are “rich” (=RecursiveObject) and allow bold/italic/etc elements inside. It's easy to call parseElements when parsing a citation to parse prefix/suffix and assign these to fields. This results in the same structure as org-element:

[cite:prefix;@hello]

type: "citation"
prefix:
  - type: "text"
    value: "prefix"
suffix: null
children:
  - type: "citation-reference"
    key: "hello"

However, this structure is not unified-friendly: the default unified traversal utilities only descend into children and ignore other properties. Plugins built on them (smartypants, link traversal) would therefore miss the prefix and suffix.
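
To illustrate the problem, here is a minimal sketch of a children-only walk over the first structure. `visitChildrenOnly` and the `Node` shape are hypothetical stand-ins written for this example, not uniorg or unified APIs; they only mimic how children-based traversal behaves.

```typescript
// Minimal unist-like node shape (an assumption; the real uniorg types are richer).
interface Node {
  type: string;
  value?: string;
  children?: Node[];
  [key: string]: unknown;
}

// Hypothetical stand-in for a children-only traversal utility:
// it visits a node, then recurses into `children` and nothing else.
function visitChildrenOnly(node: Node, visitor: (n: Node) => void): void {
  visitor(node);
  for (const child of node.children ?? []) {
    visitChildrenOnly(child, visitor);
  }
}

// The org-element-like shape: the prefix lives in a property, not in children.
const citation: Node = {
  type: "citation",
  prefix: [{ type: "text", value: "prefix" }],
  suffix: null,
  children: [{ type: "citation-reference", key: "hello" }],
};

const visitedTypes: string[] = [];
visitChildrenOnly(citation, (n) => visitedTypes.push(n.type));

// Only "citation" and "citation-reference" are visited;
// the text node inside `prefix` is never reached.
console.log(visitedTypes);
```

Any plugin that rewrites text nodes through such a walk would leave the prefix untouched.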

A better structure is to push prefix and suffix as children:

[cite:common prefix;prefix @hello suffix;common suffix]

type: "citation"
children:
  - type: "citation-prefix"
    children:
      - type: "text"
        value: "common prefix"
  - type: "citation-reference"
    key: "hello"
    children:
      - type: "citation-prefix"
        children:
          - type: "text"
            value: "prefix "
      - type: "citation-key"
        value: "hello"
      - type: "citation-suffix"
        children:
          - type: "text"
            value: " suffix"
  - type: "citation-suffix"
    children:
      - type: "text"
        value: "common suffix"

But this requires a somewhat different parsing approach.
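
A rough sketch of what producing the children-based shape could look like. This is not the uniorg parser: `sketchParseCitation` is a hypothetical helper that assumes a well-formed `[cite:...]` string with no style, approximates key characters as `[\w-]+`, and keeps prefixes and suffixes as plain text, whereas the real parser would need to parse them as rich objects.

```typescript
interface TextNode { type: "text"; value: string }
interface CitationPrefix { type: "citation-prefix"; children: TextNode[] }
interface CitationSuffix { type: "citation-suffix"; children: TextNode[] }
interface CitationKey { type: "citation-key"; value: string }
interface CitationReference {
  type: "citation-reference";
  key: string;
  children: (CitationPrefix | CitationKey | CitationSuffix)[];
}
interface Citation {
  type: "citation";
  children: (CitationPrefix | CitationReference | CitationSuffix)[];
}

const text = (value: string): TextNode => ({ type: "text", value });

// Hypothetical helper, for illustration only.
function sketchParseCitation(input: string): Citation {
  const body = input.replace(/^\[cite:/, "").replace(/\]$/, "");
  const segments = body.split(";");
  // Simplified key syntax (assumption): "@" followed by word characters or "-".
  const keyPattern = /^(.*?)@([\w-]+)(.*)$/;
  const children: Citation["children"] = [];
  const last = segments.length - 1;

  segments.forEach((segment, index) => {
    const match = keyPattern.exec(segment);
    if (match) {
      const [, prefix, key, suffix] = match;
      const refChildren: CitationReference["children"] = [];
      if (prefix) {
        refChildren.push({ type: "citation-prefix", children: [text(prefix)] });
      }
      refChildren.push({ type: "citation-key", value: key });
      if (suffix) {
        refChildren.push({ type: "citation-suffix", children: [text(suffix)] });
      }
      children.push({ type: "citation-reference", key, children: refChildren });
    } else if (index === 0) {
      // A keyless first segment is treated as the common prefix.
      children.push({ type: "citation-prefix", children: [text(segment)] });
    } else if (index === last) {
      // A keyless last segment is treated as the common suffix.
      children.push({ type: "citation-suffix", children: [text(segment)] });
    }
    // Keyless segments in the middle are ignored in this sketch.
  });

  return { type: "citation", children };
}

const parsed = sketchParseCitation(
  "[cite:common prefix;prefix @hello suffix;common suffix]"
);
console.log(JSON.stringify(parsed, null, 2));
```

With everything under `children`, a children-only traversal reaches every text node, including those in the common and per-reference prefixes and suffixes.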

changeset-bot[bot] commented 1 year ago

🦋 Changeset detected

Latest commit: 67420e7fe05defc99b52aecce75fcc3831d39ff6

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 8 packages

| Name                     | Type  |
| ------------------------ | ----- |
| uniorg-parse             | Major |
| uniorg-stringify         | Minor |
| uniorg-rehype            | Minor |
| uniorg                   | Minor |
| example                  | Patch |
| extract-keywords-example | Patch |
| blog-starter             | Patch |
| org-braindump            | Patch |
