johnnydecimal / index-spec

A 'formal' specification for the index file. And any other data structures.
MIT License

schema.ts discussion #4

Open johnnydecimal opened 3 weeks ago

johnnydecimal commented 3 weeks ago

Ref. https://jdcm.al/22.00.0081.

hpfast commented 2 weeks ago

Hey there! excited to see these developments. I have a few thoughts which I'll throw in here in case they stick.

I see a TypeScript schema under discussion, but the README talks mostly about a plain text format. I gather that:

Markdown?

Have you considered using Markdown as the format for this, and having that be the primary source of truth? Markdown can be parsed into a good AST with the unified ecosystem, converted to any JSON form you need, and then serialized as Markdown again. In case you're unfamiliar with this project: I think their documentation is terrible for newcomers, but I also think they've got this figured out to the point where you can use Markdown as a structured document format.

It feels like it would be sufficient here. The three levels could be #, ##, ###. Then freeform comments (see #1) could be added at will with paragraph text, as could any links you want.
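To make the heading idea concrete, here is a minimal sketch of how such a file could map onto a flat list of entries. It is a hand-rolled line scanner standing in for what remark would do properly; the `IndexEntry` shape, the `parseIndex` name, and the sample entries are all assumptions for illustration, not part of any spec.

```typescript
// Hypothetical entry shape; remark's real syntax tree is richer than this.
interface IndexEntry {
  depth: 1 | 2 | 3; // 1 = area, 2 = category, 3 = ID
  title: string;    // heading text, e.g. "11 Tax"
  notes: string[];  // freeform paragraph lines found under the heading
}

// Scan line by line, treating #, ## and ### as the three levels.
// Any other non-empty line is attached to the most recent heading as a note.
function parseIndex(markdown: string): IndexEntry[] {
  const entries: IndexEntry[] = [];
  let current: IndexEntry | undefined;
  for (const line of markdown.split("\n")) {
    const match = /^(#{1,3})\s+(.+)$/.exec(line);
    if (match) {
      const [, hashes = "", text = ""] = match;
      current = {
        depth: hashes.length as 1 | 2 | 3,
        title: text.trim(),
        notes: [],
      };
      entries.push(current);
    } else if (line.trim() !== "" && current) {
      current.notes.push(line.trim());
    }
  }
  return entries;
}

// Made-up sample index, only for demonstration.
const sample = [
  "# 10-19 Finance",
  "## 11 Tax",
  "Anything typed here is a freeform comment.",
  "### 11.01 Returns",
].join("\n");

const parsed = parseIndex(sample);
console.log(JSON.stringify(parsed, null, 2));
```

A real implementation would presumably lean on remark for the parsing, so that links, emphasis and other Markdown inside the comments are handled correctly.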

Remark provides an AST spec and good tooling for interacting with it. So you can scaffold on that to add document validation checks (that all headings have the right number format, for example) and any transformations you want to various output formats. Otherwise I presume you'll have to write your own parser for your plain text format, so building on a tool like this might be helpful. I do know that the system of plugins, which operate like filters on the AST, makes it easy for other programmers to contribute functionality and output formats.
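As a rough illustration of the "right number format" check, something like the following could run over the headings once they have been extracted. The `Heading` type, the `checkHeadingNumbers` name, and the three patterns are assumptions based on the usual area / category / ID numbering; this does not use remark's actual plugin API.

```typescript
// Hypothetical, simplified heading node (not remark's real node type).
interface Heading {
  depth: 1 | 2 | 3;
  text: string;
}

// Assumed number formats per heading level; adjust to whatever the spec settles on.
const patterns: Record<1 | 2 | 3, RegExp> = {
  1: /^(\d)0-\1[9] \S/,   // area, e.g. "10-19 Finance"
  2: /^\d{2} \S/,         // category, e.g. "11 Tax"
  3: /^\d{2}\.\d{2} \S/,  // ID, e.g. "11.01 Returns"
};

// Return one human-readable message per heading that fails its level's pattern.
function checkHeadingNumbers(headings: Heading[]): string[] {
  return headings
    .filter((h) => !patterns[h.depth].test(h.text))
    .map((h) => `Level ${h.depth} heading "${h.text}" has an unexpected number format`);
}

const problems = checkHeadingNumbers([
  { depth: 1, text: "10-19 Finance" },
  { depth: 2, text: "11 Tax" },
  { depth: 3, text: "11.1 Returns" }, // fails: the ID part has only one digit
]);
console.log(problems);
```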

I do know that one does get into edge cases eventually with Markdown. I just have the feeling that the unified project has got this pretty well figured out. If someone can point out otherwise, please do so.

Extensibility

A further thought relating to validation of the JDex file. I have settled on a version of JD which uses categories and then time-based IDs under that. This comes from insights gained while doing all my notekeeping on paper for a few weeks, which I hope to write about soon elsewhere. But my point is: I know it is heresy, but there are going to be people doing this, and I think I can see how it could be accommodated.

You could have several levels of 'strictness'. One non-negotiable might be the three levels of headings. Another non-negotiable might be that each heading starts with an (alpha)numeric identifier followed by a name. And the 'most strict' level might be 'Bona Fide JD Compliance', where IDs have to be in AC.ID format.

I can see how this would work technically, since such checks would be done by 'filters' operating on the nodes of the syntax tree, so you could apply different filters at will.
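Here is a small sketch of how those strictness levels might be composed from independent filters. Everything in it is hypothetical: the level names (`basic`, `named`, `strict`), the filter names, and the regular expressions are my own assumptions, and the filters work on a simplified heading list rather than on a real remark syntax tree.

```typescript
// Hypothetical, simplified heading node.
interface Heading {
  depth: number;
  text: string;
}

// A filter inspects the headings and returns a list of problems (empty = pass).
type Filter = (headings: Heading[]) => string[];

// Non-negotiable: only three levels of headings.
const threeLevelsOnly: Filter = (hs) =>
  hs
    .filter((h) => h.depth < 1 || h.depth > 3)
    .map((h) => `"${h.text}" is at depth ${h.depth}; only 1 to 3 are allowed`);

// Non-negotiable: an (alpha)numeric identifier followed by a name.
const identifierThenName: Filter = (hs) =>
  hs
    .filter((h) => !/^[A-Za-z0-9][A-Za-z0-9.\-]* \S/.test(h.text))
    .map((h) => `"${h.text}" does not start with an identifier and a name`);

// Most strict: level 3 identifiers must be in AC.ID form, e.g. "11.01".
const acIdFormat: Filter = (hs) =>
  hs
    .filter((h) => h.depth === 3 && !/^\d{2}\.\d{2} \S/.test(h.text))
    .map((h) => `"${h.text}" is not in AC.ID format`);

type Strictness = "basic" | "named" | "strict";

// Each level is just a different stack of filters.
const levels: Record<Strictness, Filter[]> = {
  basic: [threeLevelsOnly],
  named: [threeLevelsOnly, identifierThenName],
  strict: [threeLevelsOnly, identifierThenName, acIdFormat],
};

function validate(headings: Heading[], level: Strictness): string[] {
  const problems: string[] = [];
  for (const filter of levels[level]) {
    problems.push(...filter(headings));
  }
  return problems;
}

// A made-up index that uses a time-based ID under a category.
const timeBased: Heading[] = [
  { depth: 1, text: "10-19 Finance" },
  { depth: 2, text: "11 Tax" },
  { depth: 3, text: "2024-03-15 Meeting notes" },
];

console.log(validate(timeBased, "named"));  // no problems reported
console.log(validate(timeBased, "strict")); // one problem: the time-based ID
```

The point of the design is that a time-based variant passes the looser levels untouched, and only the strictest filter objects to it.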

I guess where I'm going is: could the JD Spec be extensible in itself, defining the very basics, but allowing usage for transforming a diverse range of 'hierarchical classification trees' into other formats?

And, would using Markdown make it such that a non-programmer can edit the file and feel like they're working on a text document, but it remains compatible with the parser, and the parser can also validate their JDex file for them?