Jermolene / TiddlyWiki5

A self-contained JavaScript wiki for the browser, Node.js, AWS Lambda etc.
https://tiddlywiki.com/

Introduce vocabulary parameter to wikitext content type #345

Open Jermolene opened 10 years ago

Jermolene commented 10 years ago

To aid in the interchange of content, it is proposed that we introduce a parameter to the TiddlyWiki5 content type (text/vnd.tiddlywiki) to specify the vocabulary with which the text is intended to be parsed. For example:

A vocabulary specified via a URL:

text/vnd.tiddlywiki; vocab=http://vocabs.tiddlywiki.com/standard.json

A vocabulary defined in a local tiddler:

text/vnd.tiddlywiki; vocab=myBirdwatchingVocabulary

In either case, the vocabulary contains information specifying:

It might be useful for vocabularies to be able to cascade, so that one could describe a custom vocab as, say, "standard TW markup plus these three plugins".
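As a rough illustration of the proposal, the "type" field could be split into the base content type and its parameters before a vocabulary is looked up. This is a minimal sketch only: parseTypeField is a hypothetical helper, not an existing TiddlyWiki function, and it assumes that parameter values never contain a semicolon.

```javascript
// Hypothetical helper (not part of the TiddlyWiki core): split a tiddler
// "type" field into its base content type and a map of parameters.
// Assumption: values never contain ";", so a plain split is good enough.
function parseTypeField(typeField) {
	var parts = typeField.split(";"),
		result = {type: parts[0].trim().toLowerCase(), params: {}};
	parts.slice(1).forEach(function(part) {
		var pos = part.indexOf("=");
		if(pos !== -1) {
			var name = part.slice(0,pos).trim().toLowerCase(),
				value = part.slice(pos + 1).trim();
			// Strip optional surrounding double quotes from the value
			if(value.length >= 2 && value.charAt(0) === '"' && value.charAt(value.length - 1) === '"') {
				value = value.slice(1,-1);
			}
			result.params[name] = value;
		}
	});
	return result;
}

var parsed = parseTypeField("text/vnd.tiddlywiki; vocab=myBirdwatchingVocabulary");
// parsed.type is "text/vnd.tiddlywiki"
// parsed.params.vocab is "myBirdwatchingVocabulary"
```

Whether the value then names a URL or a local tiddler would be decided by whatever consumes params.vocab; that step is left out here because the thread does not define it.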

davidjade commented 10 years ago

I just want to comment on the "cascade" part. I think I would find this useful for rules in general (maybe it is already possible?). I've actually been wondering how feasible it would be to extend the existing rules rather than replace them through plug-ins. For instance, I have one wiki where I want all fully capitalized words to become auto links in addition to the standard rules. I've also thought about having a list of "magic" words that would always get auto links, etc.

Jermolene commented 10 years ago

@davidjade yup, this proposal would meet your needs. The regexp for wikilink matching could be one of the vocab configuration parameters.
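A sketch of what that could look like, assuming a vocabulary is a plain object and that the wikilink pattern is stored under a property called wikiLinkRegExp. Both the property name and the two patterns are invented for illustration; the CamelCase pattern is a simplification, not the one the core uses.

```javascript
// Hypothetical vocabulary objects; "wikiLinkRegExp" is an assumed name,
// not an existing TiddlyWiki setting.
var standardVocab = {
	// Simplified CamelCase matcher
	wikiLinkRegExp: "\\b[A-Z][a-z]+(?:[A-Z][a-z]+)+\\b"
};
var allCapsVocab = {
	// CamelCase words plus any word made of two or more capital letters
	wikiLinkRegExp: "\\b(?:[A-Z][a-z]+(?:[A-Z][a-z]+)+|[A-Z]{2,})\\b"
};

// Return every substring of the text that the vocabulary treats as a link
function findWikiLinks(text,vocab) {
	return text.match(new RegExp(vocab.wikiLinkRegExp,"g")) || [];
}

findWikiLinks("See HelloThere and the FAQ page",standardVocab); // ["HelloThere"]
findWikiLinks("See HelloThere and the FAQ page",allCapsVocab); // ["HelloThere", "FAQ"]
```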

davidjade commented 10 years ago

So this would be set per-tiddler though? Seems like maybe I'd still need a plug-in approach if I wanted to extend the wikilink rules for all tiddlers (new and existing)?

Jermolene commented 10 years ago

The vocabulary setting would be part of the "type" field of each tiddler. For example:

text/vnd.tiddlywiki; vocab=http://vocabs.tiddlywiki.com/standard

There would also be a way of setting the default vocabulary for new tiddlers.

If you wanted to change the wikitext rules applied to a tiddler overriding the vocabulary specified within the type, then, yes, maybe that would need a plugin.

The primary motivation for all of this is to enable interoperability of content.

buggyj commented 10 years ago

I have written a simple framework based on cascading settings (from our discussion). Only the overall structure is defined, along with the methods to combine a cascade of these structures; the result is passed to the parser. A collection of settings is defined in a JSON tiddler (an Stid), which contains the structure "parserrules":{...}, within which lists (as 1-d arrays) of strings and atomic types are allowed. The Stid also contains the item "baseparser":"...", which references the previous Stid (if any) in the cascade. Thus the cascade is defined recursively in reverse. An Stid is only referenced as part of a block (or tiddler) type, eg "baseparser":"text/vnd.tiddlywiki<fullTW5", where fullTW5 is the Stid.

The outcome of the cascade is that lists with the same name that appear within different "parserrules" are merged, atomic items of the same type are overridden, and the resulting parserrules is passed to the last named parser. For example, if a tiddler is of type "text/vnd.tiddlywiki<fullTW5" then the result of the cascade is passed to the text/vnd.tiddlywiki parser. Note that it is up to the author of the parser to decide what the items within "parserrules" are called and what they mean.
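The merge rules described above might be sketched roughly as follows. This is not the code from the linked branch: the store object, the resolveCascade function and the rule names inside "parserrules" are all invented for illustration, and the sketch assumes that atomic items are matched by name and that the cascade contains no cycles.

```javascript
// Hypothetical store of Stids keyed by title. "baseparser" and "parserrules"
// come from the description; the entries inside "parserrules" are made up.
var stids = {
	"fullTW5": {
		"parserrules": {"rules": ["bold","italic","wikilink"], "linkStyle": "camelcase"}
	},
	"birdwatching": {
		"baseparser": "text/vnd.tiddlywiki<fullTW5",
		"parserrules": {"rules": ["birdname"], "linkStyle": "allcaps"}
	}
};

// Resolve a type such as "text/vnd.tiddlywiki<birdwatching" into the parser
// named by that type and the combined rules of the whole cascade
function resolveCascade(type,store) {
	var pos = type.indexOf("<"),
		result = {parser: pos === -1 ? type : type.slice(0,pos), parserrules: {}};
	if(pos === -1) {
		return result;
	}
	var stid = store[type.slice(pos + 1)] || {},
		rules = stid.parserrules || {};
	// Resolve the base first so that this Stid's settings take precedence
	if(stid.baseparser) {
		result.parserrules = resolveCascade(stid.baseparser,store).parserrules;
	}
	Object.keys(rules).forEach(function(name) {
		var value = rules[name];
		if(Array.isArray(value) && Array.isArray(result.parserrules[name])) {
			// Lists with the same name are merged
			result.parserrules[name] = result.parserrules[name].concat(value);
		} else {
			// Atomic items override; a list seen for the first time is copied
			result.parserrules[name] = Array.isArray(value) ? value.slice() : value;
		}
	});
	return result;
}

var resolved = resolveCascade("text/vnd.tiddlywiki<birdwatching",stids);
// resolved.parser is "text/vnd.tiddlywiki"
// resolved.parserrules.rules is ["bold","italic","wikilink","birdname"]
// resolved.parserrules.linkStyle is "allcaps"
```

A real implementation would also need to guard against an Stid that (directly or indirectly) names itself as its own base.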

In addition I have extended the framework to include preparsers, through the inclusion of "preparser":"..." within the Stid. It has the form "preparser":"text/xtext/y" where B is an (optional) Stid that modifies the (pre)parser text/x, and text/y is the form of the preparser output that is passed through to the next parser in the cascade.

Some examples are demoed here: http://tw5vocab.tiddlyspot.com/ and the code is here: https://github.com/buggyj/TiddlyWiki5/tree/remotes/origin/vocabs

pmario commented 10 years ago

I have written a simple framework, based on cascading settings (from our discussion). Only the overall structure is defined, along with the methods to combine a cascade of these structures, the result being passed to the parser.

very interesting

pmario commented 10 years ago

@buggyj I think there is a problem with your type definitions. eg: text/vnd.tiddlywiki<fullTW5 They don't match the rules defined at: http://tools.ietf.org/html/rfc6838#section-4.2

@Jermolene I think also the semicolon ; and spaces are not allowed

Jermolene commented 10 years ago

@buggyj many thanks, looks very interesting. I'll study it and provide feedback.

@pmario the parameter syntax I'm suggesting is discussed here:

http://tools.ietf.org/html/rfc6838#section-4.3

I found this StackOverflow article which gives some examples:

http://stackoverflow.com/questions/3051048/mime-rfc-content-type-parameter-confusion-unclear-rfc-specification

Jermolene commented 10 years ago

@buggyj I wonder if it's worth making this into a pull request so that we can use the line commenting feature to discuss the implementation?

Jermolene commented 10 years ago

@buggyj there are some issues with coding styles (eg TW5 always uses braces with the if statement). I've started trying to record the house style here:

http://tiddlywiki.com/static/TiddlyWiki%2520Coding%2520Style%2520Guidelines.html

pmario commented 10 years ago

the parameter syntax I'm suggesting is discussed here: http://tools.ietf.org/html/rfc6838#section-4.3

yes it says:

Parameter names have the syntax as media type names and values:

       parameter-name = restricted-name

and restricted-name is defined at: Naming Requirements: http://tools.ietf.org/html/rfc6838#section-4.2 where there is no "space" ... but the semicolon seems to be ok.
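The restricted-name rule from RFC 6838 section 4.2 (one letter or digit, followed by up to 126 letters, digits or characters from the set ! # $ & - ^ _ . +) can be checked mechanically. The sketch below is only an illustration of that rule; the function name isRestrictedName is made up for this example.

```javascript
// restricted-name per RFC 6838 section 4.2: a letter or digit followed by
// up to 126 characters drawn from letters, digits and ! # $ & - ^ _ . +
var RESTRICTED_NAME = /^[A-Za-z0-9][A-Za-z0-9!#$&\-^_.+]{0,126}$/;

// Hypothetical helper: true if the string is a valid restricted-name
function isRestrictedName(name) {
	return RESTRICTED_NAME.test(name);
}

isRestrictedName("vocab"); // true
isRestrictedName("vnd.tiddlywiki"); // true
isRestrictedName("vnd.tiddlywiki<fullTW5"); // false, "<" is not permitted
isRestrictedName("my vocab"); // false, spaces are not permitted
```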

See http://www.iana.org/assignments/media-types/media-types.xhtml, which links to some MIME types with parameters: http://www.iana.org/assignments/media-types-parameters/media-types-parameters.xhtml

so imo this will be ok.

text/vnd.tiddlywiki;parameter=value

but not recommended, as seen here: http://tools.ietf.org/html/rfc6838#section-4.3

New parameters SHOULD NOT be defined as a way to introduce new functionality in types registered in the standards tree, although new parameters MAY be added to convey additional information that does not otherwise change existing functionality. An example of this would be a "revision" parameter to indicate a revision level of an external specification such as JPEG. Similar behavior is encouraged for media types registered in the vendor or personal trees, but is not required.


The StackOverflow article refers to http://tools.ietf.org/html/rfc2045, which is the MIME specification that defines the Content-Type header syntax. ...

Whereas http://tools.ietf.org/html/rfc6838#section-4.3 says:

Note that this syntax is somewhat more restrictive than what is allowed by the ABNF in [RFC2045] and amended by [RFC2231].


So maybe it would be best to have a MIME subtype that tells a user to have a look at special tiddler fields, eg: vocab=....

text/vnd.tiddlywiki
text/vnd.tiddlywiki.vocab

pmario commented 10 years ago

I did find a text vnd format: vnd.fmi.flexstor that uses optional parameters: http://www.iana.org/assignments/media-types/text/vnd.fmi.flexstor

so for me it seems:

text/vnd.tiddlywiki;vocab=http://vocabs.tiddlywiki.com/standard

would be possible, but the syntax of the parameter has to be specified, since http://tools.ietf.org/html/rfc6838#section-4.3 uses a MUST for the parameter specification. ... So we need to restrict the tiddler names that can contain the vocab definitions. Otherwise it may be complicated.

There is no defined syntax for parameter values. Therefore, registrations MUST specify parameter value syntax. Additionally, some transports impose restrictions on parameter value syntax, so care needs be taken to limit the use of potentially problematic syntaxes; e.g., pure binary valued parameters, while permitted in some protocols, are best avoided.
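One way to sidestep part of the value-syntax problem, sketched here as an assumption rather than a decision made in this thread, is to fall back on the RFC 2045 grammar: a value that is a plain "token" is written as-is, and anything else (for instance a URL, since ":" and "/" are tspecials there) is written as a quoted-string. The helper name formatTypeField is hypothetical, and non-ASCII tiddler titles are not handled.

```javascript
// RFC 2045 "token": printable US-ASCII except space and the tspecials
// ( ) < > @ , ; : \ " / [ ] ? =
var TOKEN = /^[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+$/;

// Hypothetical helper: build a "type" field carrying a vocab parameter,
// quoting the value only when it is not a plain token
function formatTypeField(type,vocab) {
	if(TOKEN.test(vocab)) {
		return type + "; vocab=" + vocab;
	}
	// Escape backslashes and double quotes, then wrap in a quoted-string
	return type + '; vocab="' + vocab.replace(/(["\\])/g,"\\$1") + '"';
}

formatTypeField("text/vnd.tiddlywiki","myBirdwatchingVocabulary");
// text/vnd.tiddlywiki; vocab=myBirdwatchingVocabulary
formatTypeField("text/vnd.tiddlywiki","http://vocabs.tiddlywiki.com/standard");
// text/vnd.tiddlywiki; vocab="http://vocabs.tiddlywiki.com/standard"
```

Restricting vocabulary tiddler titles to the token character set, as suggested above, would make the quoted branch necessary only for URLs.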

buggyj commented 10 years ago

@Jermolene The code is only of 'demo' quality. I was expecting some feedback about the general logic: are there issues, eg with the format of the vocab parameter? The answer to this one seems to be yes :-) If you're happy with the scope I will tidy up the code and see if there are corner cases to be dealt with before submitting a pull request (then you can tell me what's wrong with it). Alternatively, if you would like to use the code to talk about issues I can submit a pull request with the code as it is.

Jermolene commented 10 years ago

Hi @buggyj no problem about coding standards, just wanted to make sure you were aware.

I'd like to be able to discuss the code before you get stuck into lots of work on it, so I'd be happy for you to make a pull request in its present form, and then we can bash it into shape.

Many thanks!

buggyj commented 10 years ago

@Jermolene fair enough!

buggyj commented 10 years ago

@Jermolene Hello, I did the pull request some time ago, I hope it got to you? https://github.com/Jermolene/TiddlyWiki5/pull/382

Jermolene commented 10 years ago

My apologies @buggyj I've been fighting a ridiculous backlog for a few days now. I've made a few comments on the pull request; I'm not sure that I understand all of the changes properly.

My conclusion from a brief review is that in order to sort this issue out we need to do a more thorough refactoring of the parser architecture. There are a few problems I'd like to sort out in one go:

  • Making the parsers and parse rules re-entrant
  • Changing the way that transcluding images works so that we can more naturally apply dimensions to the <img> tag (currently the <img> tag is generated by the parser)
  • Reusing more of the common parser infrastructure
  • Fixing up the generation of <p> tags and treatment of line breaks after HTML elements

Anyhow, there's some other stuff on the roadmap I'd like to get done first (eg the current work on autosave and translation), and then return to this in a few weeks.

buggyj commented 10 years ago

Simplified and resubmitted as pull request #629. The new demo is at http://tw5vocab.tiddlyspot.com/

pmario commented 10 years ago

@buggyj great stuff

Telumire commented 1 year ago

I like that idea a lot. Maybe instead of "vocab", this could be called "xmlns" or "namespace", to follow the HTML spec?

Jermolene commented 1 year ago

I like that idea a lot. Maybe instead of "vocab", this could be called "xmlns" or "namespace", to follow the HTML spec?

"namespace" might work, but seems a little technical. I think of these vocabularies as being very visible for end users. The dream would be that we'd evolve vocabularies for specific niches (eg a vocabulary for dance choreographers that incorporates a notation for dance moves, or a special vocabulary for writing D&D games).