evilstreak / markdown-js

A Markdown parser for javascript

Add a 'meta' property to return values to give access to MMD's metadata #76

Closed rtpg closed 11 years ago

rtpg commented 11 years ago

I've added a 'meta' property to the return values of all the exposed functions. It collects information that is not used for rendering the Markdown but might be useful to the user, such as MultiMarkdown's metadata.

In MMD, metadata is represented in the source as several lines like the following:

Author : some guy
Title : super title

These key-value pairs aren't used for the HTML rendering, so they were not represented in the return values of the exposed functions. Now the key-value pairs are given in a JSON-style object attached to the returned string or tree.
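A minimal sketch of how such a header could be turned into key-value pairs, assuming the metadata block is the run of `Key: value` lines before the first blank line. The `parseMetaBlock` helper is hypothetical and is not markdown-js's actual parser; lowercasing the keys is also an assumption.

```javascript
// Hypothetical helper, not part of markdown-js.
// Reads leading "Key: value" lines until the first line that does not match.
function parseMetaBlock(source) {
  var meta = {};
  var lines = source.split(/\r?\n/);
  var i = 0;
  for (; i < lines.length; i++) {
    var m = /^(\w[\w ]*?)\s*:\s*(.*)$/.exec(lines[i]);
    if (!m) break;                          // blank line or ordinary text ends the block
    meta[m[1].toLowerCase()] = m[2].trim(); // assumption: keys are stored lowercased
  }
  // Everything after the metadata block, without the separating blank lines.
  var body = lines.slice(i).join('\n').replace(/^\n+/, '');
  return { meta: meta, body: body };
}

var parsed = parseMetaBlock('Author : some guy\nTitle : super title\n\n### General ###');
console.log(parsed.meta, parsed.body);
```

A real parser would need stricter rules for what counts as a key, since an ordinary first line containing a colon would be picked up by this sketch.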

The JSON object that stores these attributes also contains metadata on references, which is used for the internal parsing. Having "reference" as a key in the metadata might break the parsing (not tested yet).
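One way to guard against that kind of clash, sketched under the assumption that link references and document metadata share a single attribute object with a `references` key. `mergeMeta` is a hypothetical helper, not library code.

```javascript
// Hypothetical helper: copy metadata into the shared attribute object,
// but never overwrite a key that is already there (such as the link map).
function mergeMeta(attrs, meta) {
  var collisions = [];
  Object.keys(meta).forEach(function (key) {
    if (Object.prototype.hasOwnProperty.call(attrs, key)) {
      collisions.push(key);          // report the clash instead of clobbering
    } else {
      attrs[key] = meta[key];
    }
  });
  return collisions;
}

// Assumed shape of the shared object before metadata is merged in.
var attrs = { references: { gmail: { href: 'http://gmail.com/' } } };
var collisions = mergeMeta(attrs, { author: 'some guy', references: 'see below' });
console.log(collisions, attrs);
```

Keeping the metadata in its own object, separate from the link references, would avoid the problem entirely.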

I also wrapped some of the return values in a String object, to allow for the metadata to be attached (you can't attach attributes to primitives in JavaScript). The only side effect that I could find is that it breaks strict string equality testing (new String('foo') !== 'foo', although loose == still coerces the object to a primitive). There is a compare function (localeCompare) that can be used, however, and valueOf() returns the underlying primitive.
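A small sketch of the wrapping approach and its effect on comparisons. `withMeta` is a hypothetical name used only for illustration; the patch itself may attach the property differently.

```javascript
// Hypothetical helper: wrap a primitive string so a property can be attached.
function withMeta(text, meta) {
  var wrapped = new String(text); // String object, not a primitive
  wrapped.meta = meta;
  return wrapped;
}

var html = withMeta('<p>foo</p>', { author: 'some guy' });

var checks = {
  type: typeof html,                               // 'object', not 'string'
  loose: html == '<p>foo</p>',                     // true: == converts the object to a primitive
  strict: html === '<p>foo</p>',                   // false: different types
  unwrapped: html.valueOf() === '<p>foo</p>',      // true
  locale: html.localeCompare('<p>foo</p>') === 0   // true
};
console.log(checks, html.meta.author);
```

Callers that compare results with `===` or check `typeof result === 'string'` would need to call `valueOf()` or `String(result)` first.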

ashb commented 11 years ago

Hmmm I'm not sure about doing this in this exact fashion.

  1. does MMD style metadata already get parsed?

    I couldn't get it working with a quick test. Can you show me a full example? (And use the backtick guard of GitHub Markdown so it shows up as pre.)

  2. I'm tempted to say that if you want access to this metadata then you should just call json = parse(string) and then look at it from there, then toHTML(json) once you've done what you need with the metadata.
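The parse-first approach in point 2 can be sketched as follows. The tree below is hand-written for illustration and assumes a JsonML-style shape in which an optional attribute object sits at index 1. `getAttrs` and `splitMeta` are hypothetical helpers, and whether the parser actually puts MMD metadata there is what point 1 asks.

```javascript
// Hypothetical helper: return the attribute object of a JsonML-style node, if any.
function getAttrs(tree) {
  var candidate = tree[1];
  var isPlainObject = candidate !== null &&
    typeof candidate === 'object' &&
    !Array.isArray(candidate);
  return isPlainObject ? candidate : {};
}

// Hypothetical helper: separate user-facing metadata from the link references.
function splitMeta(tree) {
  var attrs = getAttrs(tree);
  var meta = {};
  Object.keys(attrs).forEach(function (key) {
    if (key !== 'references') meta[key] = attrs[key];
  });
  return { meta: meta, references: attrs.references || {} };
}

// Hand-written stand-in for what parse(string) might return (assumed shape).
var tree = ['markdown',
  { subject: 'Software not painful to use',
    references: { gmail: { href: 'http://gmail.com/' } } },
  ['header', { level: 3 }, 'General']
];

var result = splitMeta(tree);
console.log(result.meta.subject);
```

With the library itself, the tree would come from parse(string) and could then be handed to toHTML once the metadata has been read, so no String wrapping would be needed.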

Thoughts?

rtpg commented 11 years ago

Might not have thought this through fully. I mainly tested this through goofing around in node...

I stuck the misc_sw.text (from the multimarkdown fixtures) example into the lib file. Here's part of the text:

Subject: Software not painful to use
Subject_short: painless software
Topic: /misc/coolsw
Archive: no
Date: Nov 20 2006
Order: -9.5
inMenu: true

### General ###

* *Operating System* : [Mac OS X][switch]: heaven, after the purgatory of Linux 
  and the hell of Windows.

The top part is the metadata...

Here's the script I ran with node to test:

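var fs = require('fs');
var md = require('./markdown.js');

// Read the fixture and print the 'subject' key from the proposed 'meta' property.
fs.readFile('./misc_sw.text', 'utf8', function (err, str) {
    if (err) throw err;
    console.log(md.toHTML(str, 'Maruku').meta.subject);
});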

The response I get is "Software not painful to use". I looked at the entire meta object as well, and got most of the info.

The thing that bothers me the most is that in the meta object there's also a references property. This seems to be used to deal with links:

{ firefox: { href: 'http://getfirefox.com/' },
  gmail: { href: 'http://gmail.com/' },
  bloglines: { href: 'http://bloglines.com/' },
  wikipedia: { href: 'http://en.wikipedia.org/' },
  /* etc. */
}

Haven't really looked enough at how that is used though.