Unable to extract document metadata

fletcher / peg-multimarkdown

An implementation of MultiMarkdown in C, using a PEG grammar - a fork of jgm's peg-markdown. No longer under active development - see MMD 5.

Unable to extract document metadata #131

Closed by konobi 12 years ago

konobi commented 12 years ago

Currently there doesn't seem to be a way to extract all metadata from a document in something machine parseable like JSON.

Having this functionality would be very useful for automated tooling.

fletcher commented 12 years ago

So..... you want me to write a machine program to parse the existing metadata into something machine-parseable....? ;)

The current metadata format is machine parseable as is (otherwise it wouldn't be of very much use to me...). I have no use for JSON at this time.

konobi commented 12 years ago

No, just a way to use the script to get just the metadata out of the markdown file rather than process the entire thing just to pull one small block out.

fletcher commented 12 years ago

The metadata, if present, is everything up until the first "\n\n". Whatever tool you are using to do something with the metadata should be able to manage that for you.

If you're looking to write your own tool to manage metadata separately, there are a few edge cases (e.g. URLs) that are handled specially, so that something that probably wasn't intended as metadata isn't treated as metadata. If you look at markdown_parser.leg, you'll see the actual definition of metadata, in case you want to process it on your own and match the default behavior exactly.
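For readers who want just the metadata as JSON, the rule described above (everything up to the first "\n\n") can be sketched in a few lines of Python. This is only an illustration, not part of peg-multimarkdown: the function names `split_metadata` and `parse_metadata` are hypothetical, and the sketch assumes that a metadata block starts with a "Key: value" line and that indented lines continue the previous value. It does not reproduce the edge cases defined in markdown_parser.leg, so a document whose first line is a bare URL would be misread as metadata here.

```python
import json


def split_metadata(text):
    """Split a document into (metadata_block, body) at the first blank line."""
    # Normalize Windows line endings so "\r\n\r\n" also counts as a blank line.
    text = text.replace("\r\n", "\n")
    head, _, body = text.partition("\n\n")
    # Assumption: a metadata block must begin with a "Key: value" line.
    # Anything else is treated as a document with no metadata.
    first_line = head.split("\n", 1)[0]
    if ":" not in first_line:
        return "", text
    return head, body


def parse_metadata(block):
    """Parse "Key: value" lines into a dict.

    Assumption: a line starting with whitespace continues the previous value.
    """
    meta = {}
    key = None
    for line in block.split("\n"):
        if line[:1] in (" ", "\t") and key is not None:
            meta[key] += " " + line.strip()
        elif ":" in line:
            key, value = line.split(":", 1)
            key = key.strip()
            meta[key] = value.strip()
    return meta


doc = "Title: My Document\nAuthor: Jane Doe\n\n# Heading\n\nBody text.\n"
head, body = split_metadata(doc)
print(json.dumps(parse_metadata(head)))
# prints {"Title": "My Document", "Author": "Jane Doe"}
```

Because only the text before the first blank line is inspected, the rest of the document never has to be parsed. To match the default behavior exactly, the checks would need to follow the metadata definition in markdown_parser.leg instead of the simple colon test used here.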