Can I get line/column information from a node?

orklann commented 9 years ago

I need the line and column information of a node. Can I get it?

If not, is it hard to modify the code to do that?

orklann commented 9 years ago

I tried hacking on this project and eventually found that the line/column info is lost at the level of the generated parser. It seems that this metadata exists only in the GREG structure, but I cannot figure out how to get at it from the parser. Any ideas?

fletcher commented 9 years ago

greg does not do a great job with capturing start/stop character offsets (it doesn't do line/column at all -- you would have to calculate that yourself), but it was infinitely better than peg-leg. I needed this information for MultiMarkdown Composer (http://multimarkdown.com/), and it required a lot of hacking to get the proper character offsets. This was even more complex when parsing nested structures (e.g. a blockquote inside a list item inside an ordered list inside an unordered list) -- that required a lot of work to figure out an accurate solution.
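Calculating line/column yourself from a character offset could look something like the sketch below. It is a minimal illustration, not code from MMD-4 or Composer: the helper names `line_starts` and `offset_to_line_col` are hypothetical, and it assumes 0-based character offsets into the source text with `\n` line endings.

```python
from bisect import bisect_right


def line_starts(text):
    """Return the character offset at which each line begins."""
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            starts.append(i + 1)
    return starts


def offset_to_line_col(starts, offset):
    """Convert a 0-based character offset to a 1-based (line, column) pair."""
    # The line is the last one whose start is at or before the offset.
    line_index = bisect_right(starts, offset) - 1
    return line_index + 1, offset - starts[line_index] + 1


source = "# Title\n\nSome *text* here.\n"
starts = line_starts(source)
print(offset_to_line_col(starts, source.index("*text*")))  # (3, 6)
```

Building the table of line starts once and then using a binary search keeps each lookup cheap, which matters if every node in a long document needs converting.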

The code used inside MMD Composer is basically the exact same as MMD-4, just with some extra routines to properly calculate start/stop offsets. So it definitely can be done. If you need exact matching, inside arbitrarily complex structures, it's a fair amount of work (and as you can guess, I was really familiar with how the parser works. ;) If you only need to handle top-level structures (e.g. h1, h2, p and the like), it wouldn't be very hard, but you might run into problems with edge cases.

At least for the time being, however, the code that I wrote for Composer is proprietary, so I won't be open-sourcing the code I wrote to handle start/stop offsets. The basic idea, though, is that greg provides thunk information that includes begin/end character offsets. It's a bit of trial and error, however, to figure out exactly what those offsets refer to, and I frequently had to manually add/subtract to get the proper numbers for parts of the parser.

orklann commented 9 years ago

Wow, I am planning a Markdown editor for the Mac, and likewise I need the line number of each node to synchronize scrolling between the editor and the preview view. I am glad to know that you've done the same thing with GREG. Maybe I need to figure out and hack GREG myself. Thanks for so much helpful information.

fletcher commented 9 years ago

I didn't have to hack greg, though there may be room there to improve the begin/end tracking. I stuck with hacking my parser since I was already familiar with that but not familiar with greg's source.

You are, of course, welcome to write "yet another markdown editor for the mac". It's an interesting project to learn from, but it's gotten to be a crowded space if you're looking to make money, unless you have interesting ideas that are different from the ones that everyone else is doing.

As for preview scrolling sync, that's also an interesting problem. The easy approach, used by Mou, for example, is to simply scroll based on percentage (e.g. 25% of the text editor maps to 25% of the preview). For basic documents, this approach may be ok. But for other documents, it is not very helpful.
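The percentage approach can be sketched roughly as follows. This is only an illustration of the idea, not Mou's actual code: the function name `proportional_scroll` and its parameters are hypothetical, and all sizes are assumed to be in the same unit (e.g. pixels).

```python
def proportional_scroll(editor_top, editor_height, editor_view,
                        preview_height, preview_view):
    """Map the editor's scroll position to the preview by fraction scrolled."""
    # Scrollable range is the content height minus the visible viewport.
    editor_range = max(editor_height - editor_view, 0)
    preview_range = max(preview_height - preview_view, 0)
    if editor_range == 0:
        return 0.0
    # Clamp the fraction so overscroll does not push the preview out of range.
    fraction = min(max(editor_top / editor_range, 0.0), 1.0)
    return fraction * preview_range


# Editor scrolled 25% of the way down maps to 25% of the preview's range.
print(proportional_scroll(750, 4000, 1000, 9000, 1000))  # 2000.0
```

The mapping knows nothing about the content, so anything that renders at a very different height than its source text throws the two views out of step.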

MultiMarkdown Composer attempts to map the structure of the document so that the editor and the preview map to the same area, but even this is more complicated than it sounds (what about footnotes, for example?). Again, my code to handle this is not open-source. It is not perfect, but generally does a very good job, and is certainly more accurate than Mou when scrolling.

orklann commented 9 years ago

Yep, I am working on "yet another markdown editor for the mac", since I am not satisfied with the other markdown editors available. Maybe I need something new, and I am trying to bring some neat ideas to it. And yes, this market is crowded, but I'd like to try anyway.

Yep, Mou just uses the scroll percentage of both the editor and the preview view, so if there are image tags in the editor, then it will not be accurate at all.

On the preview side, we can just use a hashtag and JavaScript to scroll to the element with that hashtag. That part is easy. The main problem is on the editor side: if we can get the line number of the current node (the markdown node the caret is in), then we can simply bring the corresponding node in the HTML to the top of the preview (WebKit).

The way I track elements between both sides is to automatically generate IDs for them, so that corresponding elements on both sides have the same ID.
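A rough sketch of that ID scheme, assuming the parser can report the starting line of each top-level block. The names `assign_ids` and `id_for_line` and the `sync-` prefix are made up for illustration, and the list of starting lines is hard-coded here rather than taken from a real parser.

```python
from bisect import bisect_right


def assign_ids(block_start_lines, prefix="sync-"):
    """Pair each block's starting line with a generated ID, in document order."""
    return [(line, f"{prefix}{i}")
            for i, line in enumerate(sorted(block_start_lines))]


def id_for_line(blocks, caret_line):
    """Return the ID of the last block starting at or before caret_line."""
    if not blocks:
        return None
    starts = [line for line, _ in blocks]
    index = bisect_right(starts, caret_line) - 1
    # A caret above the first block falls back to the first block.
    return blocks[max(index, 0)][1]


# Hypothetical document with blocks starting on lines 1, 3 and 8.
blocks = assign_ids([1, 3, 8])
print(id_for_line(blocks, 5))  # sync-1
```

The same IDs would be written as `id` attributes on the generated HTML elements, so the preview can jump to the returned ID with something like `document.getElementById(id).scrollIntoView()`.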

fletcher / MultiMarkdown-4

Can I get line/column information from a node? #128