simdem2: better context parser implementation

Azure / simdem

Tool for Simulating Demo's, delivering Tutorials and using documentation as tests.

MIT License

34 stars 17 forks source link

simdem2: better context parser implementation #89

Closed miyuchina closed 6 years ago

miyuchina commented 6 years ago

This implementation of ContextParser uses a custom token class, Section, to help parse documentation sections like prerequisites and next steps. A Section is a heading followed by other block-level tokens. Those block-level tokens (Section.children) are further parsed to look for links to external documentations.

Parsing prerequisites or next steps are previously done by manipulating the output of mistletoe's ast_renderer.get_ast. The current solution is less fiddly, and one can easily extend ContextParser to accommodate other kinds of Section, should needs arise. (See also #85).

Because ContextParser is no longer dependent on mistletoe.ast_renderer, the test file for that can be removed.

The current implementation requires mistletoe v0.5 or above.

msftclas commented 6 years ago

All CLA requirements met.

SorraTheOrc commented 6 years ago

@lastcoolnameleft this PR is quite old and the code has moved on considerably. Have the recommendations within this PR been implemented? Can we close this?

SorraTheOrc commented 6 years ago

ping @lastcoolnameleft

lastcoolnameleft commented 6 years ago

I'm back from my travels and will review this next week. Thanks for the ping as I missed this earlier.

lastcoolnameleft commented 6 years ago

@miyuchina I deeply apologize for not incorporating this sooner as I missed it during my refactoring. I think I've diverged from your PR now that I'm not sure the best way to re-incorporate your proposal. After the feedback from you and @rgardler I realized I was going down the wrong path and did a course-correction.

The Context class is gone now, and I've instead implemented it as "SimDem1" parser, which has a different structure. (e.g. all of the commands and text are in the "body" section, next_steps are now called "toc", etc.)

For example, this document: https://github.com/Azure/simdem/blob/simdem2/examples/results-block-fail/README.md Is now parsed into: https://github.com/Azure/simdem/blob/simdem2/examples/results-block-fail/expected_result.seo

Also, https://github.com/Azure/simdem/blob/simdem2/examples/markdown-syntax/README.md is parsed to: https://github.com/Azure/simdem/blob/simdem2/examples/markdown-syntax/expected_result.seo If you notice, the JSON object has text somewhat like Markdown so when someone runs it, they can see emphasis. See https://github.com/Azure/simdem/blob/simdem2/examples/markdown-syntax/expected_result.seo#L30 )

Is that still possible given the design in this PR? Here's the new SimDem1 parser: https://github.com/Azure/simdem/blob/simdem2/simdem/parser/simdem1.py

I tried to incorporate it, but hit a wall when I tried to add the text emphasis.

Thanks, -Tommy.

miyuchina commented 6 years ago

Ah, in that case I think your implementation works fine, and I can close this pull request.

Regarding this gotcha, it might help if you rewrite the __getattr__ method this way:

def __getattr__(self, name):
    if name.startswith('render'):
        return lambda token: ''
    raise AttributeError('{} has no attribute {}'.format(repr(type(self).__name__), repr(name)))

... it would raise an AttributeError every time you call a function that does not start with render. This might make errors easier to spot.

Thanks again for the interest!

lastcoolnameleft commented 6 years ago

Thanks for the suggestion! I added it here:

https://github.com/Azure/simdem/commit/477beee65c3740b894d8dd73b378d2311c67ecd2