zakhenry / embedme

Utility for embedding code snippets into markdown documents
MIT License

[Feature Request] Support start and end tags for embedding snippets of code #48

Open srnagar opened 4 years ago

srnagar commented 4 years ago

I have been using this tool and find it very useful for keeping our documentation up to date whenever the code changes. Most of our documentation needs only a small snippet from a source file, which we currently select by line number. The problem with line numbers is that when a source file contains multiple snippets, editing one snippet can shift the line numbers of all the others, and the markdown files then need a lot of manual updating.

One way to solve this is to have start and end tags around the code sections that should be embedded. Then, as lines are added or removed, the tags keep the code snippets intact.

For example:

In source file (Java)

// embedme_start: tag-name
code snippet to be embedded
// embedme_end: tag-name

In markdown file:

<!-- embedme /path/to/source/file/Sample.java#tag-name -->

zakhenry commented 4 years ago

@srnagar I love this idea :) I actually have the same problem sometimes but haven't really thought much about it.

I'll have a think and might take a crack at it

srnagar commented 4 years ago

@zakhenry Thank you for your quick response! Looking forward to this :)

zakhenry commented 4 years ago

Options for anchored embed

Hi @srnagar, thanks again for your feature request. I've put a bit more thought into it and realised there are a few options/difficulties with the API that should be ironed out before implementation begins.

## In the markdown file

It is pretty clear how this will work in the markdown file:


    ```ts
    // path/to/file.ts#includeThisEnum
    ```

I think this syntax is fairly uncontentious: the difference between an anchor
identifier and the line-numbering syntax is pretty intuitive. If for some reason a
developer picks an anchor name that happens to look like a line range, it should
be pretty clear from the logs what has happened. It seems extremely unlikely that
someone would try to identify a docblock as something like `L42-L50`.
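
As a rough illustration of how the two fragment styles could be told apart, here is a minimal sketch. It is not embedme's actual implementation: the `EmbedTarget` type, the `parseEmbedTarget` function and the exact line-range pattern are assumptions made for this example.

```ts
// Hypothetical result shape for a parsed embed comment; not part of embedme's real API.
type EmbedTarget =
  | { kind: "whole"; path: string }
  | { kind: "lines"; path: string; start: number; end: number }
  | { kind: "anchor"; path: string; id: string };

// Assumed line-range form: L<start> or L<start>-L<end>, e.g. L42-L50.
const LINE_RANGE = /^L(\d+)(?:-L(\d+))?$/;

function parseEmbedTarget(comment: string): EmbedTarget {
  // Strip a leading `//` comment marker and surrounding whitespace.
  const target = comment.replace(/^\s*\/\/\s*/, "").trim();
  const hashIndex = target.indexOf("#");
  if (hashIndex === -1) {
    return { kind: "whole", path: target };
  }
  const path = target.slice(0, hashIndex);
  const fragment = target.slice(hashIndex + 1);
  const match = LINE_RANGE.exec(fragment);
  if (match !== null) {
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    return { kind: "lines", path, start, end };
  }
  // Anything that is not a line range is treated as an anchor identifier.
  return { kind: "anchor", path, id: fragment };
}
```

With this sketch, `// path/to/file.ts#includeThisEnum` is classified as an anchor, while `// path/to/file.ts#L42-L50` is classified as a line range, so an anchor literally named `L42-L50` would be shadowed by the line-range form.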

## In the source file
This is where it gets difficult. The embedme library does not want to understand
the AST of the language, so it cannot work out for itself which of the code that
follows a marker needs to be embedded. Here are a few options, with some
discussion of their pros and cons.

### Option 1 (html style)

```ts
export interface ThisIsIgnored {

}

/**
 * <embedme id="includeThisEnum" offset="+1, -1">
 */
export enum IncludeThisOne {

}
// </embedme>

export class IgnoreThisToo {

}
```

In this option, the syntax is basically a pair of matched HTML tags, with the tag name being `embedme` and the attributes defining the identifier and other options.

### Option 2 (duplicated tags)

export interface ThisIsIgnored {

}

/**
 * @embedme#includeThisEnum
 */
export enum IncludeThisOne {

}
// @embedme#includeThisEnum

export class IgnoreThisToo {

}

### Option 3 (start & end symbol)

export interface ThisIsIgnored {

}

/**
 * @embedme#includeThisEnum<<<
 */
export enum IncludeThisOne {

}
// @embedme#includeThisEnum>>>

export class IgnoreThisToo {

}

There are so many possible syntaxes for this, none of them good, I fear.

### Option 4 (start & end tags)

export interface ThisIsIgnored {

}

/**
 * @embedmeStart#includeThisEnum
 */
export enum IncludeThisOne {

}
// @embedmeEnd#includeThisEnum

export class IgnoreThisToo {

}

I think I'm biased towards option 4, but would like feedback regardless.

## Line offsetting

I think it makes sense for the source file to be able to define line offsetting because it might be fairly common for an embedded snippet to want to include or exclude an existing docblock.

Example (using option 4 above)

export interface OtherCode {

}

/**
 * @description - takes any number of numbers and adds them up.
 * @returns number - the addition of all arguments
 * @example - add(1, 1); // returns 2
 * @example - add(1, 2, 3); // returns 6
 * @embedmeStart#addWithDocBlock
 */
export function add(...operands: number[]): number {
    return operands.reduce((s, o) => s + o, 0);
}
// @embedmeEnd#addWithDocBlock

export class FurtherCode {

}

In this scenario we want to either a) ignore the docblock entirely, or b) include it.

This can be done in one of two ways. The first is in the markdown file, with something like:

    ```ts
    // path/to/file.ts#includeThisEnum(-5)
    ```

where it is assumed that the default case ignores the `@embedme` lines.

Or, we could do it in the source file itself:

```ts
export interface OtherCode {

}

/**
 * @description - takes any number of numbers and adds them up.
 * @returns number - the addition of all arguments
 * @example - add(1, 1); // returns 2
 * @example - add(1, 2, 3); // returns 6
 * @embedmeStart#addWithDocBlock(-5)
 */
export function add(...operands: number[]): number {
    return operands.reduce((s, o) => s + o, 0);
}
// @embedmeEnd#addWithDocBlock

export class FurtherCode {

}
```

Perhaps both options could be supported, with the markdown document taking precedence in the case that both are defined.
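
The precedence rule could look something like the sketch below. The names `AnchorRef`, `parseAnchorRef` and `resolveOffset` are hypothetical, and the sketch only decides which offset applies; what an offset such as `(-5)` should do to the snippet boundaries is left open, since that is still being discussed here.

```ts
// Hypothetical shape: an anchor id plus the optional `(n)` offset suffix
// used in the examples above.
interface AnchorRef {
  id: string;
  offset?: number; // undefined when no `(n)` suffix is present
}

// Parses `addWithDocBlock(-5)` or `addWithDocBlock` into an id and an optional offset.
function parseAnchorRef(fragment: string): AnchorRef {
  const match = /^([A-Za-z_][\w-]*)(?:\(([+-]?\d+)\))?$/.exec(fragment);
  if (match === null) {
    throw new Error("Not a valid anchor reference: " + fragment);
  }
  return match[2] === undefined
    ? { id: match[1] }
    : { id: match[1], offset: Number(match[2]) };
}

// The markdown-side offset wins when both sides define one; otherwise fall back
// to the source-side offset, and finally to 0 (no adjustment).
function resolveOffset(markdown: AnchorRef, source: AnchorRef): number {
  if (markdown.offset !== undefined) {
    return markdown.offset;
  }
  if (source.offset !== undefined) {
    return source.offset;
  }
  return 0;
}
```

Keeping the parsing and the precedence decision separate means the same parser can serve both the markdown comment and the source marker.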


Either way there are a few difficulties with this proposal that may need to be ironed out before implementation.

I also suspect the simple procedural approach this library has taken is reaching its limits of abstraction, so I should probably do a decent refactor towards a more pluggable and testable architecture before supporting more complex behaviours like this one.

I'm still keen to add this feature, but it's looking like it may not quite be as straightforward as I'd hoped.

srnagar commented 4 years ago

@zakhenry Thank you for following up on this and laying out a detailed list of options!

I like option 4 too! Having clear start and end tags makes it easy to maintain and understand.

For line offsetting, I am assuming you are referring to whether the embedded snippet should include the documentation around the code.

With the syntax @embedmeStart#addWithDocBlock(-5), do you mean including the 5 lines before the start tag plus all the lines below it up to the end tag?

The other option is to put the start and end tags each on their own separate comment line.

For example, to include the documentation, we could use this syntax:

// @embedmeStart#addWithDocBlock
/**
 * @description - takes any number of numbers and adds them up.
 * @returns number - the addition of all arguments
 * @example - add(1, 1); // returns 2
 * @example - add(1, 2, 3); // returns 6
 */
export function add(...operands: number[]): number {
    return operands.reduce((s, o) => s + o, 0);
}
// @embedmeEnd#addWithDocBlock

To exclude the documentation, we could use this:

/**
 * @description - takes any number of numbers and adds them up.
 * @returns number - the addition of all arguments
 * @example - add(1, 1); // returns 2
 * @example - add(1, 2, 3); // returns 6
 */
// @embedmeStart#addWithoutDocBlock
export function add(...operands: number[]): number {
    return operands.reduce((s, o) => s + o, 0);
}
// @embedmeEnd#addWithoutDocBlock

With this syntax, the @embedme tag gets its own comment line (nothing else should be included in that comment - this also makes it easier to validate tags) and everything within the start and end tags is embedded without needing to know what's inside the tags (code or documentation).
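
A minimal sketch of this "own comment line" variant is shown below, assuming `//` line comments and the option 4 tag names. The `MARKER` pattern and the `extractSnippet` function are hypothetical and only illustrate the idea; they are not part of embedme.

```ts
// Assumption: each marker sits alone in a `//` line comment, e.g.
// `// @embedmeStart#addWithDocBlock`. Tag names follow option 4 from this thread.
const MARKER = /^\s*\/\/\s*@embedme(Start|End)#([\w-]+)\s*$/;

function extractSnippet(source: string, id: string): string {
  const lines = source.split(/\r?\n/);
  let start = -1;
  for (let i = 0; i < lines.length; i++) {
    const match = MARKER.exec(lines[i]);
    if (match === null || match[2] !== id) {
      continue; // ordinary line, or a marker for a different snippet
    }
    if (match[1] === "Start" && start === -1) {
      start = i;
    } else if (match[1] === "End" && start !== -1) {
      // Everything strictly between the two marker lines is embedded verbatim,
      // whether it is code or documentation.
      return lines.slice(start + 1, i).join("\n");
    }
  }
  throw new Error((start === -1 ? "Start" : "End") + " tag not found for: " + id);
}
```

Because the markers are matched as whole lines, no line offset is needed: moving the start marker above or below the docblock decides whether the documentation is included. Note that this sketch does not strip markers of other snippets that happen to sit inside the extracted range.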

One issue with line offset is that when new lines are added or existing lines are deleted this offset also has to be updated.

PS: I understand and appreciate the complexity involved in refactoring and adding this feature. Thank you for keeping this issue open and attempting to enable this feature!

rodmax commented 4 years ago

Hi guys. Nice tool and nice feature request. My suggestion is similar to https://github.com/zakhenry/embedme/issues/48#issuecomment-570348249 but even more compact:

// @embedmeStart#myTag
...code lines
// @embedmeEnd

My suggestion is based on the following assumptions:

p.s. In any case this feature will be very useful for me, thanks

milesfrain commented 4 years ago

The folks who wrote mdBook put some thought into this already. It would be great to match their format to maximize compatibility in case anyone wants to host docs both via the GitHub viewer and on a dedicated webpage. The mdBook format is really similar to the ideas already proposed in this thread, which is a good sign that we're on the right track:

/* ANCHOR: all */

// ANCHOR: component
struct Paddle {
    hello: f32,
}
// ANCHOR_END: component

////////// ANCHOR: system
impl System for MySystem { ... }
////////// ANCHOR_END: system

/* ANCHOR_END: all */

I think it's best to support named end tags that match their start tags, in case a user wants to create overlapping snippets, even though overlapping is unlikely and named end anchors carry a risk of typos:

// ANCHOR: foo
... foo only
// ANCHOR: bar
... foo and bar
// ANCHOR_END: foo
... just bar
// ANCHOR_END: bar

Results in this snippet for foo:

... foo only
... foo and bar

And this snippet for bar:

... foo and bar
... just bar

It might be possible to support both options and assume that missing end labels mean nesting, although a consistent scheme probably needs to be enforced per file to simplify parsing and avoid multiple "correct" outputs.
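
To make the overlapping case concrete, here is a small sketch that collects every named region in one pass. It is neither mdBook's nor embedme's implementation: the `ANCHOR_LINE` pattern and the `extractAnchors` function are assumptions for illustration, and the pattern treats any line containing `ANCHOR: name` or `ANCHOR_END: name` as a marker, whatever the comment style.

```ts
// Assumed marker pattern; group 1 is set only for ANCHOR_END lines.
const ANCHOR_LINE = /ANCHOR(_END)?:\s*([\w-]+)/;

function extractAnchors(source: string): { [name: string]: string[] } {
  const regions: { [name: string]: string[] } = {};
  const openNames: string[] = []; // anchors that have started but not yet ended
  const lines = source.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const match = ANCHOR_LINE.exec(lines[i]);
    if (match === null) {
      // Ordinary line: it belongs to every anchor that is currently open.
      for (let j = 0; j < openNames.length; j++) {
        regions[openNames[j]].push(lines[i]);
      }
      continue;
    }
    // Marker lines themselves are never copied into any region.
    const name = match[2];
    if (match[1] === undefined) {
      if (Object.prototype.hasOwnProperty.call(regions, name)) {
        throw new Error("Duplicate anchor: " + name);
      }
      regions[name] = [];
      openNames.push(name);
    } else {
      const index = openNames.indexOf(name);
      if (index === -1) {
        throw new Error("ANCHOR_END without matching ANCHOR: " + name);
      }
      openNames.splice(index, 1);
    }
  }
  if (openNames.length > 0) {
    throw new Error("Unclosed anchors: " + openNames.join(", "));
  }
  return regions;
}
```

Run on the `foo`/`bar` example above, this yields the two snippets listed there. Because end markers are matched by name, a typo in an end anchor surfaces as an error instead of silently producing the wrong snippet.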

ivank commented 2 years ago

Hey guys, I've recently stumbled on this project, and I wish I had found it earlier, since I independently coded something quite similar myself - https://github.com/ivank/build-readme#readme. It does have the "snippet" feature people here are talking about.

It might not be as polished as this tool, but I've been using it for quite a while in one of my OSS mono-repos and it has helped me stay sane while supporting the sprawling documentation there - https://github.com/ovotech/laminar