kevin940726 / remark-code-import

📝 Populate code blocks from files
https://npm.im/remark-code-import
MIT License
63 stars 11 forks source link

Accept `basePath` option. #17

Open johnrom opened 2 years ago

johnrom commented 2 years ago

This allows you to specify a base path where all of your markdown files can source code from. This allows documentation to remain maintainable by not worrying about where within the docs you are sourcing code from.

Supersedes #11

Example:

The filesystem

my-app/
  content/
    guides/
      advanced/
        my-advanced-example.md
    my-example.md
  src/
    my-script.js
  remark-plugins.js

remark-plugins.js

const plugins = [
  {
    resolve: 'remark-code-import',
    options: {
      async: true,
      basePath: '.',
    },
  },
];

my-example.md OR guides/advanced/my-advanced-example.md

```js file=./src/my-script.js


As you can see, it adds less overhead for documentation maintenance to be able to use a base path from which to source file content.  Now I can rename the directories and move documents around without concern of breaking the documentation.
kevin940726 commented 2 years ago

Hi! As I said in #11, I don't think an option should change how relative paths work. The imported files will always be relative to the markdown files, this is just what this plugin is about essentially.

If you want to opt-out of this behavior, you can already do that by specifying absolute paths in the code block meta. I understand if specifying the full disc path every time could be tedious, so I propose an optional rootPath option to change the root directory the paths resolved from.

If the project lives in /home/projects/my-project and sets rootPath to /home/projects/my-project, all absolute paths will be referencing from that path instead of the system root path (/). So file=/src/path.js will point to /home/projects/my-project/src/path.js instead.

What do you think of this approach? I think it's more predictable and doesn't change how the plugin works.

johnrom commented 2 years ago

@kevin940726 I'm sorry I didn't really know what you meant before, but now I understand. However, I think what you propose would be a breaking change, whereas mine is not. In Linux and Mac, starting with a slash denotes an absolute path (root of Filesystem). That means currently, it is possible for Linux and Mac users to use a full absolute path to their Filesystem (path.resolve will treat a leading slash in any segment as absolute to the root fs). Users can mix these with paths relative to the file - it supports both.

On the other hand, using a relative path does exactly what it means - a path relative to somewhere. Without the basePath option, this would mean the same thing it does today - - a path relative to the current markdown file. With a basePath configured however, it would mean a path relative to the basePath - essentially this PR is opt-in, meaning it's not a breaking change at all.

If there is a preprocessor before this plugin, someone might add an environment variable pointing to an entirely different project to source their code snippets, using rootPath as you describe would not allow a person to use this absolute path anymore.

Using basePath supports both absolute and relative paths.

If you would like to support all three conditions - abs paths, paths relative to new basePath, paths relative to file - it is somewhat of a convention (not a standard however) for ~ to denote "relative to a special place". It could make sense to add that, so users can do ~/my-path to point to the path relative to basePath, however for Linux and Mac users I believe this would also mean relative to the home directory so maybe that wouldn't work either.

We could take a page from vscode's configs and check for something like {basePath}.

kevin940726 commented 2 years ago

I don't think it would be a breaking change. We can only reference absolute paths from rootPath if it's specified in the options, otherwise just behave like what it currently does. By providing the rootPath options, you intentionally opt-in to this new behavior to transform all absolute paths in code block meta. I think it should be fine given that there's rarely a case to import file from outside the project's root, we could even argue that it'd be a security issue to allow that by default. We'll probably forbid importing files outside the root directory of the project by default in future major versions too.

Though I understand the convention of leading slash may not be ideal. In which case, I'd suggest a special token like <rootPath> to indicate the exact rootPath pass to the options. This is very similar to Jest's <rootDir> approach.

johnrom commented 2 years ago

From a realistic library maintenance standpoint (I'm trying to integrate this into the documentation of a different library), it doesn't make sense (to me) to take away users' ability to reference absolute paths relative to their filesystem as those paths are generally outside of the responsibility of the current project. Overriding the meaning of the absolute path would make a user choose between rootDir configuration, or using absolute paths, when I think both paths relative to a known place and absolute paths should be supported.

We could get the best of all worlds by adding a configuration like tsconfig.paths or webpack's alias. It would allow you to predictably source code from multiple sources.

{ 
  resolve: 'remark-code-import',
  options: {
    paths: {
      myScripts: path.resolve(__dirname, '../my-scripts'),
      // could be absolute, could be relative, it doesn't matter
      myOtherScripts: path.resolve(__dirname, process.env.myOtherScriptsPath),
    }
  }
}

With Tokens:

```js file={myScripts}/my-script.js

### Without Tokens:

````md
```js file=myOtherScripts/my-other-script.js

I think it's less predictable without tokens, so I'd prefer to use some kind of token. In terms of tokens, here are some options:

- `<>`: could conflict with mdx / be interpreted as html
- `{}`: more conventionally used to signify a variable. but, I think MDX also allows variable interpolation `${}` so it might be confusing
- `%myScripts%`: ugly, shouldn't conflict with anything though

I'd prefer `{}`, but that's just what I'm used to. Whatever we choose, we want to make sure if this plugin is removed, it doesn't break remark. I'm actually not sure if there are character limitations in the meta of the markdown AST.

Also, to me {rootDir} would make me think it's supposed to reference `/` so I'd use `baseDir` or `basePath` or similar to reference a reserved path variable. TypeScript's config uses BaseUrl (not sure why as it's not a web address) in conjunction with paths, so the configuration would be like:

{ basePath: '../', paths: { myScripts: "my-scripts" // this is ../my-scripts. } }


````md
```js file=../my-scripts/my-script.js
kevin940726 commented 2 years ago

Neither the rootPath option or the token approach will change any paths outside the file= code block meta. It will not be parsed by remark or other preprocessor but only by this plugin alone.

I'd prefer <rootPath> or <rootDir> to be the token because it's already an familiar concept in Jest.

But then again, I think it's very rare to import any files outside the current working directory. We could still support that though, just not by default. So the rootPath option might still be possible.

johnrom commented 2 years ago

I think it's very rare to import any files outside the current working directory. We could still support that though, just not by default.

What about monorepos? I'm confused by this statement. I'd avoid removing support for abs paths. Overriding the absolute path does not make sense for large projects or teams who could easily put an environment variable in there to point to a different project, and makes a lot of assumptions about how the plugin is used. I think it's best to be assumption-free, add functionality when it's useful, and not create any unnecessary choices.

Using the paths option is going to be the best way to provide the most options for importing, but I understand if you don't want to add that kind of complexity to this project. I just need to find a library that's going to provide the code imports I need for a large codebase.

Maybe it would help to show you what I'm trying to accomplish.

importing code: https://github.com/johnrom/formik/blob/johnrom/v3-docs/docs/overview.md?plain=1#L110 using basePath with my fork: https://github.com/johnrom/formik/blob/johnrom/v3-docs/website/src/lib/docs/remark-plugins.js#L21-L24

This project is part of a huge monorepo full of examples, codesandboxes, and of course, actual code. The docs markdown files aren't even part of the same project as the website that build them using this plugin. I want to create a system which is essentially self-documenting -- the documentation examples can pull from the example projects, which are actually built as part of the build process, and could also include the code of the project itself, so the website examples can be maintained at the same time the example projects are updated as the codebase evolves.

This is sort of my test project for how I'm going to scale documentation as a whole on all the projects I work on. Each of them has different requirements, different directory structure, etc, and probably needs to pull docs from different places.

Let me give you another kind of silly example. I was reading ElasticSearch docs one day, and I noticed this little bug:

// tag::avg-bucket-agg-syntax[]               
      "avg_bucket": {
        "buckets_path": "sales_per_month>sales",
        "gap_policy": "skip",
        "format": "#,##0.00;(#,##0.00)"
      }
// end::avg-bucket-agg-syntax[]

source: https://www.elastic.co/guide/en/elasticsearch/reference/7.15/search-aggregations-pipeline-avg-bucket-aggregation.html#avg-bucket-agg-ex

That's when I got the idea this huge documentation website is very likely using a repository of code examples that are (I assume) built into their build process for the documentation to make sure the documentation is up to date with the latest versions -- like unit testing your documentation. That's the goal, and to achieve a goal like that the code importer needs to be really flexible.

TL:DR; the code importer needs to be really flexible. Can we make this really flexible so I can use it in a bunch of things? This is possible with my paths configuration suggested a few comments back.