ChristopherBiscardi / gatsby-mdx

Gatsby+MDX • Transformers, CMS UI Extensions, and Ecosystem Components for ambitious projects
https://gatsby-mdx.netlify.com/

Component to render code snippet given git revision and line ranges #70

Closed hsribei closed 5 years ago

hsribei commented 6 years ago

Is your feature request related to a problem? Please describe.

I'm always frustrated when I check the docs for a function (say at reactjs.org) and it doesn't have a "View source" link that immediately shows me the implementation inline.

Describe the solution you'd like

I'd like to have a component that I could use like this:

<GitSnippet
  revision={"master:./README"}
  highlightRange={"5-6, 14-17"}
  hidelineRange={"1-3"}
/>

The revision field would ideally accept as much as possible of the gitrevisions syntax (example). Highlighting and "hidelining" could be delegated to remark-embed-snippet.
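To make the proposed props concrete, here is a minimal sketch of how the `revision`, `highlightRange`, and `hidelineRange` strings might be interpreted. All function names are hypothetical (nothing here comes from gatsby-mdx or remark-embed-snippet), and only the `<rev>:<path>` form of revision is handled, not the full gitrevisions syntax.

```javascript
// Hypothetical helpers for illustration; not part of any existing package.

// Split "master:./README" into a revision and a path at the first colon,
// mirroring the <rev>:<path> form.
function parseRevision(spec) {
  const i = spec.indexOf(":");
  if (i === -1) return { rev: spec, path: null };
  return { rev: spec.slice(0, i), path: spec.slice(i + 1) };
}

// Turn "5-6, 14-17" into [[5, 6], [14, 17]]; a bare "9" becomes [9, 9].
function parseRanges(text) {
  return text
    .split(",")
    .map((part) => part.trim())
    .filter(Boolean)
    .map((part) => {
      const [start, end = start] = part.split("-").map(Number);
      if (!Number.isInteger(start) || !Number.isInteger(end) || start > end) {
        throw new Error(`Invalid line range: "${part}"`);
      }
      return [start, end];
    });
}

const inRanges = (n, ranges) => ranges.some(([a, b]) => n >= a && n <= b);

// Number the lines (1-based), drop hidden ones, and flag highlighted ones.
function selectLines(source, { highlightRange = "", hidelineRange = "" } = {}) {
  const highlight = parseRanges(highlightRange);
  const hide = parseRanges(hidelineRange);
  return source
    .split("\n")
    .map((text, i) => ({ number: i + 1, text }))
    .filter((line) => !inRanges(line.number, hide))
    .map((line) => ({ ...line, highlighted: inRanges(line.number, highlight) }));
}
```

Keeping the original line numbers on each entry means highlighting still refers to positions in the source file even after lines are hidden.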

Describe alternatives you've considered

The alternative seems to be copy/pasting source chunks into markdown, making it error-prone, impractical at scale, and hard to update.

Additional context

This issue is a continuation of a twitter thread discussing the idea.

ijsnow commented 6 years ago

Ah! I guess I slightly misunderstood your initial tweet. Don't worry, I can help with this as well, and it'll go well with what I had in mind with codeintellify.

So sourcegraph.com exposes a graphql API where you can fetch the contents of a file at a specific revision. Here's an example of it fetching this repo's package.json. This will work out of the box for any open source library!
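As a rough illustration of that kind of request, the sketch below builds a GraphQL query for one file at one revision. The endpoint URL and the field names (`repository`, `commit`, `file`, `content`) are assumptions that should be checked against Sourcegraph's API documentation, and the helper names are hypothetical.

```javascript
// Assumed endpoint and schema; verify against Sourcegraph's API docs before relying on them.
const SOURCEGRAPH_GRAPHQL_URL = "https://sourcegraph.com/.api/graphql";

const FILE_CONTENT_QUERY = `
  query FileContent($repo: String!, $rev: String!, $path: String!) {
    repository(name: $repo) {
      commit(rev: $rev) {
        file(path: $path) {
          content
        }
      }
    }
  }
`;

// Build the fetch() arguments for one file at one revision (no network call here).
function buildFileContentRequest({ repo, rev, path }) {
  return {
    url: SOURCEGRAPH_GRAPHQL_URL,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ query: FILE_CONTENT_QUERY, variables: { repo, rev, path } }),
    },
  };
}

// Pull the file text out of a response shaped like the query above.
function extractContent(responseJson) {
  const file = responseJson?.data?.repository?.commit?.file;
  if (!file) throw new Error("File not found in response");
  return file.content;
}

// Usage (performs a real request; needs a runtime with a global fetch):
// const { url, init } = buildFileContentRequest({
//   repo: "github.com/ChristopherBiscardi/gatsby-mdx",
//   rev: "master",
//   path: "package.json",
// });
// const content = extractContent(await (await fetch(url, init)).json());
```

Separating request construction from the network call keeps the same code usable at runtime in the browser or at build time in Node.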

ijsnow commented 6 years ago

I can open another issue for this if needed, but I think we could build the codeintellify support (tooltip with docs/go to def/find refs) right into the GitSnippet pretty easily! The information provided to GitSnippet would be sufficient for codeintellify to work.

I'd love to help with the implementation. How should we move forward?

ijsnow commented 6 years ago

After thinking about it a little more, I think this could be published on its own and doesn't necessarily need to be a part of gatsby-mdx. It's more that gatsby-mdx allows for this to be easily used in Gatsby sites!

Let me know your thoughts on this.

ChristopherBiscardi commented 6 years ago

> After thinking about it a little more, I think this could be published on its own and doesn't necessarily need to be a part of gatsby-mdx. It's more that gatsby-mdx allows for this to be easily used in Gatsby sites!

Totally. This is a multi-package repo intended to foster more of the ecosystem than just the gatsby-mdx package. We've started this with gatsby-plugin-mdx-deck, are expanding to things like ui-extension-contentful and ui-extension-netlify for authoring MDX in CMSs, and I want to offer component libraries as a drop-in for MDXProvider's components as well (as separate packages). We would be able to ship such things as "documentation starters" that extend from the core "docs components".

This work roughly falls under: https://github.com/ChristopherBiscardi/gatsby-mdx/issues/80 even though GitSnippet as we're thinking about it today doesn't replace a <pre><code> block.

> So sourcegraph.com exposes a graphql API where you can fetch the contents of a file at a specific revision

RE: codeintellify, is that a more dynamic thing? Would it run graphql queries at runtime to display the popovers or is there a gatsby-source-sourcegraph that we could use to fetch the data at build time? It seems to me that the best way to accomplish GitSnippet is to let it grab the data at build-time (and the easiest/most dynamic is to let it grab at runtime). We might be able to embed StaticQueries into the component or could do some other AST traversal in the MDX pipeline to pull out and execute the relevant queries for gitsnippets.

I added a gitsnippet package in a new components directory to kick this off. It has a README basically repeating what we've said here and on twitter to describe the intended result. I'm interested in seeing the runtime version first if that's easier, but long-term I think doing a build-time version could be a stronger move for offline-capable sites, etc. I'll add y'all as maintainers if you want to get this going :)

ijsnow commented 6 years ago

Ah perfect. Sounds like we're on the same page.

Codeintellify doesn't talk to our graphql API. It talks to a language server to get the information. It can be any language server, but I'd suggest just pointing it at Sourcegraph's language servers because we've optimized them for the web (LSP was originally intended just for running locally for your editors). We have a proxy that we send all requests to, for every language; it handles forwarding them to the proper language server and applies a few other optimizations that make it super easy to use.

More specifically, you give codeintellify two properties that are functions that are expected to communicate with whatever LSP you want to use. See our implementations for our browser extension here if you're curious.
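For a sense of what such functions might send, here is a sketch that builds Language Server Protocol requests as JSON-RPC 2.0 messages. The `textDocument/hover` and `textDocument/definition` methods and their parameter shape come from the LSP specification; the callback names and the `send` transport are hypothetical and are not codeintellify's actual option names.

```javascript
// LSP positions are zero-based, while editors and line ranges are usually one-based.
function toLspPosition({ line, column }) {
  return { line: line - 1, character: column - 1 };
}

let nextId = 1;

// Build a JSON-RPC 2.0 request for an LSP text-document method.
function buildLspRequest(method, uri, position) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method,
    params: { textDocument: { uri }, position: toLspPosition(position) },
  };
}

// Hypothetical names for the two callbacks; `send` is whatever transport
// (HTTP proxy, WebSocket, ...) reaches the language server.
function createLspCallbacks(send) {
  return {
    fetchHoverSketch: (uri, position) =>
      send(buildLspRequest("textDocument/hover", uri, position)),
    fetchDefinitionSketch: (uri, position) =>
      send(buildLspRequest("textDocument/definition", uri, position)),
  };
}
```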

The full list of options is here. Sorry, there's no documentation yet, as the two authors (including me) have only been using this internally. Maybe we can use this to document itself :)

I can own all parts of the implementation including communicating with Sourcegraph or codeintellify. I started to look into making it last week and started having questions around what output I should use for the code table. I wanted to use the built-in gatsby one but am not sure how to implement that inside a react component to be honest.

Our product's gatsby site doesn't have code examples (other than config), and I don't work on it much.

ijsnow commented 6 years ago

A little more about codeintellify.

Overview

It won't be as easy as

<GitSnippet ... />
...
<GitSnippet ... />
... 
<GitSnippet ... />

but it will be as easy as

<CodeIntelligenceProvider>
  <GitSnippet ... />
  ...
  <GitSnippet ... />
  ...
  <GitSnippet ... />
</CodeIntelligenceProvider>

Details

A single page can have many code examples. When building codeintellify, we built this support in for PR pages showing many diffs. We create one "hoverifier" and "hoverify" many "code views" with that one "hoverifier". This ensures that only one tooltip shows at a time and that the page doesn't have to do the same work multiple times.
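The pattern of one shared "hoverifier" serving many code views can be sketched roughly as below. This is a simplified, synchronous illustration with hypothetical names, not codeintellify's real API: a single object owns the one active tooltip and a cache, and each code view only reports events to it.

```javascript
// Hypothetical sketch of the shared-hoverifier idea; not codeintellify's API.
function createHoverifier(fetchHover) {
  const cache = new Map(); // key -> hover result, so repeated hovers do no extra work
  let active = null;       // the single tooltip currently shown, or null

  function hoverify(viewId) {
    return {
      // Called when the pointer rests on a token in this code view.
      hover(line, character) {
        const key = `${viewId}:${line}:${character}`;
        if (!cache.has(key)) cache.set(key, fetchHover(viewId, line, character));
        // Assigning `active` replaces any tooltip owned by another view.
        active = { viewId, line, character, contents: cache.get(key) };
        return active;
      },
      // Called when the pointer leaves; only clears the tooltip if this view owns it.
      leave() {
        if (active && active.viewId === viewId) active = null;
      },
    };
  }

  return { hoverify, getActiveTooltip: () => active };
}
```

Because every view shares the same `active` slot, two tooltips can never be visible at once, which is the behaviour a wrapping provider component would give to all snippets on a page.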

I don't know whether passing children into react components like this fits best practices or norms in mdx, but it is the best way to use codeintellify without certain things potentially feeling buggy (like multiple tooltips showing up).

We do something similar in our web app for diff pages. In our browser extension we just pull the code views from the DOM so we don't use react.

hsribei commented 6 years ago

> it will be as easy as
>
> <CodeIntelligenceProvider>
>   <GitSnippet ... />
>   ...
>   <GitSnippet ... />
>   ...
>   <GitSnippet ... />
> </CodeIntelligenceProvider>

I actually prefer it like this because it keeps things decoupled.

There are 3 potential scenarios I can imagine. Let me know if this matches what everyone's thinking:

  1. I think it should be possible to statically render a <GitSnippet ... /> into pure html+css to be included on the page without any code intelligence. It would be just like a regular code block, but instead of hardcoded into the markdown, it would be sourced at build time from a git repository.

  2. This <GitSnippet ... /> could then be progressively enhanced on the client-side via <CodeIntelligenceProvider ... /> which could query a language server via ajax to provide code intelligence.

  3. Ideally though, it would be possible to move 2. into build time, as I think hinted by @ChristopherBiscardi. This means that both the code snippet and the code intelligence could be sourced and rendered at build time. For instance, UI that shows on hover could be shipped in the prerendered html and then toggled via css :hover. Some JS could be used, but there would be no need to make network requests.
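Scenario 3 could look something like the sketch below, which prerenders hover documentation into the HTML and reveals it with CSS `:hover` alone. The token shape, class names, and function names are hypothetical, and the hover text is assumed to have been gathered at build time by some earlier step.

```javascript
// Escape text for safe inclusion in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// tokens: [{ text, hover? }] where `hover` is documentation gathered at build time.
function renderLine(tokens) {
  return tokens
    .map(({ text, hover }) =>
      hover
        ? `<span class="tok">${escapeHtml(text)}<span class="tip">${escapeHtml(hover)}</span></span>`
        : escapeHtml(text)
    )
    .join("");
}

// The tooltip ships in the HTML and is revealed purely by CSS, with no network request.
const HOVER_CSS = `
.tok { position: relative; }
.tok .tip { display: none; position: absolute; top: 1.5em; left: 0; }
.tok:hover .tip { display: block; }
`;
```

The trade-off is page weight: every tooltip is shipped whether or not it is ever opened, in exchange for working offline and without JavaScript.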

My concern is that although it seems like the code intelligence part can be done completely from open source pieces (codeintellify + LSP server), querying the code itself from graphql requires a sourcegraph server. It's not a show-stopper, but it would be better if this dependency was easy to replace later.

ChristopherBiscardi commented 6 years ago

At minimum, I think the Provider could be placed into a layout (default export for mdx or defaultLayout), which would apply it as described. Presumably one could even use the new wrapRootElement APIs to have it apply to the whole site by default. (I also like this decoupling)

export default ({ children, ...props }) => (
  <CodeIntelligenceProvider {...props}>
    {children}
  </CodeIntelligenceProvider>
)

# Some stuff

<GitSnippet {...} />

more content

<GitSnippet {...} />

Out of curiosity, I noticed all the types on that diff page are any. How does sourcegraph pull type information for JS?

ijsnow commented 6 years ago

I definitely don't want to require Sourcegraph for this to work. Obviously, it should work for everyone for free.

Getting the code

This will work out of the box for anything that is open source and hosted on GitHub, because sourcegraph.com automatically pulls in open source projects from GitHub.

However, if we can get the code at build time, that would be great for everyone, including closed-source projects, and for offline support. I'll defer to you two on the implementation for that. I don't know much about the gatsby internals yet! My question for this would be: what if people have a separate repo for their docs site? For example, we have sourcegraph/website (closed source for now, but the plan is to open it soon) for our about.sourcegraph.com and other repos for everything else.

Code Intelligence

This will be a bit more tricky for closed-source projects that don't want to run their own Sourcegraph instance, because they'd have to spin up and host their own language servers. It's definitely possible but will require a lot more work. As I mentioned before, for open-source code it'll work out of the box using sourcegraph.com, as it already has language servers running and will be free and running forever. We are actually working on making it easier for the community to run their own language servers, but that's a work in progress.

> How does sourcegraph pull type information for JS?

We're using the javascript-typescript-langserver for code intelligence for javascript. JavaScript is inherently a very hard language to get type information for, and our language server for it is admittedly not that great for types. However, it does pull in /** JSDoc blocks */ and display them, so the more documentation in the code, the better the code intelligence for JS is. We would love community contributions for making it better!

ChristopherBiscardi commented 6 years ago

@hsribei yeah, that matches what I was thinking.

For #1, doing the query at runtime seems trivial, but since this is a static site generator I believe the "pre-fetch and include as MDX" approach you described is the right one. I have to think a bit about how best to add support for MDX components that want to make graphql queries based on their props at build time. I can think of two potential places in the AST transformation where this would make sense, one of which basically has full-fledged babel support. While the runtime query approach could use a generic React component (the output is, after all, "just components"), I believe the build-time query approach will need some kind of babel plugin to handle the processing.
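Whatever form the build-time step takes, it first has to find every GitSnippet usage and its props. The sketch below does that with regular expressions purely as a stand-in for the AST or babel pass discussed here. It is not how gatsby-mdx works, the function name is hypothetical, and it only understands self-closing tags whose props are plain strings without a `>` character in them.

```javascript
// Regex stand-in for a real AST/Babel pass; it only understands self-closing
// <GitSnippet ... /> tags whose props look like name={"string"} or name="string".
const TAG = /<GitSnippet\b([^>]*?)\/>/g;
const PROP = /(\w+)=(?:\{\s*"([^"]*)"\s*\}|"([^"]*)")/g;

// Return one props object per GitSnippet found in the MDX source text.
function collectGitSnippets(mdxSource) {
  const snippets = [];
  for (const tag of mdxSource.matchAll(TAG)) {
    const props = {};
    for (const [, name, braced, quoted] of tag[1].matchAll(PROP)) {
      props[name] = braced !== undefined ? braced : quoted;
    }
    snippets.push(props);
  }
  return snippets;
}
```

Each collected props object identifies a file and revision to fetch before rendering, so the fetched text could be inlined into the page at build time.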

For the actual code, what's the best place to fetch it? I see a few different possibilities:

  1. github.com
  2. The current git repo
    • "give me the content of the blob at this filepath 3 commits ago from this repo"
    • not sure how useful this is
  3. Any arbitrary remote git repo
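
For option 2, reading a blob from the current repository is what `git show <rev>:<path>` does, so "this filepath 3 commits ago" maps to a revision such as `HEAD~3`. The sketch below wraps that command. The function names are hypothetical, and it assumes `git` is on the PATH and that the build runs inside a git work tree.

```javascript
const { execFileSync } = require("child_process");

// Arguments for `git show <rev>:<path>`, which prints a blob at a revision.
// A plain path is resolved from the repository root; prefix it with "./"
// to resolve it relative to the current directory instead.
function gitShowArgs(rev, path) {
  return ["show", `${rev}:${path}`];
}

// Read a file from the local repository at a revision such as "HEAD~3".
// Assumes `git` is installed and `cwd` is inside a git work tree.
function readBlob(rev, path, cwd = process.cwd()) {
  return execFileSync("git", gitShowArgs(rev, path), { cwd, encoding: "utf8" });
}

// Usage: const readmeThreeCommitsAgo = readBlob("HEAD~3", "README.md");
```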

ChristopherBiscardi commented 6 years ago

Certainly seems like the first version of this should work for open source projects and we can add support for auth'd/closed projects later.

After checking out the GitHub APIs, etc., the Sourcegraph API (as described earlier) seems like the best starting point for getting file content. Maybe we can implement alternate backends later, but that definitely seems like the easiest/quickest/current best approach to start with.

ijsnow commented 6 years ago

Solving for open source first sounds like a solid plan.

How should we get started? It seems like I'd be the biggest help with the Sourcegraph communication and codeintellify stuff. As for generating the code output, I've seen the gatsby snippet plugin, but I'm not sure what we want to do.