elastic / package-spec

EPR package specifications
Other
17 stars 70 forks source link

[Change Proposal] Compatibility with next Elastic docs #166

Open goodroot opened 3 years ago

goodroot commented 3 years ago

Proposal: make packages/integrations compatible within the next Elastic docs system.


The next docs system uses MDX, an extension of markdown that supports jsx.

We need to update templates / generators as required to support MDX, and therefore be consumable by the system.

Integrations is a solid candidate for our "MVP" of this system, and we can make major steps once these files are supported.


Requirements

  1. Add frontmatter so that the pages can be interpreted and organized within the next system.

ie.

---
id: awsIntegration
slug: /integrations/aws
title: AWS Integration
image: An image?
summary: Summary lalala.
tags: ['integrations']
related: ['fleet', 'observability']
date: 2077-10-13
---

# AWS Integration

{content}
  1. Apply the .mdx extension to READMEs instead of .md.
  2. Update documentation for future contributors

It is understood that these files, or a version of these files, will be consumed by Kibana for use within the integration flow.


Related links

mtojek commented 3 years ago

cc @ycombinator @ruflin @jen-huang

mtojek commented 3 years ago

Few observations regarding this idea:

The next docs system uses MDX, an extension of markdown that supports jsx.

Is there any public/internal guide about the new system? It would be cool to review it first :) I'd like to see all additional features it provides comparing to ordinary .md files.

Apply the .mdx extension to READMEs instead of .md.

We can't simply change from .md to .mdx for two reasons:

  1. Backward-compatibility between package versions, which means that already released packages would have to use .md. We may have deprecate .md , but long-term they will have to stay supported, but potentially not consumed by your system.
  2. These READMEs are rendered by the Kibana catalogue of packages and Github preview, for example: https://github.com/elastic/integrations/tree/master/packages/azure . Is there support for .mdx in Github?

Update documentation for future contributors

We may need some guidance regarding editing of these files, but in general that should be fine.

ruflin commented 3 years ago

We expect the README to be also consumed by the website at one stage in addition to Kibana. And pretty often the markdown files are also just read here in Github. I know basically nothing about MDX so I wonder how it will work in these environments?

goodroot commented 3 years ago

Is there any public/internal guide about the new system? It would be cool to review it first :) I'd like to see all additional features it provides comparing to ordinary .md files.

The new system is a complete overhaul of elastic.co/guide, and less an interpolation system relating to markdown.

Best place for info is to meet -- happy to meet up for Zoom or similar anytime to get deeper on the high-level goals. πŸ˜ƒ

Second best source is within the initial project proposal, which is internal so I will not link -- please reach out for access.


As for .mdx, let's back up a bit. :)

.mdx is an enrichment on-top of markdown that lets us inject jsx (react) components.

It preserves markdown and its function as you would expect in the majority of cases.

It sounds like -- for the time being -- the softer idea is to consume pure markdown (.md files).

No problem!

If we need to take .md and -- for our own purposes -- convert it and enrich it with jsx, let us look at that as a separate case.

For now, an addendum to the proposal:

  1. Add frontmatter so that the pages can be interpreted and organized within the next system.

ie.

---
id: awsIntegration
slug: /integrations/aws
title: AWS Integration
image: An image?
summary: Summary lalala.
tags: ['integrations']
related: ['fleet', 'observability']
date: 2077-10-13
---
  1. Update documentation for future contributors.
ycombinator commented 3 years ago

Sounds like we have two possibly-conflicting goals we need to satisty:

Additionally, there's the usual goal of:


Currently, the package spec requires that every package define a docs/README.md file. This file may be created by hand or generated from an optional template file, _dev/build/docs/README.md, by running elastic-package build.

What if we enhanced the package spec and elastic-package build as follows?

  1. The package-spec would require a new file, docs/README.mdx.
  2. This file would contain all the same contents as docs/README.md but with the additional frontmatter. This file may be created by hand but it would be recommended to use elastic-package build to create it.
  3. Running elastic-package build would take docs/README.md (not _dev/build/docs/README.md because it is not a required file) and automatically add the additional frontmatter. Fields in the frontmatter would be derived from fields in the package-level manifest.yml file or other contextual information (e.g. the date field).
    • Any required fields in the frontmatter that cannot be currently derived from the package-level manifest.yml file would be added as required fields to the package spec.
    • Any optional fields in the frontmatter that cannot be currently derived from the package-level manifest.yml file would be added as required fields to the package spec.

The end result would be that every package would end up with two required files:


An alternative proposal would be for Kibana to be able to render .mdx files but I don't know enough to know if this is possible, what effort it might take, if there are any security concerns with potentially rendering arbitrary JSX, where such an effort might land in terms of team priorities, etc. For these reasons, I favor the above proposal, at least for the immediate term.


Thoughts?

ruflin commented 3 years ago

@goodroot My understanding was that it should be fairly simple for Kibana / Website to support mdx?

goodroot commented 3 years ago

@ruflin Yep -- it would be. It's an easy fit into their environment given current usage of components.

any security concerns with potentially rendering arbitrary JSX

Definitely interested if you see any flags.

AFAIU, given that it compiles React components at build, the end result is a typical Kibana page.

ruflin commented 3 years ago

Did we already start to use MDX in our docs system? I'm currently still working with asciidoc so I wonder which parts we moved over?

For the implementation on the package side, one option I could see that we offer both files but only one of the two can exists. So if the README.mdx exists, we pick it up and process it, otherwise we fall back to the .md file. This makes sure we are forward compatible.

goodroot commented 3 years ago

Did we already start to use MDX in our docs system? I'm currently still working with asciidoc so I wonder which parts we moved over?

@ruflin The migration has not started yet.

This would be the first case (MVP) of the new docs, and so we're migrating MD --> MDX.

Eventually, the existing Elastic documentation will move from asciidoc to MDX. It'll be wild. πŸ™‚

Having multiple files would allow the frontmatter to live in one version and not the other. Maybe that works better for you? πŸ€”

ruflin commented 3 years ago

To be fully honest I would prefer not to be the first one to move over to a new doc format. There have been several attempts in the past to move docs (without success) which means this could leave us in an awkward position later on. And as far as I know, we currently don't have any issues with using .md either ;-) So maybe we can pause this effort for a few weeks / months until there has been some more progress on other fronts?

goodroot commented 3 years ago

@ruflin Let me clarify: in this case, there is no need to change formats.

It is sufficient for our purposes to add frontmatter to the existing markdown files -- no migration required.

dedemorton commented 3 years ago

@ruflin If we pause this effort now, we still have the problem that started this effort: we need to publish the integrations documentation outside of Kibana.

exekias commented 3 years ago

From what I see this is about adding a small header with metadata on top of the current README files, right? Perhaps we can do this using a script outside the integrations repo? I think all the info present in this header is available in the packages manifest, right?

This would allow to start experimenting while keeping the spec & repos as they are.

any security concerns with potentially rendering arbitrary JSX

I share these concerns. This approach wouldn't necessarily affect Kibana (as README.md is kept as it is now), but could maybe allow contributors to inject requests in the docs page (elastic.co/guide?), which has security implications. We need to dig more here.

ruflin commented 3 years ago

I like the idea from @exekias . If some metadata is missing, we could add it to the manifest.yml if needed.

goodroot commented 3 years ago

... maybe allow contributors to inject requests in the docs page (elastic.co/guide?), which has security implications. We need to dig more here.

Fair concern. πŸ™‚

For instances where we "hydrate" using the client, we pass mdxSource only after it's been rendered to a string:

export interface RenderToStringResult {
  mdxSource: {
    compiledSource: string;
    renderedOutput: string;
  };
  frontmatter: { [key: string]: unknown };
}
  const body = hydrate(props.mdxSource, { components: providerComponents });

This follows recommended security guidance.

If we passed this before rendering to a string, we'd be exposed to risk.


Perhaps we can do this using a script outside the integrations repo?

A separate script sounds like it might work.

However, adding frontmatter to the markdown directly would allow me/others to do it, and make quick work of it.

I hesitate to require you folks to maintain a utility to add and maintain this meta-data.

Relatedly, when Kibana renders Markdown, it will be through EUI's recommendation.

We can have the frontmatter escaped and dismissed by Kibana when required.


I think all the info present in this header is available in the packages manifest, right?

If some metadata is missing, we could add it to the manifest.yml if needed.

Yep -- we will need to add additional data over time.

ruflin commented 3 years ago

One thing to keep in mind here, there is a future where not all packages will be owned by Elastic. Will these packages also show up in our docs?

dedemorton commented 3 years ago

Will these packages also show up in our docs?

Based on my experience working on the Logstash plugin docs, I would say...proceed cautiously. There be dragons. Users will want to know which integrations are supported by Elastic vs contributed by the community. And this question gets fuzzy when someone contributes code for a widely used integration, and Elastic ends up investing a lot of effort in making it better.

goodroot commented 3 years ago

One thing to keep in mind here, there is a future where not all packages will be owned by Elastic. Will these packages also show up in our docs?

This is the sort of case for which we'd expand frontmatter.

Maybe as: first-party: true -- I then imagine we would host it in our docs, but delineate it.

I am sensitive to the security aspect, and also that it can be a mess. πŸ˜„

I find confidence in that we can remain flexible.

sorantis commented 3 years ago

Any update on this issue?

goodroot commented 3 years ago

Returning to the proposal from @ycombinator as restart point.

The next steps then:

  1. Agree on frontmatter as per your requirements for docs -- a PM or similar please connect with me to do so
  2. Add frontmatter to all docs we want to consume
  3. Add GitHub Action to repo

We have more figured out now than before and are still flexible for your requirements. πŸ‘

mtojek commented 3 years ago

We do not need the content to be in mdx format and can consume/transform standard md files

πŸ‘

We will need to add frontmatter to all files that we will consume

πŸ‘ , but let's agree on the frontmatter form next. We should try not to store there redundant data if some properties can be generated/extracted from a package.

We can make builds fast by adding a GitHub Action to this repo that uses cp syntax to move just md files

We can help with that, but I suppose that source of docs should be here. For consistency we should use Jenkins instead.

jsoriano commented 3 years ago

(Sorry, I mentioned these points already in some internal discussions so it may look a bit redundant to some people).

I don't think we should add the frontmatter to the spec. From what I see it contains information that is already available in the manifest.yml file. adding it to the README too makes it redundant, and requires us to invest on tooling to keep this synchronized, or expect inconsistencies in the future. We might consider adding more fields to the manifest if needed. If there is going to be some process to collect the READMEs in any case, this process could also collect the info from the manifest, and generate the frontmatter or any other specific format it needs. We shouldn't couple the package format with the final documents format.

Also, I don't think that we should use the package storage as source of truth. Package storage is an implementation detail of the package registry, and may change soon, we don't want to have things built upon it because they will complicate further refactors in the registry. A good source of truth may be the registry APIs, they expose all the published packages, and can be used to collect all the required information, as Kibana does.

mtojek commented 3 years ago

Also, I don't think that we should use the package storage as source of truth. Package storage is an implementation detail of the package registry, and may change soon, we don't want to have things built upon it because they will complicate further refactors in the registry. A good source of truth may be the registry APIs, they expose all the published packages, and can be used to collect all the required information, as Kibana does.

I would agree with you if we don't have packages that are directly contributed to Package Storage, e.g. APM or Endpoint. I'm also afraid that it will make some confusion to our users. With current PS/EPR design it's possible not to release an Integration and keep it in "snapshot" phase. The enduser won't see it, but will find docs describing it.

We can acknowledge it as a risk of course.

jsoriano commented 3 years ago

I would agree with you if we don't have packages that are directly contributed to Package Storage, e.g. APM or Endpoint.

Well, from the perspective of docs collection, these packages are also available from the registry APIs, so they are no different to other packages, if these APIs are used :slightly_smiling_face:

I'm also afraid that it will make some confusion to our users. With current PS/EPR design it's possible not to release an Integration and keep it in "snapshot" phase. The enduser won't see it, but will find docs describing it.

Not sure to follow here. If we publish docs only for packages available in the registry, then users will only find docs for published packages. We can discuss though about collecting docs from the snapshot/staging registries too.