Open nimarezainia opened 11 months ago
Pinging @elastic/fleet (Team:Fleet)
@nimarezainia I do see three options here:
@kpollich can you think of any other options?
+1 to the dynamic nature; that's what we need. Just thinking out loud: can this information be placed somewhere in the artifactory directory structure?
Sounds like a good idea. Some initial thoughts and questions:
... The known issues are all documented in the Fleet release notes also.
Is this an example of the documented release notes? https://www.elastic.co/guide/en/fleet/current/release-notes-8.11.1.html How far back would we want to present known issues? (At which version would we not present known issues?)
@zombieFox yes that is a sample Release note.
How far back we go is a good question; this is the first checkbox in the description. For example, if we are upgrading a bunch of agents, the versions they are on will vary, and so will the advisories that apply. How far back we go should be determined by the lowest version in the selected group of agents. I am not sure if this is technically possible. If not, we should just pick a number, say the last 2 or 3 major releases.
@formgeist Can I assign this to you in order to get the ball rolling?
@jlind23 Yeah, all good - I will have a look at getting someone to do a pass on this in the coming weeks.
Hey @simosilvestri - I spent some time getting some screenshots together + annotating a bit to try and capture the ask here visually. Let me know what you think, and please let me know if I can clarify anything else!
@nimarezainia when you say this:
The agents being upgraded will most probably be at different versions so it would be great to show all the known issues that would apply.
Do you mean we'd need to potentially have known issues for "source versions" as well as "destination versions"? I wonder if that kind of logic is needlessly complex here. How regularly do we see issues when upgrading specifically from one version to another? I think the "known issues" are most directly associated with the newer "target version" and that's what we'd be reporting here. I suppose this runs the risk of reporting errors for specific version combinations that might not be present in the selected set of agents (thus adding potential for user confusion), but my assumption is that some extra verbosity here probably isn't going to hurt. Curious for your thoughts on this.
Just to clarify, what I meant was that we allow the user to choose the set of agents to upgrade to a target, so technically the agents being upgraded could be at different versions. You are correct that it may be needless, as most probably the user has upgraded all the agents previously and they would generally be at the same version level. Also, a known issue is really only relevant to the target being upgraded to.
I'm happy if we just stick to highlighting known issues in the target and avoid extra complexity.
That's great! Thank you so much @kpollich!
Hi @nimarezainia and @kpollich, I put together a quick video to walk you through the Known Issues UX enhancement. Please let me know if everything is clear and if you have any feedback. Thanks!
https://github.com/elastic/kibana/assets/162109197/bb7503cc-2049-41f6-bbaf-1e551326a12c
Thank you @simosilvestri - IMO looks great.
Thanks @nimarezainia! @kpollich this is the Figma file with the new UI criteria. Please let me know if you have any feedback.
Thanks @simosilvestri looks great to me! I think we can consider the design phase for this work done :)
I'll work on getting it scheduled in an upcoming sprint.
Thanks @nimarezainia! @kpollich this is the Figma file with the new UI criteria. Please let me know if you have any feedback.
New UI criteria
Show a medium size Flyout component to Upgrade the agent version
- Set the Select component to a max-width of 800px
Display the “Known issues” version in the accordion below the version selector.
- Title: Text/X-small/Heading4
- Body: Text/Small/paragraph->Regular
@simosilvestri Can you please add this link and some description of the design solution that will be implemented? That makes it easier to understand the final implementation. Thanks! 👍
We'll need to engage with @elastic/webteam to get some new content around known issues added to https://www.elastic.co/api/product_versions, or we'll need to block this on the early agent releases work to add a new API endpoint related to agent, making sure to include known issues data along with each agent version.
I think it probably makes the most sense to block this on https://github.com/elastic/ingest-dev/issues/3284 since we should ideally have more control over the API structure to enhance it with known issue content required by this flyout.
cc @pierrehilbert @cmacknz
The product versions API is something we already consume in Fleet, but it is kept up to date via the stack release process. I don't think you want to require a stack release, or an independent agent release, to update the known issues given we can discover them at any time.
I would rather us come up with a convention that sources this content from https://github.com/elastic/ingest-docs and the release notes https://www.elastic.co/guide/en/fleet/current/release-notes-8.14.0.html where we can update at any time with a PR. Perhaps we need a separate https://www.elastic.co/guide/en/fleet/current/known-issues-8.14.0.html to be created that is embedded into the release notes page as part of the docs build so it isn't duplicated. CC @kilfoyle
The complication with docs is you'll have to figure out how to embed it into Kibana at build time for air gapped users. Or perhaps we just don't do this there, since they have to separately fetch the binaries anyway. Maybe the solution for those use cases is to get this same known issues content displayed directly on the download page (e.g. https://www.elastic.co/downloads/elastic-agent) for each version, without the need to click through to the detailed release notes.
Perhaps we need a separate https://www.elastic.co/guide/en/fleet/current/known-issues-8.14.0.html to be created that is embedded into the release notes page as part of the docs build so it isn't duplicated.
Certainly. I can look into embedding a separate HTML page into the Release Notes output. However, from a docs perspective, embedding a separate asciidoc source file would be a lot easier and likely more reliable. Would it be possible for the Kibana UI to grep the asciidoc source instead?
Here's a test PR that shows how all known issues for, say, 8.x could be stored in a single known-issues-8.x.asciidoc file. The text for each known issue could be grepped by tag (e.g. tag::known-issue-8.14.0-*) or by ID (e.g. known-issue-8.14.0-*).
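As a rough illustration of the "grep by tag" idea, here is a minimal sketch that pulls tagged regions out of an asciidoc source string. It assumes the file marks each known issue with Asciidoctor-style tagged-region comments (`// tag::name[]` and `// end::name[]`); the function name `extractKnownIssues` and the sample tag suffixes are hypothetical and not taken from the test PR.

```typescript
// Hypothetical helper: collects the text of every tagged region whose tag
// name starts with `known-issue-<version>-`. Assumes one tag marker per line
// in the form `// tag::name[]` ... `// end::name[]`.
function extractKnownIssues(asciidoc: string, version: string): Map<string, string> {
  const prefix = `known-issue-${version}-`;
  const found = new Map<string, string>();
  let currentTag: string | null = null;
  let buffer: string[] = [];

  for (const line of asciidoc.split(/\r?\n/)) {
    const start = line.match(/^\/\/\s*tag::([^\[\]]+)\[\]\s*$/);
    const end = line.match(/^\/\/\s*end::([^\[\]]+)\[\]\s*$/);

    if (currentTag === null) {
      // Only open a region when the tag belongs to the requested version.
      if (start && start[1].startsWith(prefix)) {
        currentTag = start[1];
        buffer = [];
      }
    } else if (end && end[1] === currentTag) {
      // Matching end marker: store the collected body and close the region.
      found.set(currentTag, buffer.join("\n").trim());
      currentTag = null;
    } else {
      buffer.push(line);
    }
  }
  return found;
}
```

The result is still raw asciidoc, so rendering it to HTML remains a separate step.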
Would it be possible for the Kibana UI to grep the asciidoc source instead?
Grepping the source is definitely possible, but rendering the asciidoc content to HTML will be more challenging. A JS port of asciidoctor (which I believe we use in other docs repos) exists: https://docs.asciidoctor.org/asciidoctor.js/latest/, and is MIT licensed so it's probably okay to pull this into Kibana with some guidance from the core team.
We'd definitely want to create an API endpoint to wrap all this behavior and avoid parsing asciidoc on the client side, as well.
One other consideration for pulling docs content is that the source is planned (currently for December 2024) to migrate to MDX format, so indeed, pulling the HTML might be the most practical.
So it looks like we will need a new API to provide the list of known issues, populated from the ingest-docs repo. @kpollich do you know which team is responsible for the https://www.elastic.co/api/product_versions API? It looks like that API is backed by content-stack, and I will probably have a few questions for them on how we can add a new API.
Also a product question for you @nimarezainia @simosilvestri: with the upgrade being in a flyout instead of a modal, we have a lot more room. Did we consider showing the whole release note instead of just the known issues? It seems to me it could be interesting for the user too.
@elastic/webteam owns the product versions API + content-stack side of things if I remember correctly.
I am worried requiring a new API through elastic.co is over complicating this. The known issues are all in the release notes. They are publicly accessible. You can HTTP GET them now. https://raw.githubusercontent.com/elastic/ingest-docs/refs/heads/main/docs/en/ingest-management/release-notes/release-notes-8.15.asciidoc. You can also get the rendered HTML from a public URL https://www.elastic.co/guide/en/fleet/8.15/release-notes-8.15.4.html
The challenge is just parsing them. We have total control over the format we use for this. We can put whatever format we want in github or publish something new in https://www.elastic.co/guide/en/fleet/8.15 specifically for Kibana to consume.
The workflow for this should be:
This does require us to design in detail how we want this to work, which hasn't happened yet in this issue.
@cmacknz I do not think scraping the release notes website is really a viable solution. The site and the way the docs are rendered can change, so scraping does not seem future-proof.
Using raw.githubusercontent directly could eventually be a solution instead of a new API. Is there no security issue with rendering something from a host we do not control, instead of a controlled API? Is there any known limitation to using raw.githubusercontent as an API? It seems there is a rate limit of 5000 calls per hour per IP, which is probably not so problematic for us.
I think a better workflow could be something like
With a new API
With Github as a backend
I am not fundamentally opposed to an API, in practice it is better, I am just wondering if we can minimize the number of dependencies we have and things we need to build. If we can't, so be it.
Definitely the rate limit on raw.githubusercontent would be a problem, so that leaves us with finding a way to avoid building something new via some special page in the docs. If that is roughly equivalent to just pushing new content to a new API there's no point though.
Kibana pick up the latest content (if not air gapped) from a controlled API
What should be the frequency here? Shall this be pulled once a day and the content be stored in memory or in a saved object of some sort? Shall this be done every time the known issue page is loaded?
It could be on demand when the known issues page is loaded (probably with graceful error handling so as not to block the whole upgrade flyout), similar to what we do when loading product versions.
It's hard to know if the rate limit on GitHub's raw endpoint is going to be problematic. Even with a 24h cache TTL on requests, if all requests from a given ESS region come from the same API we could easily hit a rate limit issue with that endpoint if we have several thousand clusters making requests to fetch this content. It seems safest to me to rely on an endpoint we control here.
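To make the caching and error-handling ideas above concrete, here is a minimal sketch of an in-memory cache with a 24h TTL that falls back to an empty list when the fetch fails. Everything here is hypothetical: the `KnownIssue` shape, the `KnownIssuesCache` class, and the injected `fetcher` are illustrative names, and the thread has not decided where the data comes from or whether it should live in memory or in a saved object.

```typescript
// Hypothetical shape; the thread does not define a schema for known issues.
interface KnownIssue {
  title: string;
  description: string;
}

const DAY_MS = 24 * 60 * 60 * 1000;

class KnownIssuesCache {
  private readonly entries = new Map<string, { issues: KnownIssue[]; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(ttlMs: number = DAY_MS, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached issues for a version, or undefined when missing or expired.
  peek(version: string): KnownIssue[] | undefined {
    const entry = this.entries.get(version);
    return entry !== undefined && entry.expiresAt > this.now() ? entry.issues : undefined;
  }

  store(version: string, issues: KnownIssue[]): void {
    this.entries.set(version, { issues, expiresAt: this.now() + this.ttlMs });
  }

  // Serves from cache while fresh; otherwise calls the injected fetcher.
  // A failed fetch resolves to an empty list plus an error message, so the
  // caller can still render the rest of the upgrade flyout.
  async get(
    version: string,
    fetcher: (version: string) => Promise<KnownIssue[]>
  ): Promise<{ issues: KnownIssue[]; error?: string }> {
    const fresh = this.peek(version);
    if (fresh !== undefined) return { issues: fresh };
    try {
      const issues = await fetcher(version);
      this.store(version, issues);
      return { issues };
    } catch (e) {
      return { issues: [], error: e instanceof Error ? e.message : String(e) };
    }
  }
}
```

A per-instance cache like this only reduces requests from a single Kibana; it does not address the concern about many clusters sharing one egress IP.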
One thing we could do to make scraping from docs.elastic.co more viable would be to place the known issues in some kind of predictable/structured piece of data on the docs page that can be scraped separately from the page content itself. That way we aren't depending on an HTML structure that could potentially change. e.g. putting the content into an HTML comment or a data attribute that we can parse out.
+1, this is what I was originally trying to describe, but not wording as well, as a lighter way to get what we want. If this can't work that is fine, but let's prove the simpler path doesn't work before building something with more dependencies across the systems involved.
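For the structured-data idea, a minimal sketch might look like the following. It assumes the docs build emits an HTML comment of the form `<!-- known-issues: [ ...JSON array... ] -->` somewhere on the page; that marker, the `KnownIssue` shape, and the function names are all hypothetical, since no format has been agreed on in this thread.

```typescript
// Hypothetical shape; the thread does not define a schema for known issues.
interface KnownIssue {
  title: string;
  description: string;
}

// Runtime check so malformed entries are dropped instead of trusted.
function isKnownIssue(value: unknown): value is KnownIssue {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v["title"] === "string" && typeof v["description"] === "string";
}

// Looks for the assumed `<!-- known-issues: ... -->` comment and parses its
// JSON payload. Returns an empty list when the marker is missing or invalid,
// so callers never depend on the surrounding HTML structure.
function parseEmbeddedKnownIssues(html: string): KnownIssue[] {
  const match = html.match(/<!--\s*known-issues:([\s\S]*?)-->/);
  if (!match) return [];
  try {
    const parsed: unknown = JSON.parse(match[1]);
    return Array.isArray(parsed) ? parsed.filter(isKnownIssue) : [];
  } catch {
    return [];
  }
}
```

Because the parser only depends on the comment marker, changes to the page layout would not break it, which is the point of the suggestion.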
Maybe it would even be possible to include a .json file that's "published" to the docs site (not linked to, ofc) at a predictable URL location, e.g. https://www.elastic.co/guide/en/fleet/current/release-notes-8.16.0/_known_issues.json. That way we don't even really have to do scraping. Could we hack an "API endpoint" (just a static JSON file) into the docs system this way? Maybe @kilfoyle would know. In theory this should work exactly like hosting a .png or any other static asset.
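If a static file like this were published, the consuming side could be as small as the sketch below. The URL pattern is copied from the example above and is only a proposal; `knownIssuesUrl`, `fetchKnownIssuesJson`, and the injected `httpGet` function are hypothetical names, and the JSON schema is left open because none has been defined.

```typescript
// Base path taken from the example URL in the comment above; not a real endpoint yet.
const DOCS_BASE = "https://www.elastic.co/guide/en/fleet/current";

// Builds the predictable URL for a given agent version. The version is
// validated first so arbitrary input cannot alter the request path.
function knownIssuesUrl(version: string): string {
  if (!/^\d+\.\d+\.\d+$/.test(version)) {
    throw new Error(`Unexpected version format: ${version}`);
  }
  return `${DOCS_BASE}/release-notes-${version}/_known_issues.json`;
}

// Fetches and parses the static file. The HTTP client is injected so this
// sketch does not assume a particular runtime or request library.
async function fetchKnownIssuesJson(
  version: string,
  httpGet: (url: string) => Promise<string>
): Promise<unknown> {
  const body = await httpGet(knownIssuesUrl(version));
  return JSON.parse(body);
}
```

The returned value is typed as `unknown` on purpose: until a schema exists, callers would need to validate it before rendering.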
+1 if we can publish a static JSON, it seems it will be a more viable solution than scraping a website
Will the tech writer be in charge of maintaining this JSON? If yes, they probably need to be informed about this.
The known issues have a structured and ideally parseable format, so ideally we would generate it as part of the docs build.
I write a good amount of the known issues myself, I would not want to have to write them twice, or even know that this implementation existed. Regardless of if the implementation is a static JSON file on the docs site or some new API, the person writing a known issue can't be required to do anything special, or it will just get forgotten.
@KOTungseth, @benironside Just FYI, since I know you're working on a new docs-wide process for Release Notes that I believe includes how known issues are handled.
@kilfoyle @KOTungseth, @benironside do you have more information on the new docs-wide process for release notes? Will it be possible to have the known issues source living in a predictable place for some automation to consume?
@nchaulet, we'll look into it and will reply back here soon.
We have not landed on a final decision for known issues, but here are my initial thoughts for 9.0.0 and later:
Bug fixes or simply Fixes section of the release notes
@nchaulet can you validate these assumptions for Fleet known issue automation:
Upgrade agent flyout
I should also note that the new docs system will use Markdown format, while our current system uses Asciidoc format. If we plan to automate Fleet known issues for the last 2-3 major versions, we'll need automation to account for both formats.
Describe the feature: We recently had some near-catastrophic issues in versions of agent that ideally we could have used the platform to warn against, allowing the user to dig deeper into the options available to them. In particular, it's good practice to let our users know the caveats they may need to consider when upgrading their agents.
Describe a specific use case for the feature: The use case, in simplest form, is when the user has upgraded their stack to version X and they are embarking on upgrading the agents to version X also. We would like to have a fly-out (or some other suitable design) that would show this user all the known issues we are aware of.
@kpollich @zombieFox
Design Solution
Description: When a customer upgrades one or multiple agents to the latest or newer version, display a Flyout component that includes the version dropdown and the related Known Issues content. The Known Issues content is hidden in an accordion component, closed by default, and it updates the content any time the customer changes the version in the dropdown.
Figma file with the new UI criteria. Reach out to @simosilvestri for implementation support.
New UI criteria
Upgrade multiple agents
Upgrade a single agent