Open tmarkley opened 1 year ago
First thing first, how is it going @tmarkley :)
My $0.02 :
I won't call a resolution
lazy but more a quick (or maybe the quickest) way to mitigate CVEs.
(ref : https://classic.yarnpkg.com/lang/en/docs/selective-version-resolutions/)
And yes, if time and capacity allow (as always), a deeper dive to the dependency would be more ideal in order to resolve the issue from the root, where rewriting or refactoring may be required, painful, but the right thing to do.
Actually to take a step further, it would make more senses to not just relying on a scanning tool telling us to bump which package to which version, but dig into the code level as why opensearch is needing that package and to which (line of code) is that package dragged into the picture. Most of the time we may just find OSD is not even consuming the vulnerable methods in the dependency package and we are just bumping the version to mute the advisories..
Speaking of the balance of tech debts and new features, both are important, true. Do we have enough resources to tackle down them all ? I doubt. Personal opinion I would take paying tech debts as another sort of adding new features, taking https://github.com/opensearch-project/OpenSearch-Dashboards/issues/1118 as an example. Webpack upgrade is necessary and definitely an exciting area to explore. Yet do we have enough resources to get it done ? Or how soon would it be accomplished ? Do we just put one dedicated resource on the hook or break it down into pieces and then eveyone gets a share ? Will leave this to the maintainer groups ~~ (and you would know this better than me since this one was opened by you like a year ago lol)
Last to the Suggestions, as an occasional (but super enthusiastic) contributor, I'd love to see all them happen into reality, yet still, depends on how many resources would be dedicated on / how long it would take, in a Thanos snap or you have to wait for all Avengers to assemble first lol
In the end, IMHO, the maintainers group has been dedicated making the whole project better and better, I feel it. And I do see great values in this very proposal. I believe and trust the maintainers group will have a robust and reasonable triage plan put together and looking forward to contribute and be part of it :)
Thanks @tmarkley for your thoughtful comments and for calling out the must have technical debt issues. The CVE impacting ones are most concerning and I think we ought to find a way to action immediately - I'd like us to get to a place where Renovate/Dependabot can get us to the same state of automatic keeping-up-to-date of dependencies (as OpenSearch core). Is there one issue that captures that? I think I don't understand the mechanics here of how we can achieve that, help me out?
@ZilongX things are going well, thanks! :) It's good to hear from you!
I won't call a resolution lazy but more a quick (or maybe the quickest) way to mitigate CVEs.
Definitely agree that "lazy" was the wrong word choice here. I guess I feel passionately about this because I put so much effort into the approach to resolving CVEs and handling security concerns in the first two major releases of Dashboards.
Oftentimes you can use yarn upgrade
or dive deeper and remove the existing entry(ies) from the yarn.lock
file to fetch the most recent version(s) of a library. I'm going to create a separate issue and work on writing up a more comprehensive guide to handling these different scenarios. Yarn has some good documentation but Dashboards is using version 1 which is many years old, and the newer versions (v3 is actively maintained and v4 is being developed) offer features that could help with dependency resolution.
... dig into the code level as why opensearch is needing that package and to which (line of code) is that package dragged into the picture. Most of the time we may just find OSD is not even consuming the vulnerable methods in the dependency package and we are just bumping the version to mute the advisories..
Absolutely! This is my second recommendation under Immediate Surface-Level Changes
.
Speaking of the balance of tech debts and new features, both are important, true. Do we have enough resources to tackle down them all ? I doubt.
I disagree that this is a matter of resources, which is why the main point of my concerns is the current way that work is prioritized. As far as I understand, Security is one of the top tenets of the OpenSearch Dashboards project. That alone should help guide the maintainers towards what is most important.
In terms of how that work should be done, I think that depends on the impact, severity, scope, and the current technical understanding of the maintainers. I think rallying around the larger and more important (in terms of impact on future development and security risks) initiatives such as the Webpack upgrade, removing AngularJS, and the TypeScript ugprade will not only help the maintainers understand the importance of this work but will also help get the solutions across the finish line.
I'm certainly not saying the resolving technical debt is the only work that the maintainers must be working on, but I don't see enough of a focus on it, or a clear plan to resolve it, currently.
... I'd love to see all them happen into reality, yet still, depends on how many resources would be dedicated on / how long it would take ...
That's also related to these concerns. A lot of this work hasn't been broken down sufficiently which leads to the feeling of each of these issues being insurmountable or too much to handle at once. I know that feeling and it's not fun, but this is another reason why I think the maintainers need to rally around proper prioritization and triage for current technical debt.
In the end, IMHO, the maintainers group has been dedicated making the whole project better and better, I feel it.
I agree! :) A lot of great things have been accomplished since the initial fork. I want this project to succeed which is why I started this dialogue. Even with the limited number of contributors I think it's certainly possible to create a roadmap that addresses these current deficiencies while still working towards the larger goals of the project.
@tmarkley nice to see that you are still involved in the project! I think I agree with all the flaws and suggested solutions you have pointed out, especially that surrounding docs. We can't fix what we don't know. I also do know that we are actually prioritising it now since it directly contributes to the flywheel of users and contributors.
The security thing caught me off guard though and will need to take a look to see how else we could have handled it. I know that upgrading and removing dependencies is necessary for the long term health of the project, but we are also a little crippled because we have plugins who take our first party dependencies as their own, which makes changing them a breaking change. My thoughts are that the extensions project should make this easier to do in the future, but I'd love to hear your thoughts on it.
The "guidelines" introduced in https://github.com/opensearch-project/OpenSearch-Dashboards/pull/2674 are insufficient and are contributing to these poor practices.
Update the SECURITY.md document to lay out a more thorough approach to address CVEs and upgrade dependencies. As painful/boring/uninteresting as it might be, it's best to address any required refactoring and/or upstream upgrades immediately to slow the bleeding of outdated dependencies across the project.
Since you've had a lot of experience fixing CVEs in this project, a PR to improve the steps in or to correct mistakes in SECURITY.md
would be super helpful.
Create a list of current direct dependencies in the package.json and understand how/why each of them is used. What can be simplified? What isn't necessary? What can be upgraded to avoid an influx of CVEs in the near future?
I like the idea in theory, but from an impact perspective, I'm not sure I agree that this is the most pressing knowledge gap in the project. Understanding the fundamental functionality of dashboards, the architecture and abstractions, and how those interact, is still barely understood by the maintainer team (see https://github.com/opensearch-project/OpenSearch-Dashboards/issues/2132). Without that foundational knowledge, it's hard to make informed recommendations for rewriting or refactoring non-trivial dependencies. So rather than aim for comprehensiveness (trying to map out every direct dependency first), I'd be more in favor of looking for particular dependencies that have high impact or risk, and that can be tackled in an isolated manner. For example, removing the dependency on lodash
will be tedious, but the solution and approach is well understood.
Start working on documentation improvements projects now. I think all of the current developer documents, READMEs, and runbooks need some TLC.
I don't disagree with the sentiment, but the fact that you don't perceive the massive amount of time and effort that is already going into documentation projects indicates that there's no silver bullet, and that even steady, incremental process feels unsatisfying or insufficient. A few examples of our efforts in this direction (far from comprehensive):
Research and documentation tasks are also one of the foundational workstreams of the visualization framework unification project: https://github.com/opensearch-project/OpenSearch-Dashboards/issues/2584. But sometimes the best way to research and discover how existing features work (a prerequisite to doc writing) is to try to use them in service of a new feature (such as Vis Builder, or Multi-Datasource projects).
Concerns
CVEs/Dependencies
I've noticed that recent CVEs are addressed by haphazardly adding a manual dependency resolution to the
package.json
.Examples:
The "guidelines" introduced in https://github.com/opensearch-project/OpenSearch-Dashboards/pull/2674 are insufficient and are contributing to these poor practices. In some cases (i.e. the resolution for
d3-color
), there is no path forward without significant refactoring or rewriting. However, I don't see issues being opened to address that work. That work is currently being punted until it causes more problems.This is unsustainable and jeopardizes the future velocity of this project. There are already too many dependencies scattered across Dashboards, sub-packages, and plugins, and temporary band-aids make the problems worse. See https://github.com/opensearch-project/OpenSearch-Dashboards/issues/1298 for more details. It's often the case that if you dive into the upstream dependencies of a library that needs to be upgraded you'll find a path to success.
Technical Debt and Prioritization
Diving into these dependency problems (as well as a fresh perspective after taking a few months off) led me to take a step back and look at the broader approach to development and changes in the OpenSearch Dashboards project.
From a very high level I think new features and improvements (e.g. Multiple Data Sources, Visualization Builder, Extensions), especially those targeted at AWS use cases, are being prioritized over important technical debt projects such as #2875, #1660, #1560, https://github.com/opensearch-project/OpenSearch-Dashboards/issues/1060, https://github.com/opensearch-project/OpenSearch-Dashboards/issues/1625, https://github.com/opensearch-project/OpenSearch-Dashboards/issues/2032, https://github.com/opensearch-project/OpenSearch-Dashboards/issues/1187, and many more while the group of active maintainers and community members is not large (about a dozen by my count?).
Important upcoming initiatives such as #2929 are going to require a lot of this technical debt to be addressed. As we've learned in the past, these projects always require more effort than initially estimated.
Summary
Overall I would say that the state of the code base (structure/design, quality, test quality, test coverage, etc.), lack of sufficient documentation, and potential security concerns caused by the current dependency tree should lead the majority of contributors on OpenSearch Dashboards to focus on improving the health of the project. The community might not initially understand the importance of this work, but I'm sure you fine folks at AWS can write up a tidy document to explain that. :)
I haven't been involved in the community meetings and conversations for a while, but I'd like to know if the other maintainers agree with these sentiments? Do you think the current approach to development on this project is sustainable? Is anything being done to address these deficiencies that I'm not considering?
Suggestions
These topics can certainly be split up into multiple issues as needed, but these are my current ideas:
Immediate Surface-Level Changes
package.json
and understand how/why each of them is used. What can be simplified? What isn't necessary? What can be upgraded to avoid an influx of CVEs in the near future? After these questions are addressed, focus on automating the upgrade process (i.e. the first part of #2189).Longer-term changes