artifacthub / hub

Find, install and publish Cloud Native packages
https://artifacthub.io
Apache License 2.0

[feature] Improve filtering to include CNCF project and whether it's open source #1791

Status: Closed (caniszczyk closed this 1 year ago)

caniszczyk commented 2 years ago

I'd like to see a couple new filtering options:

1) if it's a CNCF project or not
2) if it's open source

tegioz commented 1 year ago

Hi @caniszczyk

We've discussed this a few times, and we have some concerns about how it would work and whether it could mislead users.

Open source filter

AH already allows publishers to provide the license under which their content is published. There is a filter on the packages search page (on the left side column) to filter by license and, when available, it's displayed on the package's view.

However, one problem with the license filter as it is, which an open source filter could make a bit worse, is that the artifact often installs other software that may have different licensing terms. Let's take the MongoDB chart below as an example:

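[image: mongodb] https://user-images.githubusercontent.com/1213902/214031777-2e0634ea-e68d-4004-8afd-068bbaa579f4.png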

This chart is provided by Bitnami and released under the Apache 2 license. But the chart in this case is a means to easily install on K8S the underlying product, MongoDB, which has a completely different license. IMHO listing this package as Apache-2 or open source could be misleading to some users. Bitnami does a good job explaining this on the package's README file (see screenshot below), but it could be easily overlooked and other publishers may not include a similar disclaimer.

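[image: disclaimer] https://user-images.githubusercontent.com/1213902/214031813-c982590a-ab78-4d53-bb3b-b92f5d646eb0.png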

This is something very common in Helm charts, which can install software from multiple vendors with different licenses. So categorizing them as open source can be tricky.

In addition to this, at the moment less than 10% of the content listed on artifacthub.io provides a license. Most of the time the artifacts are actually using an open source license, but it's not included in the artifact itself, so it's hard to identify it properly. This happens often in Helm charts, where publishers do not include the LICENSE file in the chart. The MongoDB chart mentioned above is actually an example of this.
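To make the concern above concrete, here is a minimal sketch of why a simple "open source" flag is hard to compute. Everything in it is hypothetical: the `Package` class, the `bundled_licenses` field (AH does not collect this data, as far as this thread indicates), and the small allow-list standing in for a full list of approved licenses. MongoDB's SSPL is used as the example of a bundled license outside the allow-list.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical allow-list; a real check would need a complete, maintained list.
OPEN_SOURCE_LICENSES = {"Apache-2.0", "MIT", "BSD-3-Clause", "MPL-2.0"}


@dataclass
class Package:
    name: str
    # License declared for the artifact itself (e.g. the Helm chart), if any.
    artifact_license: Optional[str] = None
    # Hypothetical field: licenses of the software the artifact installs.
    bundled_licenses: List[str] = field(default_factory=list)


def classify(pkg: Package) -> str:
    """Return 'unknown', 'open-source' or 'mixed' for a package."""
    if pkg.artifact_license is None:
        # Most listed content falls here: no license is provided at all.
        return "unknown"
    licenses = [pkg.artifact_license, *pkg.bundled_licenses]
    if all(lic in OPEN_SOURCE_LICENSES for lic in licenses):
        return "open-source"
    # The chart may be Apache-2.0 while the product it installs is not.
    return "mixed"


chart = Package("mongodb", "Apache-2.0", ["SSPL-1.0"])
print(classify(chart))
```

A filter based only on `artifact_license` would list this chart as open source, which is the misleading outcome described above.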

CNCF project filter

If we were handling this at the package level, what would it mean for a package to match this filter? Should it be "related" to a CNCF project? Published directly by the project itself? In AH, all content with the exception of the official status is provided directly by the publishers, with no manual curation at all. Orgs/users can list their repositories from the control panel and they'll be automatically processed and listed. So at the package level, this new status would need to be provided by the publisher as well, with the risk of it not being included when it should be (like the license case above), or being included when it shouldn't be if the criteria weren't clear.

Another option would be to handle this at the repository kind level. At the moment most of the supported kinds (except Tekton and Container images IIRC) are artifacts from CNCF projects, so filtering by this criterion could be of limited use right now but maybe handy in the future.

caniszczyk commented 1 year ago

re: open source licensing filter, let's hold for now as I see what you are saying

I think what this means is that the package is published DIRECTLY by the project itself.

So something like this would be unofficial (not a CNCF project): https://artifacthub.io/packages/helm/portefaix-hub/crossplane-aws-factory, and this would be a CNCF project thing: https://artifacthub.io/packages/helm/crossplane/crossplane

I think we just need a badge that is like another type of verified publisher, this would be a "CNCF Project" badge

-- Cheers,

Chris Aniszczyk https://aniszczyk.org

tegioz commented 1 year ago

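A CNCF badge would be cool, but we'd need to find a sustainable way to assign and maintain them over time (there are over 2.5k repos listed and growing!). The verified publisher status checks are automated (https://artifacthub.io/docs/topics/repositories/#verified-publisher), whereas the official status (https://artifacthub.io/docs/topics/repositories/#official-status) is handled manually when publishers request it via a templated issue (https://github.com/artifacthub/hub/blob/master/.github/ISSUE_TEMPLATE/official-status-request.md), as it requires some extra verification that cannot be easily automated.

One option would be to require publishers to include "something" in the repository AH metadata file (https://github.com/artifacthub/hub/blob/3037ddcc6f0e2403dc93ee6ab43c8b26a19b51e5/docs/metadata/artifacthub-repo.yml), where "something" could be just the claim of being a CNCF project (not good enough probably) or a piece of data we could actually verify against an external source (something in landscape.yaml maybe?). The latter would be similar to how the verified publisher status works.

If you have a chance, please take a look at this Slack thread (https://cloud-native.slack.com/archives/C0103CTR3RB/p1663254938786419?thread_ts=1663252405.495489&cid=C0103CTR3RB) and this issue (https://github.com/artifacthub/hub/issues/2328#issuecomment-1249021015). The project involved is Tekton (not a CNCF project IIRC), but the idea is somewhat related, except it'd be a status for the project that created the artifact kind.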
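The "verify against an external source" idea could look roughly like the sketch below. It is only an illustration, not how AH implements anything: it assumes the landscape file has already been parsed into a dict shaped like `landscape -> subcategories -> items`, that CNCF projects carry a non-empty `project` field, and that matching is done on `repo_url`. The sample data and the function names are made up for the example.

```python
from typing import Any, Dict, Iterator

# Illustrative data only; not real landscape content.
SAMPLE_LANDSCAPE: Dict[str, Any] = {
    "landscape": [
        {
            "name": "Example category",
            "subcategories": [
                {
                    "name": "Example subcategory",
                    "items": [
                        {
                            "name": "Example Project",
                            "repo_url": "https://github.com/example-org/example-project",
                            "project": "sandbox",
                        },
                        {
                            "name": "Vendor Tool",
                            "repo_url": "https://github.com/example-vendor/tool",
                        },
                    ],
                }
            ],
        }
    ]
}


def iter_items(landscape: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every item entry, walking categories and subcategories."""
    for category in landscape.get("landscape", []):
        for sub in category.get("subcategories", []):
            yield from sub.get("items", [])


def normalize(url: str) -> str:
    """Lowercase and strip a trailing slash and '.git' so URLs compare equal."""
    url = url.strip().lower().rstrip("/")
    return url[:-4] if url.endswith(".git") else url


def is_cncf_project(claimed_repo_url: str, landscape: Dict[str, Any]) -> bool:
    """True if the claimed URL matches an item that has a `project` status."""
    target = normalize(claimed_repo_url)
    for item in iter_items(landscape):
        if item.get("project") and normalize(item.get("repo_url") or "") == target:
            return True
    return False


print(is_cncf_project("https://github.com/Example-Org/example-project.git", SAMPLE_LANDSCAPE))
```

With a check like this, the publisher's claim in the metadata file is only a hint; the status is granted when the external data agrees, which is what would make it comparable to the automated verified publisher check.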

caniszczyk commented 1 year ago

I think we have to do this as part of the "official" process and have a special badge for it.

The only other way I could think of automating this is if we somehow could scrape the cncf/landscape.yml file for official projects.

tegioz commented 1 year ago

Doing it as part of the "official" process sounds good to me. It may take some time to get everyone on board, but it'll be sustainable going forward. We are also in the process of updating how the badges are displayed, to encourage publishers to claim the missing ones (like official where applicable) and make them easier to understand.

tegioz commented 1 year ago

This is ready @caniszczyk!

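[image: cncf] https://user-images.githubusercontent.com/1213902/221549919-75d30b5a-3164-40df-9567-6c8faab79542.png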

There may be cases where this new cncf flag can be set but the official status does not apply, though. But I think this is a good start 🙂
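As a rough illustration of the point that the two statuses are independent, here is a small sketch that treats `cncf` and `official` as separate boolean flags on a repository and filters on each one only when asked. The `Repo` class and `filter_repos` function are hypothetical and are not taken from the AH codebase or API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Repo:
    name: str
    verified_publisher: bool = False
    official: bool = False
    cncf: bool = False


def filter_repos(
    repos: Iterable[Repo],
    *,
    cncf: Optional[bool] = None,
    official: Optional[bool] = None,
) -> List[Repo]:
    """Keep repos matching every flag that was given; None means 'don't care'."""
    selected = []
    for repo in repos:
        if cncf is not None and repo.cncf != cncf:
            continue
        if official is not None and repo.official != official:
            continue
        selected.append(repo)
    return selected


repos = [
    Repo("project-main", verified_publisher=True, official=True, cncf=True),
    Repo("project-extras", cncf=True),  # cncf set, official does not apply
    Repo("third-party"),
]
print([r.name for r in filter_repos(repos, cncf=True, official=False)])
```

Keeping the flags separate means a repository can carry the CNCF status without having gone through the official status request.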

Related PRs: #2808, #2811, #2812

caniszczyk commented 1 year ago

Thank you
