anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

Conda ecosystem support #932

Open rigzba21 opened 2 years ago

rigzba21 commented 2 years ago

Hi all!

What would you like to be added: Conda ecosystem (language agnostic) support

Why is this needed: The conda ecosystem provides an amazing way to work with dependencies in terms of environments and is used heavily in the Data Science and Scientific Computing communities.

Additional context: I'm happy to submit a PR for this feature/capability and looking forward to diving into the syft src!

rigzba21 commented 2 years ago

I'll be opening a Draft PR when I have more done in my syft fork for conda support

spiffcs commented 2 years ago

Thanks @rigzba21 for the attention here!

Admittedly I don't have a lot of experience with the Conda ecosystem, but I'm excited to see the PR and learn more about the different constructs a cataloger might have to interact with to provide support.

Feel free to tag the tools team on the PR and we can help with getting the builds kicked off for CI. I've also tentatively added this as a topic to our next community meeting on April 14th so we have a forum to talk more about it as the feature is developed.

Link to the community meeting where we discuss new ideas/features for syft/grype: https://anchorecommunity.slack.com/archives/C4PJFNEEM/p1648734651797149

westonsteimel commented 2 years ago

Yep, thanks @rigzba21, I had thought a bit about trying this before but never got around to it. One thing that will be interesting here is that syft already identifies packages installed by conda as pip packages, because they end up with the same METADATA entries per https://github.com/anchore/syft/blob/main/syft/pkg/cataloger/python/parse_wheel_egg_metadata.go. There is extra JSON for conda-installed packages that isn't taken into account, though. Anyway, this one will definitely be interesting and a great addition, I think!
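To make the overlap concrete, here is a minimal Python sketch (not syft code) that checks which Python metadata entries a conda record claims to own. It assumes the record is one of the JSON files under an environment's conda-meta directory and that it carries a "files" list of installed paths. The sample record, its build string, and the helper name `dist_info_entries` are hypothetical and only for illustration.

```python
import json
from typing import List


def dist_info_entries(record: dict) -> List[str]:
    """Return the *.dist-info / *.egg-info entries listed in a conda record's files."""
    entries = set()
    for path in record.get("files", []):
        parts = path.split("/")
        for i, part in enumerate(parts):
            if part.endswith((".dist-info", ".egg-info")):
                # Keep the path up to and including the metadata entry itself.
                entries.add("/".join(parts[: i + 1]))
                break
    return sorted(entries)


# Hypothetical, trimmed-down conda-meta record (real records have more fields).
sample = json.loads("""
{
  "name": "requests",
  "version": "2.28.1",
  "build": "pyhd8ed1ab_0",
  "channel": "https://conda.anaconda.org/conda-forge/noarch",
  "files": [
    "lib/python3.10/site-packages/requests/__init__.py",
    "lib/python3.10/site-packages/requests-2.28.1.dist-info/METADATA",
    "lib/python3.10/site-packages/requests-2.28.1.dist-info/RECORD"
  ]
}
""")

print(dist_info_entries(sample))
```

A cataloger could use a mapping like this to decide that a package found through METADATA was installed by conda rather than pip. Whether syft should relabel or merge such packages is an open design question in this thread.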

rigzba21 commented 2 years ago

@spiffcs, I've joined the Slack channel and added the community meeting to my calendar. @westonsteimel, thank you for pointing me to a good starting point!

rigzba21 commented 2 years ago

Closing my old draft PR which attempted to add a separate conda cataloger, in favor of the smaller-scoped work of extending the python cataloger as discussed in the last community meeting notes

Conda Support

ssthom commented 1 year ago

Is Conda support still being looked at? In our images, Syft finds only the openssl that the base image installed, not the one in the custom Conda environment:

spark@1838268b8f44:/usr/src/app$ /customenv/bin/openssl version
OpenSSL 3.0.5 5 Jul 2022 (Library: OpenSSL 3.0.5 5 Jul 2022)
spark@1838268b8f44:/usr/src/app$ apt list openssl
Listing... Done
openssl/stable,stable-security,now 1.1.1n-0+deb11u3 amd64 [installed,automatic]
spark@1838268b8f44:/usr/src/app$

Syft only finds:

openssl                                                         1.1.1n-0+deb11u3                           deb

rigzba21 commented 1 year ago

> Is Conda support still being looked at?

While I'd love to keep working on this (admittedly, I have not looked at this in a while), I don't currently have the time to focus on this.

rigzba21 commented 1 year ago

I'll have time to pick this back up again this quarter 😄.

razzlestorm commented 1 year ago

I'd be interested in helping get this across the finish line, if another body would help. I should have some time coming up in early June that I can work on this, if this is still happening? Is there a list of specific things that still need to be fleshed out and finished?

rigzba21 commented 1 year ago

Hi @razzlestorm! When I started this a while ago, I was looking at potentially adding a new cataloger. I also see that this issue conveniently has a new label, "new-cataloger." 😄

I'd love to collaborate on this now that I have more free time.

kzantow commented 1 year ago

FWIW -- it looks like there were some open questions from the April 14 community meeting: https://docs.google.com/document/d/1ZtSAa6fj2a6KRWviTn3WoJm09edvrNUp4Iz_dOjjyY8/edit#heading=h.3ybu7ev6q4bx

spiffcs commented 1 year ago

@razzlestorm @rigzba21 Thanks for keeping this thread alive!

If you have cycles this summer to look at this cataloger, then we would absolutely love the help.

The document Keith linked should have our OSS meeting on it - we have one this Thursday if you want to come hang out and talk about the scope of what would be a good first pass. Otherwise happy to talk async on this thread to get the ball rolling =)

razzlestorm commented 1 year ago

Definitely! I'll come hang out.

EDIT: Unfortunately, I won't be able to make this, sorry! Apparently a last-minute meeting has been scheduled that I need to attend and the only time slot is that time. I've marked these slots as already being booked but hey, what can you do? I'm happy to coordinate async after your meeting, and make the next one (it's already booked in my calendar and hopefully nobody will unbook it).

In general, I've had a brief chance to look things over, and it does seem to make sense to extend the python cataloger rather than creating a new one just for .conda packages. I'll take a more in-depth look at everything over the long weekend here.

rigzba21 commented 11 months ago

I am linking Contributing New Catalogers - A Guide for Hacktoberfest and Beyond. Thank you, @spiffcs, for the writeup! I have some time this Nov/Dec to pick back up and look at this again.

jrandall commented 1 month ago

> Closing my old draft PR which attempted to add a separate conda cataloger, in favor of the smaller-scoped work of extending the python cataloger as discussed in the last community meeting notes
>
> Conda Support
>
>   • [ ] extend the python cataloger to include conda-meta/*.json files

I'm new to this project, but I don't see why conda would be part of the python cataloger. Conda can be used for python packages, but it is really a user-space alternative to OS-level packaging and it is primarily useful because it can manage shared libraries and binaries alongside things like Python or R packages that depend on them.
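As a rough illustration of why the conda-meta records are language agnostic, the Python sketch below (not syft code) lists every package recorded under an environment prefix, whether it is a Python package or a shared library such as openssl. It assumes each conda-meta/*.json file has "name" and "version" fields and usually "build" and "channel". The function name `list_conda_packages` is hypothetical, and `/customenv` is the prefix from the earlier comment in this thread.

```python
import json
from pathlib import Path


def list_conda_packages(prefix):
    """Return one dict per package recorded under <prefix>/conda-meta."""
    meta_dir = Path(prefix) / "conda-meta"
    if not meta_dir.is_dir():
        return []
    packages = []
    for record_path in sorted(meta_dir.glob("*.json")):
        try:
            record = json.loads(record_path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed records
        if not isinstance(record, dict):
            continue
        if "name" not in record or "version" not in record:
            continue
        packages.append(
            {
                "name": record["name"],
                "version": record["version"],
                "build": record.get("build", ""),
                "channel": record.get("channel", ""),
            }
        )
    return packages


if __name__ == "__main__":
    # Prints nothing if the prefix does not exist on this machine.
    for pkg in list_conda_packages("/customenv"):
        print(pkg["name"], pkg["version"], pkg["build"], pkg["channel"])
```

Nothing here depends on site-packages or METADATA files, which is the core of the argument for treating conda as its own package type rather than as a variant of the python cataloger.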

witchcraze commented 1 week ago

I recently got involved in the Anaconda licensing topic, so I am looking into this. One big factor is whether systems install or update packages from the official public repository.

So, if we can use Syft to get the list of packages managed by Conda and where they come from (like with conda list --show-channel-urls or conda list --explicit), it will be much easier to find who is causing the problem.
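Until something like this exists in Syft, the channel information can be approximated from the output of conda list --explicit. The Python sketch below assumes that output is a list of package URLs after an @EXPLICIT marker, each shaped like channel/subdir/name-version-build.tar.bz2 (or .conda), optionally followed by a #hash suffix. The function name `packages_by_channel` and the sample URLs and build strings are illustrative only.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def packages_by_channel(explicit_text):
    """Group (name, version, build) tuples by channel URL from `conda list --explicit` output."""
    by_channel = defaultdict(list)
    for line in explicit_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("@"):
            continue  # comments, blank lines, and the @EXPLICIT marker
        url = line.split("#", 1)[0]  # drop an optional trailing #hash
        parts = urlsplit(url)
        segments = parts.path.strip("/").split("/")
        if len(segments) < 3:
            continue  # expect at least channel/subdir/filename
        filename = segments[-1]
        channel = f"{parts.scheme}://{parts.netloc}/" + "/".join(segments[:-2])
        for ext in (".tar.bz2", ".conda"):
            if filename.endswith(ext):
                filename = filename[: -len(ext)]
                break
        pieces = filename.rsplit("-", 2)  # name may contain dashes; version and build do not
        if len(pieces) != 3:
            continue
        by_channel[channel].append(tuple(pieces))
    return dict(by_channel)


# Illustrative input; the build strings are made up for the example.
sample_explicit = """\
# platform: linux-64
@EXPLICIT
https://conda.anaconda.org/conda-forge/linux-64/openssl-3.0.5-h166bdaf_2.tar.bz2
https://repo.anaconda.com/pkgs/main/linux-64/ca-certificates-2022.07.19-h06a4308_0.conda#0123abcd
"""

for channel, pkgs in packages_by_channel(sample_explicit).items():
    print(channel, pkgs)
```

Grouping by channel makes it easy to spot packages pulled from repo.anaconda.com versus a community channel such as conda-forge, which is the distinction that matters for the licensing question.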