sherlock-project / sherlock

Hunt down social media accounts by username across social networks
https://sherlockproject.xyz
MIT License

Do we really need to package Sherlock for various platforms? #2122

Open matheusfelipeog opened 5 months ago

matheusfelipeog commented 5 months ago

With the recent efforts to package Sherlock, this question came to mind:

Do we really need to package Sherlock for various platforms?

I personally don't think it's necessary for a few reasons:

I believe that any platform that can install Python along with pip can, in theory, install the Sherlock project without any problems. Publishing it only on PyPI seems to be the best path.

This would also be easier to maintain, due to having a single point of origin.

Sherlock doesn't seem like the type of project that needs specific packaging for each platform at the moment. The one exception is installation in Termux environments, which apparently have some issues installing pandas; I talk more about this here: https://github.com/sherlock-project/sherlock/issues/1945#issuecomment-1850757406

@sdushantha @ppfeister What do you think of these thoughts?

ppfeister commented 5 months ago

Pretty good question

I think we all agree on PyPI, so I'll skip that one


When it comes to DockerHub, currently, users who want docker one offs for some sort of workflow are required to pull and regularly maintain the source. Having the image available on DockerHub and/or ghcr is a pretty huge boon for people who containerize (including myself).

When it comes to maintenance... we already have a Dockerfile, and that's all that was really needed. Whenever there's an update, it's just a quick docker build && docker push, which can be workflow-ed as well.
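As a rough sketch of how that build-and-push step could be scripted before moving it into a workflow: the image name and tag below are illustrative assumptions, and the helper names are hypothetical rather than anything in the repository.

```python
import subprocess
from typing import List


def docker_release_commands(image: str, tag: str, context: str = ".") -> List[List[str]]:
    """Return the `docker build` and `docker push` commands for one image tag."""
    ref = f"{image}:{tag}"
    return [
        ["docker", "build", "-t", ref, context],  # build from the Dockerfile in `context`
        ["docker", "push", ref],                  # publish the freshly built tag
    ]


def run_release(image: str, tag: str, dry_run: bool = True) -> None:
    """Print the commands, or execute them when dry_run is False."""
    for cmd in docker_release_commands(image, tag):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the release if the build or the push fails
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Assumed image name and tag, for illustration only
    run_release("sherlock/sherlock", "latest", dry_run=True)
```

Keeping the command construction separate from execution makes the step easy to dry-run locally before wiring it into CI.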


So, is packaging elsewhere required? Absolutely not. It's more of a nicety than anything else. For myself, I tend to prefer native binaries when able, but again, that's only a preference. When I can find a package via dnf/apt/yum/etc, I'm somewhat more likely to use that specific package than if I had to use some other package manager. As a user, being able to run a quick ${pkg manager} update is much more pleasant than apt update && flatpak update && pip update && snap update && ... since everything wants to be on a different package manager nowadays.

When it comes to upkeep, it's not zero, but it's really not that much. When a new version is ready, I just bump the version number in the spec and press build. The actual upkeep is pretty minimal, and the only thing that's likely to break that process is if there was a massive overhaul of Sherlock that added a few dozen niche and not-yet-present-on-distro dependencies.

sdushantha commented 5 months ago

I agree that maintaining packages outside of PyPI and DockerHub is unnecessary. I suggest that we only maintain the PyPI and Docker files. Many tools I use on my Arch Linux system are installed from the Arch User Repository. However, not all of these tools keep the necessary files, such as the PKGBUILD, in the main repository. Instead, someone else maintains them in their own repository.

Currently we have the files needed for packaging Sherlock for Fedora. Is it possible to have this in a repo you own and maintain, @ppfeister? You could have a cron job running each day to check for newer versions and then update the spec file so that the Fedora package is updated.
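A minimal sketch of what such a daily job might do, assuming the spec file carries a standard `Version:` line and that the release to track is the `sherlock-project` package on PyPI. The function names and the sample version numbers are hypothetical; the comparison is a plain inequality check, not a proper version ordering.

```python
import json
import re
from urllib.request import urlopen

# PyPI's JSON endpoint for the project; the latest release is under info.version
PYPI_URL = "https://pypi.org/pypi/sherlock-project/json"


def latest_pypi_version(url: str = PYPI_URL) -> str:
    """Fetch the latest released version string from PyPI."""
    with urlopen(url) as resp:
        return json.load(resp)["info"]["version"]


def spec_version(spec_text: str):
    """Return the value of the first `Version:` line, or None if absent."""
    match = re.search(r"^Version:\s*(\S+)", spec_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def bump_spec(spec_text: str, new_version: str) -> str:
    """Rewrite the first `Version:` line, leaving the rest of the spec untouched."""
    return re.sub(
        r"^(Version:\s*)\S+",
        lambda m: m.group(1) + new_version,
        spec_text,
        count=1,
        flags=re.MULTILINE,
    )


def update_spec_file(path: str) -> bool:
    """Bump the spec at `path` if PyPI has a different version. Returns True on change."""
    with open(path, encoding="utf-8") as fh:
        text = fh.read()
    latest = latest_pypi_version()
    if spec_version(text) == latest:
        return False
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(bump_spec(text, latest))
    return True
```

A cron entry or scheduled CI job would call `update_spec_file` on the spec and trigger a rebuild only when it returns `True`.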

p-linnane commented 5 months ago

You don't need to handle all the packaging yourself. I'm a maintainer for Homebrew and the vast majority of our packages are community supported.

I'm creating a package for Sherlock right now (based off the PyPI sdist), and stumbled on this thread due to that.

ppfeister commented 5 months ago

@sdushantha Can do!

There isn't any real necessity to have the spec here, it's more so just for convenience, should anyone else want to make some tweaks. I'll make the switch.

I would like to resume the convo about switching from setuptools to poetry, when able. I'm not sure why poetry didn't seem to work but I feel like I must have been doing something wrong with it. Poetry would probably simplify packaging all around (pypi included) if it works as I've heard.


@p-linnane Great timing

That's often how it goes. I'm not sure of many projects of this size that have full coverage without community support.

The main reason I'm taking care of Fedora + EPEL is that I'm a Fedora and RHEL user myself. I wouldn't dare try to maintain a Homebrew package, or an official Arch one, for instance.

Out of curiosity, what's your targeted pkg name? Seems there's a sherlock already on Homebrew.

For the rpm, the package name is expected to be sherlock-project, due to required parity with PyPI (where a sherlock also already existed). The DockerHub image is simply sherlock/sherlock, though.

p-linnane commented 5 months ago

It will be named sherlock. You can view the PR here: https://github.com/Homebrew/homebrew-core/pull/171701. Without getting into too many details, the existing sherlock is a different type of distribution which will not conflict with this one. Once the PR is merged, brew install sherlock will install this project.

matheusfelipeog commented 5 months ago

Thank you for the responses, guys.

> I'm creating a package for Sherlock right now (based off the PyPI sdist), and stumbled on this thread due to that.

And thank you for working on this packaging, @p-linnane, we really appreciate it. I think community involvement with this is the best way forward.

> I would like to resume the convo about switching from setuptools to poetry, when able. I'm not sure why poetry didn't seem to work but I feel like I must have been doing something wrong with it. Poetry would probably simplify packaging all around (pypi included) if it works as I've heard.

@ppfeister, sorry for the delay in reviewing your PR, I've been a bit busy lately. I think I'll be able to review it more carefully later today.

ppfeister commented 5 months ago

@p-linnane Well, that's convenient. No idea how that works on Homebrew, but it makes everything pretty simple. Does it go live once that PR is merged?

@matheusfelipeog If you're busy there isn't any rush! It's just a discussion worth having at some point when everyone is able. But there isn't any immediate need. Everything works as it is today, just a planned improvement.

p-linnane commented 5 months ago

Without getting too in the weeds with how we do things...we distribute software in two ways. One is 'formulae'. These are open source projects that we build from source and produce a binary for (we call this a bottle). That's what this sherlock will be. The other is a 'cask'. This is just a pointer to a precompiled binary distributed from an upstream. Often this is for proprietary or otherwise closed source software.

Once the PR is merged, it will take roughly 15 minutes for it to be live. We regenerate the API we serve everything over every 15 minutes, so the first time it regenerates after being merged it will be ready to go.

ppfeister commented 5 months ago

That explanation actually makes a lot of sense. I had no idea what the difference between casks and bottles was, only that they existed.

Subscribed to the pr.

p-linnane commented 5 months ago

The Homebrew PR has been merged, and is now live. Any Homebrew user can now run brew install sherlock to grab this package. You may be interested in adding those instructions to your README, but I leave that up to you all.

ppfeister commented 5 months ago

Already a part of #2119!

@p-linnane Seems that updates are fairly automatic with autobump. Would dependency changes get in the way and require manual review, or is that pretty well automated as well?

p-linnane commented 5 months ago

Our autobump workflow checks for a new release on PyPI every few hours. It will automatically handle dependency changes as long as they can be resolved. If the build breaks, a Homebrew maintainer will debug and reach out here if it's something we can't fix.

mjsir911 commented 5 months ago

As a Gentoo (proxy) maintainer too, I'm interested in packaging sherlock, at least in GURU to start with. Related is #1120, since having a release tarball to download on GitHub would be useful (I can download from PyPI, but my preference is to stay as close to the source as possible).

No effort needed from your end on the gentoo side of things though! Typically packages are maintained by different people than the original package authors unless they so choose to wear both hats.

ppfeister commented 5 months ago

@mjsir911 Glad to see these popping up all of a sudden.

I just brought tags up again, so pretty good timing. My rationale was similar -- packaging as an rpm. It'd sure be nice to see.

Are you able to utilize one of the standard treeish tarballs? For instance, my rpm spec uses this method with master as the ref, pulling whatever the current HEAD is

https://github.com/sherlock-project/sherlock/archive/refs/heads/master.tar.gz

You can use a point in time/commitish ref as well, i.e.

https://github.com/sherlock-project/sherlock/archive/0ecb496ae91bc36476e3e6800aa3928c5dcd82f8.tar.gz

While not perfect, it can be automated pretty easily as well.
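To illustrate the automation, here is a small sketch that builds those archive URLs from a ref. The helper names are hypothetical; the branch and commit patterns mirror the two links above, and the tag variant assumes GitHub's usual `refs/tags/` archive path and that the repository has a matching tag.

```python
REPO = "sherlock-project/sherlock"


def branch_tarball_url(branch: str, repo: str = REPO) -> str:
    """Tarball of whatever the branch HEAD currently is (moves over time)."""
    return f"https://github.com/{repo}/archive/refs/heads/{branch}.tar.gz"


def commit_tarball_url(sha: str, repo: str = REPO) -> str:
    """Tarball pinned to one commit, so the contents stay stable for a given sha."""
    return f"https://github.com/{repo}/archive/{sha}.tar.gz"


def tag_tarball_url(tag: str, repo: str = REPO) -> str:
    """Tarball for a tag; only useful once the project publishes tags."""
    return f"https://github.com/{repo}/archive/refs/tags/{tag}.tar.gz"


if __name__ == "__main__":
    print(branch_tarball_url("master"))
    print(commit_tarball_url("0ecb496ae91bc36476e3e6800aa3928c5dcd82f8"))
```

Pinning to a commit is the closer substitute for a tagged release, since a branch tarball changes every time HEAD moves.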


Related: it seems that GURU is the Gentoo equivalent of the AUR. Would that be right? Would releases on Gentoo be feature locked for a while, as with deb, or are they readily updateable?

mjsir911 commented 5 months ago

> Are you able to utilize one of the standard treeish tarballs? For instance, my rpm spec uses this method with master as the ref, pulling whatever the current HEAD is

Yes, this is doable. Gentoo has "live" ebuilds too, based on the latest git clone. Tags are just preferable.

> it seems that GURU is the Gentoo equivalent of the AUR. Would that be right?

Yes, more or less.

> Would releases on Gentoo be feature locked for a while, as with deb, or are they readily updateable?

Gentoo makes a distinction between stable and unstable packages. Both are easily available, but unstable packages are opt-in, while stable ones are versions that haven't shown issues for a while.

mjsir911 commented 5 months ago

Other info that might be useful to document somewhere, or in the pyproject.toml file: which versions of Python is this supported under?

ppfeister commented 5 months ago

@mjsir911 So, #2111 switches to Poetry, and #2127 switches to pytest and tox (blocked by #2111).

Assuming they are merged (since nothing is guaranteed), the actively tested and supported versions would be 3.8-3.12, with regression tests run against Ubuntu, Windows, and macOS (GitHub Actions matrix).

I'll make sure to add that to the documentation somewhere in those PRs.
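For packagers who want to check an interpreter against that range, a minimal sketch follows. It assumes the 3.8-3.12 bounds above hold once the PRs land; the constant and function names are hypothetical and not part of Sherlock.

```python
import sys

# Assumed bounds, taken from the planned test matrix (3.8 through 3.12)
MIN_TESTED = (3, 8)
MAX_TESTED = (3, 12)


def is_tested_python(version_info=None) -> bool:
    """True if the (major, minor) pair falls inside the tested range."""
    major_minor = tuple((version_info or sys.version_info)[:2])
    return MIN_TESTED <= major_minor <= MAX_TESTED


if __name__ == "__main__":
    status = "inside" if is_tested_python() else "outside"
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is {status} the tested range")
```

Versions outside the range may still work; the check only says whether a version is covered by the regression tests.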