sherlock-project / sherlock

Hunt down social media accounts by username across social networks
https://sherlockproject.xyz
MIT License
59.38k stars 6.81k forks

improve user journey for scripting #2121

Open a-wallen opened 4 months ago

a-wallen commented 4 months ago

Description

Sherlock is great from the CLI, but less so when you try to write a script that uses it. Here's why.

  1. The documentation does not give clear instructions for calling the sherlock function with its required arguments. Here's an example. The instructions do not clearly state how to provide a default argument, or how to find the class that needs to be implemented.
  2. Just like the CLI tool, the sherlock function should be able to work with just the username argument. For example: the site_data could be defaulted to data.json.
  3. I don't want to suggest that the query_notifier parameter be defaulted. It's fine the way it is. However, there should be a separate implementation called QueryNotifierMap that lets you access the results as a map. QueryNotifyPrint is for the CLI; developers want to call this with just an object.
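
A rough sketch of what point 3 could look like. Everything here is an assumption for illustration: the start/update/finish method names, the `FakeResult` stand-in, and its `site_name` and `status` fields are hypothetical and are not taken from Sherlock's actual notifier code, so the real base class should be checked before adapting this.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Hypothetical stand-in for the result object handed to a notifier."""
    site_name: str
    status: str


class QueryNotifierMap:
    """Collects results in a dict instead of printing them.

    Assumes a start/update/finish notifier interface (illustrative only).
    """

    def __init__(self):
        self.results = {}

    def start(self, message=None):
        # Reset state at the start of each username search.
        self.results = {}

    def update(self, result):
        # Key each result by site name so callers can look it up directly.
        self.results[result.site_name] = result

    def finish(self, message=None):
        # Hand the collected map back to the caller.
        return self.results


notifier = QueryNotifierMap()
notifier.start("a-wallen")
notifier.update(FakeResult("GitHub", "claimed"))
notifier.update(FakeResult("Twitter", "available"))
found = notifier.finish()
```

With something like this, a script could read `found["GitHub"].status` directly rather than parsing printed output.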

matheusfelipeog commented 4 months ago

Hi @a-wallen, thanks for opening this issue ;)

Sherlock is essentially a tool meant to be used via the CLI, which is why there isn't more comprehensive documentation of its internal components.

I personally like the idea of providing clear and documented ways to integrate Sherlock into other projects; that's actually a good idea. We can improve on that over time, although it's not one of our priorities right now.

a-wallen commented 4 months ago

@matheusfelipeog I was planning on submitting the following PRs

Hit reply to this and just "X" the ones that you want me to do.

ppfeister commented 4 months ago

While waiting for @matheusfelipeog or @sdushantha... wanted to add:

I don't believe that this documentation should be in the readme. That doc can easily become cluttered with material that doesn't apply to 90% of users. I would suggest /docs/plumbing/${stuff}.md, which can be linked to from the readme. For a basic example of what I mean, I've done something similar with the Installation section of the new readme, linking to /docs/install.md. Just keeps things tidy.

Documentation within the code is highly valued as well, as you mention, since external documentation can easily become out of date. Aside from docstrings, it would be nice to have more type hints for arguments and return values.

Would like to see if the others agree.

ppfeister commented 4 months ago

Sherlock is packaged in several places and isn't likely to be run as a single script. It's more likely that people will install the package itself via pip or similar.

When using the package, people aren't calling the sherlock function. They're calling the sherlock package. In order to call the sherlock function it'd have to be something like

from sherlock import sherlock # module within package
sherlock.sherlock(usernames) # function within module

quite verbose when

import sherlock # whole package
sherlock(usernames) # using package entry point

is probably preferred in 99% of cases.

I believe main() is the current package entry point. main() should currently take a ton of parsed sys args rather than function args. Would you be able to somehow adapt it so the entry point accepts both command line args and normal scripted function args? How difficult do you think that adaptation would be?
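
One way the adaptation asked about above might be approached, as a minimal sketch only: keep `main()` as a thin argparse wrapper and move the work into a plain function that scripts can call. The names `search` and `sherlock-sketch`, the flags shown, and the placeholder return value are all hypothetical and do not reflect Sherlock's actual code.

```python
import argparse


def search(usernames, timeout=60, sites=None):
    """Hypothetical scripted entry point taking normal function arguments."""
    if isinstance(usernames, str):
        usernames = [usernames]
    # Placeholder: a real implementation would run the lookups here.
    return {name: {"timeout": timeout, "sites": sites} for name in usernames}


def main(argv=None):
    """CLI entry point: parse arguments, then delegate to search().

    With argv=None, argparse reads sys.argv[1:], so a console script keeps
    working, while tests and scripts can pass a list explicitly.
    """
    parser = argparse.ArgumentParser(prog="sherlock-sketch")
    parser.add_argument("usernames", nargs="+")
    parser.add_argument("--timeout", type=float, default=60)
    parser.add_argument("--site", action="append", dest="sites")
    args = parser.parse_args(argv)
    return search(args.usernames, timeout=args.timeout, sites=args.sites)


# Both paths end up in the same function with the same arguments.
from_cli = main(["a-wallen", "--timeout", "10"])
from_script = search("a-wallen", timeout=10)
```

The design choice here is that the CLI layer only translates strings into typed arguments, so anything reachable from the command line is also reachable from a script.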

a-wallen commented 4 months ago

@ppfeister yup you're on point: this is what I think you mean https://github.com/sherlock-project/sherlock/pull/2143