Open 21Bruce opened 9 months ago
Confirmed that resy is using captcha by analyzing the homepage's source. Specifically, reCAPTCHA from Google. More analysis of the booking page leads me to believe that v3 is in use.
Hi! I've been stalking this project for a few months now, and I'd like to possibly contribute by taking a crack at this.
I see a branch has been cut already, but I don't see any work committed to it yet. Not sure if you've started on this and just haven't pushed any of your changes, or if that branch is just a placeholder for work to land eventually, but if this is unclaimed I'd be happy to take a stab at it.
Lmk if the above comments are missing any additional information that might be good to know, like a specific idea or direction for how this should be accomplished/implemented, or any other findings that might be relevant.
@tshamz We'd be happy to review any code contributions you have. However, I have a few caveats. The contribution etiquette document is super outdated - I wrote it (when this project was private) for new people I recruited IRL. Some of the information is still relevant: I'd like solutions that do not use third-party libraries and that keep the documentation reasonably up to date. The stuff about creating branches is not relevant, since not everyone has write access to the source code repo. The way we review code now for non-contributors is via diffs sent in the comments of the GitHub issue (like this one), or pull requests from forks. So, you're welcome to take a stab at any part of the bot and send a diff in the right issue thread (if there isn't an issue thread, you can create one and I'd be happy to review that as well) or make a PR to the right issue branch.
Finally, and hopefully this is not taken as discouragement, but this specific problem is decently complicated for a first issue, and relative to the knowledge required to make the original bot, it is incredibly convoluted. It requires some theoretical background in machine learning, specifically enough to understand reinforcement learning, and some more advanced knowledge of HTTP networking. If you do decide to send diffs, perhaps the best strategy would be to pick a problem that can be accomplished with relatively less knowledge, since you haven't done any dev work on the bot yet. Then, once you are familiar with the bot internals, and hopefully the new branch is up to speed, you can judge what you feel is the right level of challenge for you. We will be posting relevant information on the dev process of this issue here, much like issue #7. In the meantime, if you wanted a list of good first issues, here are a few, in order of importance:
Adding table options. We've gotten a request from a user for this specific feature. We'd want it to be something like a command line option, specifying indoor vs outdoor etc, but this would require effort at the networking layer as well, though I'd imagine not a significantly difficult amount of work.
Simplifying the string manipulation code in api/resy/api.go. Specifically, I'd like to see what this looks like with fmt.Sprintf; I think it'd be way simpler.
Integrating the OpenTable API with the rest of the app. Currently the OpenTable API works as a standalone thing, like if you wanted to write a hardcoded program to interact with OpenTable. The issue is we don't have the right semantics to handle the differences between logins from OpenTable and resy, so integrating it is not easy yet. This isn't super important since we've received no real requests for this from users.
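To make the fmt.Sprintf item above a bit more concrete, here is a minimal sketch of the kind of change meant. The function names and the query parameters (`venue_id`, `day`, `party_size`) are made up for illustration and are not taken from api/resy/api.go; the real code may build different strings.

```go
package main

import (
	"fmt"
	"strconv"
)

// findQueryConcat is a hypothetical example of the manual concatenation
// style: every integer needs its own strconv call and the format is hard
// to see at a glance.
func findQueryConcat(venueID int, day string, partySize int) string {
	return "venue_id=" + strconv.Itoa(venueID) +
		"&day=" + day +
		"&party_size=" + strconv.Itoa(partySize)
}

// findQuerySprintf builds the same string with fmt.Sprintf, so the whole
// layout lives in one format string.
func findQuerySprintf(venueID int, day string, partySize int) string {
	return fmt.Sprintf("venue_id=%d&day=%s&party_size=%d", venueID, day, partySize)
}

func main() {
	// Both calls print the same query string.
	fmt.Println(findQueryConcat(123, "2024-05-01", 2))
	fmt.Println(findQuerySprintf(123, "2024-05-01", 2))
}
```

If the values can contain characters that need escaping, the standard library's net/url package (`url.Values` and its `Encode` method) is probably a safer fit than either version.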
And as a very last note - no issue regarding the bot is really 'claimed' by any group, you can submit work to any issue and we'll look at it. The issue threads that are up right now are outdated; I'll remove a decent number of them today.
Why don't we just use a third-party vendor for solving reCAPTCHA? Even if we solve it this time, reCAPTCHA is continuously evolving its anti-bot measures (especially since this is an open source project).
To avoid spending endless effort on the task, I would suggest relying on a third-party solver.
@chanyk-joseph Yeah, no. I'm not paying monthly for the bot to use a click farm in some third world country so you can get a reservation. As you mention, it's open source, so if you want you can pay for a third-party captcha solver, clone the repo and add it to the bot for your individual copy...
Is there an existing issue for this feature?
Description of the problem
With the increasing prevalence of bot solutions, resy has started using captcha on higher-demand reservations. If this pattern continues, the critical factor in making reservations will most certainly be the ability to complete a captcha. This is a fantastic blessing in disguise. For one, there are well-documented machine learning algorithms which can classify captchas at an extremely high rate. Second, since most of the bots out there are not actively maintained, they'll be completely useless compared to a bot that handles captcha. Third, and even better, an ML captcha system is probably far faster than the vast majority of people trying to solve a captcha.
Despite these opportunities, implementing captcha accurately is tough. We first have to decipher the networking calls, which is pretty hard since we have to create an event in the browser that we can monitor and study, and these captchas are appearing very infrequently. I'd assume they appear on harder reservations, which makes it harder for us to reproduce and test our implementation. Furthermore, once we have the networking calls down, there are a number of captcha tasks. Some involve typing letters and numbers, some involve selecting photos, and others involve checking a box and then moving the mouse in such a way that Google thinks a human is behind the IO. We will need algorithms for these separate functions.
Planned Solution
Add the networking checks and calls to resy's reserve function, and create a separate top-level package for the ML code.
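As a rough sketch of how that split could look, assuming nothing about the real networking calls (which are still unknown): the `Solver` interface, the `reserveResponse` fields, `ErrNoSolver`, and the `attempt` callback below are all hypothetical names invented for illustration, and `stubSolver` stands in for whatever the ML package would eventually provide.

```go
package main

import (
	"errors"
	"fmt"
)

// Solver is a hypothetical interface that a separate top-level captcha
// package could expose to the reserve code.
type Solver interface {
	Solve(siteKey string) (token string, err error)
}

// reserveResponse is a hypothetical, simplified view of a booking response.
// The real response format has not been worked out yet.
type reserveResponse struct {
	CaptchaRequired bool
	SiteKey         string
}

// ErrNoSolver is returned when a captcha is demanded but no solver is set.
var ErrNoSolver = errors.New("captcha required but no solver configured")

// reserve tries the booking once without a token. If the response asks for
// a captcha, it gets a token from the solver and retries exactly once.
// attempt stands in for the actual HTTP call.
func reserve(attempt func(token string) (reserveResponse, error), s Solver) error {
	resp, err := attempt("")
	if err != nil {
		return err
	}
	if !resp.CaptchaRequired {
		return nil
	}
	if s == nil {
		return ErrNoSolver
	}
	token, err := s.Solve(resp.SiteKey)
	if err != nil {
		return fmt.Errorf("solving captcha: %w", err)
	}
	resp, err = attempt(token)
	if err != nil {
		return err
	}
	if resp.CaptchaRequired {
		return errors.New("captcha token rejected")
	}
	return nil
}

// stubSolver is a placeholder that returns a fake token.
type stubSolver struct{}

func (stubSolver) Solve(siteKey string) (string, error) {
	return "token-for-" + siteKey, nil
}

func main() {
	// Fake booking call: demands a captcha unless a token is supplied.
	attempt := func(token string) (reserveResponse, error) {
		if token == "" {
			return reserveResponse{CaptchaRequired: true, SiteKey: "demo-key"}, nil
		}
		return reserveResponse{}, nil
	}
	fmt.Println(reserve(attempt, stubSolver{}))
	fmt.Println(reserve(attempt, nil))
}
```

Keeping the solver behind an interface would let the networking checks be tested with a stub like this before any ML work exists.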
Alternatives
None really, maybe somehow displaying the captcha to the user in the terminal, but that seems pathological.
Solution Specifics
There are a few papers online about breaking captchas. For analyzing network calls, we'll start with the common Firefox/Postman method of breaking and then modify it if that doesn't work.