berstend / puppeteer-extra

💯 Teach puppeteer new tricks through plugins.
https://extra.community
MIT License
6.33k stars, 737 forks

We now have co-maintainers / State of the project #238

Closed (berstend closed this issue 4 years ago)

berstend commented 4 years ago

Hi all,

unfortunately I've been unable to maintain this project for the past months, due to unforeseen reasons (more long-form explanation here).

Given the popularity of puppeteer-extra this had quite an impact, with no PRs and updates being merged and published.

Despite things going back to normal on my end, I'm therefore looking for contributors who might be interested in taking on a more active role on this project (triaging issues, merging PRs, pushing new versions). Stealth & detection evasion techniques in particular are fast-paced and quite a burden for a single maintainer.

In order to ensure the viability and longevity of puppeteer-extra it's crucial that updates don't hinge on the availability of a single person. Given the subject matter, this project also attracts a lot of less experienced developers, resulting in a lot of "noise" in the issue tracker that still needs to be managed.

Let me know if you'd be interested in supporting this project more actively, I'll also reach out to contributors that stood out in the past. :)

Thanks!

edit: we now have 6 awesome collaborators who stepped up. 🎉 If you feel you'd be interested in participating as well just mention it here. :-)


Update: I'm currently doing the following:

brunogaspar commented 4 years ago

I'm interested in helping, since I was planning to release a super light version with mainly the stealth techniques and some minor improvements, but since the intention is to move forward here, I don't see the need to have a separate package.

But either way, having the project moving forward would be quite nice :)

Thanks!

itsdarrylnorris commented 4 years ago

I am interested in co-maintaining.

andrew-healey commented 4 years ago

I'm also interested, mostly because I want to ensure that puppeteer-extra-plugin-stealth evasions are added as quickly as bot detection software adds new detection methods (to prevent #227 from happening again).

prescience-data commented 4 years ago

It's definitely become a critical package for a lot of people, so having a small, experienced team instead of the risk of relying on a single person will be incredible and really add to the pace of improvements!

jozsi commented 4 years ago

I'm interested if we make it playwright compatible - as I am moving slowly away from puppeteer.

LE: Apparently there is an issue (#144) for this and @brunogaspar made a fork 🎉

Niek commented 4 years ago

I'd also be interested in helping out. Getting up-to-date with Puppeteer 5 is most important right now.

I started a similar package for Playwright (https://github.com/Niek/playwright-addons) but if this project is active again it might make sense to add Playwright support in this repo.

alex-yliu commented 4 years ago

I'm interested in helping out. Count me in

berstend commented 4 years ago

Awesome, thanks a lot for the interest and reassuring feedback - much appreciated. :-)

I'm getting up to speed with the current state of things (e.g. playwright) this weekend and will start making adjustments. Given that we can lock PR merging to require 2 approvals, I don't see an issue with inviting devs who haven't contributed code before.

@Niek - great to see you're interested :-)

I will update this ticket with a public list of next steps/todos once I have an overview.

Just to clarify: I'll still be active and involved in the project. Given the speed of progress required in this space, it's just the right time to open things up.

Looking forward to the massive updates that'll follow. 🚀

moltar commented 4 years ago

I'm interested as well. Can help with the CI setup also. Have a lot of experience with GH Actions. We can set up Semantic Release to auto-publish to npm.

berstend commented 4 years ago

> I'm interested as well. Can help with the CI setup also. Have a lot of experience with GH Actions. We can set up Semantic Release to auto-publish to npm.

Sounds good @moltar - definitely makes sense to migrate to GH actions. Being a monorepo using lerna makes this project a bit special, but I had a quick look yesterday and found this writeup that uses changesets to create a publishing PR. It looks like the changeset generation needs to be triggered locally, but I like that publishing to npm is a conscious effort made by merging the changeset PR.

What do you think of this approach? :-)

berstend commented 4 years ago

FYI: I've started adding collaborators, currently: @Niek @brunogaspar @jozsi @moltar @itsdarrylnorris @Sesamestrong

@YunfengLiu I haven't added you as a collaborator yet as your GH profile looked a bit empty, thanks for stepping up and feel free to participate in the project though. :-)

The master branch is protected: merging into it requires a PR with 2 reviews (tbd).

The immediate next steps from my side:

Let me know if I forgot something important. :-)

berstend commented 4 years ago

New Puppeteer 5 compatible versions (still backwards compatible) are now on npm. Had to fight Travis a bit to get the tests green and look forward to replacing all that with a new GH actions based flow with auto-publishing. 😄

I feel setting up a robust new CI pipeline is one of the most important next steps as it's affecting all future work & collaboration.

Unsorted things I've additionally noticed while catching up:

If anyone is interested in spearheading one of these efforts or has other things just mention it. :-)

Niek commented 4 years ago

Regarding dependency management: I have good experience with dependabot, it's part of GitHub now and very well integrated. If you want, I'll make the PR. Edit: oops, I see you're already using it, but as an external service (not in .github/dependabot.yml). Might make sense to move it internally.

And totally agree on GH actions: for stuff like docs on packaging it's really nice.

berstend commented 4 years ago

> I see you're already using it, but as an external service (not in .github/dependabot.yml). Might make sense to move it internally.

Yeah, I tried setting it up a while ago but ran into some issues (monorepo, CI not being reliable) and left it there as other things came up. :-) Back then renovate bot was supposedly suited better for monorepos, but that might've changed since then.

I've just removed the Dependabot and Renovate bot app integrations to have a clean slate, feel free to add proper support for either one of them (dependabot looks like a winner though, due to 1st class GH integration). :-)

berstend commented 4 years ago

Thanks for reviewing/merging in PRs @Niek @brunogaspar 👍

Until we've moved to Github actions I'll publish the new package versions manually.

Everything in master (up to https://github.com/berstend/puppeteer-extra/commit/dc8b90260a927c0c66c4585c5a56092ea9c35049, #165) is now live:

Successfully published:
 - puppeteer-extra-plugin-adblocker@2.11.5
 - puppeteer-extra-plugin-anonymize-ua@2.2.11
 - puppeteer-extra-plugin-block-resources@2.2.6
 - puppeteer-extra-plugin-click-and-wait@2.2.6
 - puppeteer-extra-plugin-devtools@2.2.11
 - puppeteer-extra-plugin-flash@2.2.7
 - puppeteer-extra-plugin-font-size@2.2.6
 - puppeteer-extra-plugin-recaptcha@3.1.13
 - puppeteer-extra-plugin-repl@2.2.6
 - puppeteer-extra-plugin-stealth@2.4.12
 - puppeteer-extra-plugin-user-data-dir@2.2.6
 - puppeteer-extra-plugin-user-preferences@2.2.6
 - puppeteer-extra-plugin@3.1.6
 - puppeteer-extra@3.1.12

edit: a few more changes went live.

ghost commented 4 years ago

Another bit of food for thought - A lot of people seem to be switching to Brave browser which allows for "Tips" https://brave.com/tips/ Would be great to be able to Tip the project! Either that or a Patreon etc?

evading-bot-detection commented 4 years ago

This is great news, looking forward to the updates.

xD-saleem commented 4 years ago

I'm interested in helping out and helping to maintain. Count me in

momala454 commented 4 years ago

You still don't want to set up a private place to discuss? I don't really want to give more info to the other side ^^

brunogaspar commented 4 years ago

Why not make a private discord server and invite us for now? That way it's easier to share this stuff and also discuss other improvements or changes before attempting any pull requests.

berstend commented 4 years ago

> Why not make a private discord server and invite us for now? That way it's easier to share this stuff and also discuss other improvements or changes before attempting any pull requests.

I like the idea of it, let's see if there's enough interest for it. Will set up a discord over the weekend, not sure how to best send out invites though (Github really needs a messaging feature) 🤔

edit: Actually we can just make it public with a private channel, lol

brunogaspar commented 4 years ago

> edit: Actually we can just make it public with a private channel, lol

Yes, that would work :)

itsdarrylnorris commented 4 years ago

I love the idea of the discord server.

jozsi commented 4 years ago

I'd go with Slack instead.

berstend commented 4 years ago

> I'd go with Slack instead.

Yeah, I was wondering about that as well. Personally I have a slight preference towards Slack (due to being more involved in existing Slack channels already). :-)

I was trying to figure out what's more common in the (open source) community and the results are a bit inconclusive 😄

Also, it could make sense to set up a more generic "browser automation/scraping enthusiasts" chat to discuss things (with internal channels for our needs), as I'm not sure how long-lived the puppeteer-extra project name will be (with future playwright support on the horizon, etc). :-)

moltar commented 4 years ago

I'd say Discord channels are becoming more popular these days, especially for the frontend stuff.

brunogaspar commented 4 years ago

I use both, but the main reason for suggesting Discord is the trend of open source projects and communities moving to it from Slack.

But either way works fine. Unless someone wants to use IRC or perhaps Skype? 😅

itsdarrylnorris commented 4 years ago

I ended up creating a generic Discord server for web scraping. I have also created a public channel for "puppeteer-extra," so we can chat about puppeteer-extra. If needed, I can make the puppeteer-extra channel private, but it is public for now.

moltar commented 4 years ago

I've joined, but I am not seeing any rooms.

moltar commented 4 years ago

just "welcome"

itsdarrylnorris commented 4 years ago

Hey @moltar ,

The welcome room should have the instructions on how to give you access to the rest of the rooms. You need to accept the guidelines.

berstend commented 4 years ago

@itsdarrylnorris yay to duplicate efforts 😄

I've set up a discord here: https://discord.gg/vz7PeKk

We need private channels and a certain structure to get the most use from this. :-) Ping me to get a Maintainer or Insider role (to discuss things more privately).

Edit: To be added to the private Insider discussion channels we require the user to prove their Github username ownership (must be a user who's been active in our repo before) - just ping any Moderator who's online for that :-)

berstend commented 4 years ago

I'm closing this issue, as we're pushing progress again with full steam now. :-)

Since my initial call for support, the people who offered to help have triaged & closed issues, reviewed & created pull requests, participated in community discussions and worked on new things and fixes - thanks a lot 👍

In addition, the community discord is quite active and serves its purpose.

We also shipped a couple of updates and have large rewrites & improvements in the works (#291, #292) as well as internal projects.

My local playwright-extra proof-of-concept takes shape as well, so the future looks bright 😄