spatie / crawler

An easy to use, powerful crawler implemented in PHP. Can execute Javascript.
https://freek.dev/308-building-a-crawler-in-php
MIT License
2.51k stars 357 forks

Support for symfony v4? #333

Closed Defcon0 closed 3 years ago

Defcon0 commented 3 years ago

Hello,

Thanks for the great crawler. I'm now using it for a larger project. Unfortunately, we can't move to Symfony 5 because the CMS we're using depends on Symfony v4.

Was there really a hard reason to drop support for Symfony v4, the currently active LTS version? The framework is supported until November 2023 (!), yet v5 of this bundle drops support for it?

Was a version that runs on Symfony v4 not possible? What are the hard benefits of dropping support? In fact, even Symfony 3 is supported until the end of 2021.

The dependency policy here seems quite strict. Most professional developers only use LTS versions, so Symfony v5 is far from being the standard at the moment.

Of course we could just use an older version of this plugin, but it seems to have problems with larger websites, and therefore we might need to use v5, don't we? (see https://github.com/spatie/crawler/pull/331) I guess such features and bug fixes are not backported to the old versions, are they?

Thanks in advance for the clarification.

Bye Defcon0

freekmurze commented 3 years ago

Hi,

we don't use Symfony 4 for our projects, so it makes no sense for us to spend time on it. Feel free to fork this package and adapt it to your particular needs.

Defcon0 commented 3 years ago

Too bad, since this bundle is very popular and many people outside of your company are using it. Forks aren't a solution because then we're no longer on the update track.

I understand your point though.

Redominus commented 3 years ago

Hi @Defcon0, #331 is only a problem if you want to use the crawler as a serverless application (it wasn't conceived that way, and that PR is a change in that direction). I have crawled sites with 2.5 million unique URLs in one go without a problem, and have run 40 concurrent crawlers at the same time using less than one processor.

About the Symfony version of the dom-crawler package: @freekmurze, do we really need v5? There was no change in the API we use from that package, just a version update. @Defcon0, have you tried installing the new version of the crawler to check whether there are conflicts?

Defcon0 commented 3 years ago

Ah, I see. The PR sounded as if there was a problem in the module.

I haven't tried it yet. But if the API didn't change, I guess the dependency version was bumped for no good reason.

At our company we write modules as well and release them to the open source community. We always ask ourselves whether a BC break is really necessary and brings enough benefit to be worth it. If so, we then try to keep at least support for the current LTS versions so that as few users as possible are excluded.

(The same goes for requiring PHP v7.4, which also might not be strictly necessary but excludes a large number of users.)

freekmurze commented 3 years ago

Feel free to submit a PR that adds v4 of the Symfony crawler without changing any of the package code. If all our tests pass, I'd consider merging it in.

Defcon0 commented 3 years ago

OK, I'll try to look into it, but due to time constraints this might take a while.