crwlrsoft/crawler
Library for Rapid (Web) Crawler and Scraper Development
https://www.crwlr.software/packages/crawler
MIT License
325 stars, 11 forks
issues
Improve warning messages for pre run validation
#164
otsch
closed
3 weeks ago
0
Add the keep methods to the StepInterface
#163
otsch
closed
3 weeks ago
0
Make maxOutputs() work with `Group` steps
#162
otsch
closed
1 month ago
0
Cache ttl prolongation
#161
otsch
closed
1 month ago
0
Remove deprecated HttpLoader methods
#160
otsch
closed
1 month ago
0
Remove/change deprecated paginator stuff
#159
otsch
closed
1 month ago
0
Remove addToResult() and multiple loaders
#158
otsch
closed
2 months ago
0
v2.0
#157
otsch
opened
2 months ago
0
Restrict retrying cached error responses
#156
otsch
closed
2 months ago
0
Fix typos
#155
szepeviktor
closed
2 months ago
0
New paginator stop rules
#154
otsch
closed
2 months ago
0
Add URL refiners
#153
otsch
closed
2 months ago
0
Mozilla 5.0 compatible user agent
#152
otsch
closed
2 months ago
0
Centralize gzip compression
#151
otsch
closed
2 months ago
0
Detect gzip encoding in Http
#150
szepeviktor
closed
2 months ago
2
Prevent PHP warnings
#149
otsch
closed
2 months ago
0
Apply cache filter also for usage of cached responses
#148
otsch
closed
2 months ago
0
Replace browserHelper() with browser()
#147
otsch
closed
3 months ago
0
Fix issue with setting chrome executable
#146
otsch
closed
3 months ago
0
Also add a getTimeout() function
#145
otsch
closed
3 months ago
0
Allow further headless chrome configuration
#144
otsch
closed
3 months ago
0
Question regarding "Failed to load % cURL error 60: SSL: no alternative certificate subject name matches target host name"
#143
severfire
closed
4 months ago
1
keep() instead of addToResult() and sub crawlers
#142
otsch
closed
4 months ago
0
Fail soft when input key is missing
#141
otsch
closed
6 months ago
0
Fix issue in Http::crawl() step
#140
otsch
closed
6 months ago
0
JSON step improvement
#139
otsch
closed
7 months ago
0
Fix issue with cache filters and redirects
#138
otsch
closed
7 months ago
0
Enable updating cached responses via the Loader
#137
otsch
closed
7 months ago
0
Enable adding to result from nested output
#136
otsch
closed
7 months ago
0
Merge HttpBaseLoader back to HttpLoader
#135
otsch
closed
8 months ago
0
Fix issue in Sitemap::getUrlsFromSitemap()
#134
otsch
closed
8 months ago
0
Process URLs from sitemap in chunks
#133
derjochenmeyer
closed
7 months ago
9
HttpLoader Separation
#132
otsch
closed
8 months ago
0
Documentation request
#131
derjochenmeyer
closed
8 months ago
2
New DomQuery::formattedText() method
#130
otsch
closed
8 months ago
0
Enable manipulating nested query params
#129
otsch
closed
8 months ago
0
Fix reading uncompressed files
#128
otsch
closed
9 months ago
0
Fix paginating with multiple initial inputs
#127
otsch
closed
9 months ago
0
Add forgotten getter method
#126
otsch
closed
10 months ago
0
Validate CSS Selectors and XPath Queries earlier
#125
otsch
closed
10 months ago
0
Symfony 7 support
#124
otsch
closed
10 months ago
0
Run tests also in PHP 8.3 in CI
#123
otsch
closed
11 months ago
0
Query params paginator and paginator improvements
#122
otsch
closed
11 months ago
0
Flexible Auto-Retries for any kind of error responses (4xx, 5xx)
#121
otsch
opened
1 year ago
2
Enable the use of Proxies
#120
otsch
closed
1 year ago
0
Fixes in Http steps
#119
otsch
closed
1 year ago
0
Fix tracking request end for redirected requests
#118
otsch
closed
1 year ago
0
New onCacheHit Loader hook
#117
otsch
closed
1 year ago
0
Move Microseconds class to utils package
#116
otsch
closed
1 year ago
0
Fix throttling when using the headless browser
#115
otsch
closed
1 year ago
0