WhichBrowser / Parser-PHP

Browser sniffing gone too far — A useragent parser library for PHP
http://whichbrowser.net
MIT License

Update `cURL` to detect more packages #631

Closed. summercms closed this 2 years ago

summercms commented 3 years ago

Link: https://github.com/WhichBrowser/Parser-PHP/blob/master/data/applications-bots.php#L386

/^curl\/([0-9.]*)/u

Currently the regex is as above. I see two issues with it straight away:

  1. It only detects curl at the very start of the user agent string, because of the ^ anchor.

  2. It only detects the lowercase spelling.

It would be better to change the code to the following:

/curl\/([0-9.]*)/ui

The above will then match any of the following spellings, anywhere in the user agent string:

curl
Curl
cURL
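
To make the difference concrete, here is a minimal PHP sketch that runs both patterns from this PR against a few user agents from the list further down. The `curlVersion` helper is hypothetical and only for illustration; it is not part of WhichBrowser.

```php
<?php
// The two patterns quoted in this PR: the current one and the proposed one.
$current  = '/^curl\/([0-9.]*)/u';
$proposed = '/curl\/([0-9.]*)/ui';

// Sample user agents taken from the list in this PR.
$samples = [
    'curl/7.54.0',
    'PycURL/7.19.5',
    'app_process64 (unknown version) curl/7.59.0',
    'libcurl/7.54.1 r-curl/2.8.1 httr/1.3.1',
    'CURL (Link validity checker)',
];

// Hypothetical helper: return the captured version, or null when the pattern does not match.
function curlVersion(string $pattern, string $ua): ?string
{
    return preg_match($pattern, $ua, $m) === 1 ? $m[1] : null;
}

foreach ($samples as $ua) {
    printf(
        "%-45s current=%s proposed=%s\n",
        $ua,
        var_export(curlVersion($current, $ua), true),
        var_export(curlVersion($proposed, $ua), true)
    );
}
```

In this sketch, a user agent without a `curl/` token, such as `CURL (Link validity checker)`, stays unmatched by both patterns. For `libcurl/7.54.1 r-curl/2.8.1 httr/1.3.1` the proposed pattern captures the libcurl version, since that is the leftmost match.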

Some things to note: the curl regex is near the bottom of the applications-bots.php file, so the rules above it get a chance to match first. Take a user agent like the following:

serpstatbot/1.0 (advanced backlink tracking bot; curl/7.58.0; http://serpstatbot.com/; abuse@serpstatbot.com)

This should match serpstatbot/1.0 before curl/7.58.0 and be identified correctly, which it is, as the existing test at the following location shows:

https://github.com/WhichBrowser/Parser-PHP/blob/master/tests/data/bots/generic.yaml#L722
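
The ordering argument above can be sketched as a first-match-wins loop. This is a simplified illustration under stated assumptions: the `$rules` array, the `identify` helper, and the serpstatbot pattern are all hypothetical and do not reflect the actual shape of the WhichBrowser data file or its matching code.

```php
<?php
// Hypothetical, simplified rule list: the more specific bot sits above the generic curl rule.
$rules = [
    ['name' => 'serpstatbot', 'regexp' => '/serpstatbot\/([0-9.]*)/ui'],
    ['name' => 'curl',        'regexp' => '/curl\/([0-9.]*)/ui'],
];

// Hypothetical helper: walk the rules in order and return the first one that matches.
function identify(array $rules, string $ua): ?array
{
    foreach ($rules as $rule) {
        if (preg_match($rule['regexp'], $ua, $m) === 1) {
            return ['name' => $rule['name'], 'version' => $m[1]];
        }
    }
    return null;
}

$ua = 'serpstatbot/1.0 (advanced backlink tracking bot; curl/7.58.0; http://serpstatbot.com/; abuse@serpstatbot.com)';
var_export(identify($rules, $ua));
```

With the rules in this order, the serpstatbot rule wins even though the unanchored curl pattern would also match the same string. Reversing the two entries would report curl 7.58.0 instead.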

Below are some common UAs found on the internet. I have cherry-picked some of them to add to the tests folder with this PR.

| User agent | Version | OS | Hardware type | Popularity |
| --- | --- | --- | --- | --- |
| curl/7.54.0 | 7 | -- | | Very common |
| PycURL/7.43.0.2 libcurl/7.47.0 OpenSSL/1.0.2g zlib/1.2.8 libidn/1.32 librtmp/2.3 | 7 | -- | | Very common |
| curl/7.64.1 | 7 | -- | | Very common |
| curl/7.20.0 (x86_64-redhat-linux-gnu) libcurl/7.20.0 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5 | 7 | Linux | Computer | Common |
| curl/7.16.4 (x86_64-pc-linux-gnu) libcurl/7.16.4 OpenSSL/0.9.8o zlib/1.2.3 | 7 | Linux | Computer | Common |
| serpstatbot/1.0 (advanced backlink tracking bot; curl/7.58.0; http://serpstatbot.com/; abuse@serpstatbot.com) | 7 | -- | | Common |
| curl/7.47.0 | 7 | -- | | Common |
| curl/7.29.0 | 7 | -- | | Common |
| curl/7.35.0 | 7 | -- | | Common |
| curl/7.58.0 | 7 | -- | | Average |
| curl/7.64.0 | 7 | -- | | Average |
| curl/7.38.0 | 7 | -- | | Average |
| curl/7.15.5 (x86_64-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5 | 7 | Linux | Computer | Average |
| curl/7.68.0 | 7 | -- | | Average |
| PycURL/7.43.0.3 libcurl/7.66.0 OpenSSL/1.1.1d zlib/1.2.11 brotli/1.0.7 libidn2/2.2.0 libpsl/0.20.2 (+libidn2/2.0.5) libssh2/1.8.0 nghttp2/1.39.2 librtmp/2.3 | 7 | -- | | Average |
| curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.16.2.3 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2 | 7 | Linux | Computer | Average |
| curl/7.17.1 (mips-unknown-linux-gnu) libcurl/7.17.1 OpenSSL/0.9.8i zlib/1.2.3 | 7 | Linux | Computer | Average |
| app_process64 (unknown version) curl/7.59.0 | 7 | -- | | Average |
| curl/7.43.0 | 7 | -- | | Average |
| curl/7.52.1 | 7 | -- | | Average |
| curl/7.24.0 (amd64-portbld-freebsd8.3) libcurl/7.24.0 OpenSSL/0.9.8q zlib/1.2.3 | 7 | FreeBSD | Computer | Average |
| curl/7.21.3 (amd64-portbld-freebsd8.2) libcurl/7.21.3 OpenSSL/0.9.8q zlib/1.2.3 | 7 | FreeBSD | Computer | Average |
| PycURL/7.19.5 | 7 | -- | | Average |
| curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3 | 7 | Linux | Computer | Average |
| curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.19.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2 | 7 | Linux | Computer | Average |
| curl/7.26.0 | 7 | -- | | Average |
| libcurl/7.54.1 r-curl/2.8.1 httr/1.3.1 | 7 | -- | | Average |
| curl/7.55.1 | 7 | -- | | Average |
| curl/7.61.1 | 7 | -- | | Average |
| curl/7.47.1 | 7 | -- | | Average |
| curl/7.65.3 | 7 | -- | | Average |
| libcurl-agent/1.0 | | -- | | Average |
| curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.15.3 zlib/1.2.3 libidn/1.18 libssh2/1.4.2 | 7 | Linux | Computer | Average |
| curl/7.15.5 (i686-redhat-linux-gnu) libcurl/7.15.5 OpenSSL/0.9.8b zlib/1.2.3 libidn/0.6.5 | 7 | Linux | Computer | Average |
| curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2 | 7 | Linux | Computer | Average |
| curl/7.21.0 (x86_64-pc-linux-gnu) libcurl/7.21.0 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.15 libssh2/1.2.6 | 7 | Linux | Computer | Average |
| curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.27.1 zlib/1.2.3 libidn/1.18 libssh2/1.4.2 | 7 | Linux | Computer | Average |
| Mozilla/5.0 (compatible; pycurl) | | -- | | Average |
| curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.21 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2 | 7 | Linux | Computer | Average |
| curl/7.69.1 | 7 | -- | | Average |
| curl/7.19.4 (i386-redhat-linux-gnu) libcurl/7.19.4 NSS/3.12.2.0 zlib/1.2.3 libidn/0.6.14 libssh2/0.18 | 7 | Linux | Computer | Average |
| curl/7.45.0-DEV | 7 | -- | | Average |
| curl/7.61.0 | 7 | -- | | Average |
| CURL (Link validity checker) | | -- | | Average |
| PycURL/7.19.7 | 7 | -- | | Average |
| curl/7.21.0 (i486-pc-linux-gnu) libcurl/7.21.0 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.15 libssh2/1.2.6 | 7 | Linux | Computer | Average |
| curl/7.70.0 | 7 | -- | | Average |
| PycURL/7.43.0 libcurl/7.47.0 GnuTLS/3.4.10 zlib/1.2.8 libidn/1.32 librtmp/2.3 | 7 | -- | | Average |
| PycURL/7.43.0.5 libcurl/7.58.0 OpenSSL/1.1.1 zlib/1.2.11 libidn2/2.0.4 libpsl/0.19.1 (+libidn2/2.0.4) nghttp2/1.30.0 librtmp/2.3 | 7 | -- | | Average |
| curl/7.33.0 | 7 | -- | | Average |

coveralls commented 3 years ago

Coverage Status

Coverage remained the same at 100.0% when pulling 0ba75d9d871780b7aa6f70ea818e10c18d6f4351 on ayumi-cloud:curl into da24adc4f4f26002673d236e69b91a10f2fd594c on WhichBrowser:master.