Open sypets opened 3 years ago
Analysis of some URLs which are causing problems.
Currently brofix sends the following HTTP headers (see TSconfig):
User-Agent: configurable
Accept: */*
Accept-Language: *
Accept-Encoding: *
It looks like the Accept-Language / Accept-Encoding may be causing problems in some cases.
It is possible to simulate this with curl:
curl -IL -H "Accept-Language: *" -H "Accept-Encoding: *" URL
curl sends these headers (by default):
curl -ILv URL
HEAD /pages/de/news411455 HTTP/2
Host: idw-online.de
user-agent: curl/7.68.0
accept: */*
Be sure to add -L to follow redirects.
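To compare the two header sets side by side, the two curl calls above can be wrapped in a small script. This is only a sketch and not part of brofix: the function names `head_status` and `compare_headers` are hypothetical, and the script prints nothing but the final HTTP status code for each variant.

```shell
#!/bin/sh
# Print the final HTTP status code of a HEAD request (following redirects).
# Extra arguments after the URL are passed through to curl.
head_status() {
    url=$1
    shift
    # -s: silent, -o /dev/null: discard the response headers,
    # -I: HEAD request, -L: follow redirects,
    # -w '%{http_code}': print only the status code of the last response
    curl -s -o /dev/null -I -L -w '%{http_code}' "$@" "$url"
}

# Compare curl's default headers with the headers brofix currently sends.
compare_headers() {
    url=$1
    default=$(head_status "$url")
    brofix=$(head_status "$url" -H 'Accept-Language: *' -H 'Accept-Encoding: *')
    echo "curl defaults: $default"
    echo "brofix-like:   $brofix"
}
```

Calling `compare_headers URL` for a problematic URL should show whether the two header sets lead to different status codes. A result of 000 means curl could not complete the request at all (for example because of a TLS error).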
Example:
curl -I "https://www.ylook.de/search.php?&linklist_idx=11116563"
curl: (60) SSL certificate problem: unable to get local issuer certificate
This could be done with an extra tool but should not be implemented in brofix.
That is a good start, but instead of extending the client, I suggest creating an event subscriber that can work for both synchronous and asynchronous requests.
Use custom CA bundle
The same error code (curl 60) may also result from more severe TLS / certificate issues.
The error code is the same, but the error message (in command line curl) differs:
certificate has expired
self-signed certificate
Example:
curl -I "https://klimakongressoldenburg.de"
curl: (60) SSL certificate problem: self-signed certificate
Certificate has "Certificate Name mismatch", see Qualys SSL Labs
curl -I https://openjournal.uni-oldenburg.de/
curl: (60) SSL certificate problem: certificate has expired
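Since the exit code alone does not distinguish these cases, an external check could look at the message text instead. The following is a rough sketch under that assumption: the function name `classify_curl60` and the category labels are made up for illustration, and the exact wording of the messages depends on the curl / TLS library version (older versions print "self signed certificate" without the hyphen, so both spellings are matched).

```shell
#!/bin/sh
# Classify the message curl prints together with exit code 60.
# The grouping is an assumption for illustration, not brofix behavior.
classify_curl60() {
    case "$1" in
        *"unable to get local issuer certificate"*)
            # possibly a missing intermediate certificate on the checked
            # server, or a CA that is unknown on the client
            echo "chain-incomplete" ;;
        *"certificate has expired"*)
            echo "expired" ;;
        *"self-signed certificate"*|*"self signed certificate"*)
            echo "self-signed" ;;
        *)
            echo "unknown" ;;
    esac
}

# Possible usage (captures only curl's error output):
#   msg=$(curl -sS -I "$url" 2>&1 >/dev/null)
#   classify_curl60 "$msg"
```

Only the first category is a candidate for being treated as a false positive. The other two are real certificate problems.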
guzzle
other
<tr>
<td align="center" valign="middle">
<div class="cf-browser-verification cf-im-under-attack">
<noscript>
<h1 data-translate="turn_on_js" style="color:#bd2426;">Please turn JavaScript on and reload the page.</h1>
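A response like the fragment above could be recognized by looking for the class names it contains. This is a minimal sketch, assuming the markers `cf-browser-verification` and `cf-im-under-attack` keep appearing in the challenge page. Cloudflare may change them at any time, and the function name `is_cloudflare_challenge` is hypothetical.

```shell
#!/bin/sh
# Read a response body on stdin and succeed (exit status 0) if it looks
# like the Cloudflare JavaScript challenge page quoted above.
is_cloudflare_challenge() {
    grep -q -e 'cf-browser-verification' -e 'cf-im-under-attack'
}

# Possible usage:
#   curl -sL "$url" | is_cloudflare_challenge && echo "Cloudflare challenge"
```

Such a URL cannot be verified without a JavaScript-capable client, so it is better reported as "could not be checked" than as broken.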
todos
summary
So far, the following reasons for false positives could be verified:
certificate chain issue: this is actually an error on the side of the web server being checked, but it is a minor one. The page loads without a warning in the browser, so the user perceives (!) the link as not broken. This should be distinguished from other TLS security issues, such as an expired certificate.
the error is usually curl 60 (can be verified by running curl -I "url" on the server)
SSLLabs shows "chain issues: incomplete" and "extra download"
to fix on the server side: put the complete certificate chain in the certificate (including intermediate certificates)
to fix on the client side (the server where brofix is running): download the intermediate certificates
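The client-side fix can be tried out with curl before changing anything in the PHP setup. The sketch below assumes the missing intermediate certificate has already been downloaded in PEM format. The function names and all file paths are placeholders, and `--cacert` makes curl use the given file as its CA bundle.

```shell
#!/bin/sh
# Build a CA bundle from several PEM files, e.g. the system CA bundle plus
# the downloaded intermediate certificates. All paths come from the caller.
build_ca_bundle() {
    out=$1
    shift
    cat "$@" > "$out"
}

# Repeat the HEAD request, verifying the server against the given bundle.
check_with_bundle() {
    curl -sS -I --cacert "$1" "$2"
}

# Possible usage (paths are assumptions, adjust to your system):
#   build_ca_bundle /tmp/custom-ca.pem /etc/ssl/certs/ca-certificates.crt intermediate.pem
#   check_with_bundle /tmp/custom-ca.pem "https://example.org/"
```

If the request succeeds with the extended bundle but fails with curl 60 without it, that supports the diagnosis of an incomplete chain on the checked server.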
cloudflare
problem description
some URLs are reported as errors even though they work (in browser)
Examples:
other
Apart from this, all 401 and 403 responses (access-restricted URLs) will fail. In that case, it is not really an error, but expected. These URLs could either be added as exclude link target entries, or we could make external link type errors configurable (e.g. have an exclude list for that as well, where you could exclude for example 401, 403, maybe also "too many redirects").
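The idea of an exclude list for error types could look roughly like this. It is only an illustration of the concept in shell, not a proposal for the actual brofix configuration: the variable `EXCLUDED_STATUS` and the function `is_excluded_status` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical exclude list: HTTP status codes that should not be reported
# as broken links (space-separated).
EXCLUDED_STATUS="401 403"

# Succeed (exit status 0) if the given status code is on the exclude list.
is_excluded_status() {
    case " $EXCLUDED_STATUS " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage:
#   is_excluded_status "$status" || echo "broken link: $url ($status)"
```

Keeping the list configurable lets each site decide whether access-restricted targets count as broken.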
see also: https://notes.typo3.org/linkvalidator_problem_external_urls
Related:
34