Closed amimas closed 5 years ago
Running into similar ECONNRESET errors for some other external links too.
34:28 error https://jcp.org/en/jsr/detail?id=330 is dead. (request to https://jcp.org/en/jsr/detail?id=330 failed, reason: read ECONNRESET) no-dead-link
83:198 error https://www.osgi.org/developer/specifications/ is dead. (request to https://www.osgi.org/developer/specifications/ failed, reason: read ECONNRESET) no-dead-link
26:42 error https://jax-rs-spec.java.net/ is dead. (request to https://jax-rs-spec.java.net/ failed, reason: read ECONNRESET) no-dead-link
Sometimes the above errors only appear on my local machine and not in the CI pipeline. Sometimes they appear in both.
I just realized the same link gets reported as invalid with a different reason. For example:
34:28 error https://jcp.org/en/jsr/detail?id=330 is dead. (request to https://jcp.org/en/jsr/detail?id=330 failed, reason: connect ECONNRESET 137.254.60.38:443) no-dead-link
This shows it's failing with reason: connect ECONNRESET, while the exact same link mentioned in the previous comment had reason: read ECONNRESET.
I'm also getting ECONNREFUSED errors from time to time:
7:22 error http://tools.ietf.org/html/rfc6749 is dead. (request to http://tools.ietf.org/html/rfc6749 failed, reason: connect ECONNREFUSED 64.170.98.42:80) no-dead-link
16:140 error http://tools.ietf.org/html/rfc6749#page-10 is dead. (request to http://tools.ietf.org/html/rfc6749#page-10 failed, reason: connect ECONNREFUSED 64.170.98.42:80) no-dead-link
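Since the same link fails with different reasons from run to run, it may help to group the reports by error code before deciding what to ignore. Below is a minimal sketch, not part of the rule itself: parseDeadLinkLine and the POSSIBLY_TRANSIENT list are hypothetical names chosen for illustration, and the regular expression only assumes the output format pasted in this thread.

```javascript
// Error codes treated as "possibly transient" here. This list is an assumption,
// based only on the codes that show up intermittently in this thread.
const POSSIBLY_TRANSIENT = new Set(["ECONNRESET", "ECONNREFUSED", "ETIMEDOUT"]);

// Parse one line of no-dead-link output in the format pasted above.
// Returns null when the line does not look like a dead-link report.
function parseDeadLinkLine(line) {
  const match =
    /^(\d+):(\d+)\s+error\s+(\S+) is dead\. \((.*)\)\s+no-dead-link\s*$/.exec(line);
  if (!match) return null;
  const [, lineNo, column, url, detail] = match;

  // Pick up codes such as "read ECONNRESET" or "connect ECONNREFUSED 1.2.3.4:80".
  const codeMatch = /reason: (?:\w+ )?(E[A-Z]+)\b/.exec(detail);
  const code = codeMatch ? codeMatch[1] : null;

  return {
    line: Number(lineNo),
    column: Number(column),
    url,
    detail,
    code,
    possiblyTransient: code !== null && POSSIBLY_TRANSIENT.has(code),
  };
}
```

Reports with a status-style detail such as "301 Moved Permanently" come back with code set to null, so they can be handled separately from socket-level failures.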
Thanks for the report.
curl -H 'User-Agent:' -H 'Accept:' -H 'Host:' -v https://dev.mysql.com/downloads/mysql/
* Trying 137.254.60.11...
* TCP_NODELAY set
* Connected to dev.mysql.com (137.254.60.11) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
* CAfile: /etc/ssl/cert.pem
CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
* subject: C=US; ST=California; L=Redwood City; O=Oracle Corporation; OU=Production Engineering and Operation; CN=www.mysql.com
* start date: Jan 25 00:00:00 2019 GMT
* expire date: Mar 25 12:00:00 2020 GMT
* subjectAltName: host "dev.mysql.com" matched cert's "dev.mysql.com"
* issuer: C=US; O=DigiCert Inc; CN=DigiCert SHA2 Secure Server CA
* SSL certificate verify ok.
> GET /downloads/mysql/ HTTP/1.1
>
< HTTP/1.1 400 Bad Request
< Date: Sun, 23 Jun 2019 03:45:28 GMT
< Server: Apache
< X-Frame-Options: SAMEORIGIN
< Content-Length: 226
< Connection: close
< Content-Type: text/html; charset=iso-8859-1
<
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
</head><body>
<h1>Bad Request</h1>
<p>Your browser sent a request that this server could not understand.<br />
</p>
</body></html>
* Closing connection 0
* TLSv1.2 (OUT), TLS alert, Client hello (1):
mysql.com probably refuses requests that have no User-Agent header.
Should we add default User-Agent and Accept headers?
I've added default User-Agent and Accept headers: https://github.com/textlint-rule/textlint-rule-no-dead-link/pull/116 However, the test also passes without the User-Agent and Accept headers, so there may be another reason for this issue.
Release https://github.com/textlint-rule/textlint-rule-no-dead-link/releases/tag/4.4.2
@amimas Can you try it again?
Thanks @azu. I will try it out. You're right that there could be other reasons as well. Some of the links have been getting reported as invalid inconsistently.
I just got the chance to try it out. It's probably working for the ECONNRESET related errors. The following error disappeared after I updated to 4.4.2:
83:198 error https://www.osgi.org/developer/specifications/ is dead. (request to https://www.osgi.org/developer/specifications/ failed, reason: socket hang up) no-dead-link
Update: The above error was probably a one-time issue. I didn't get that error before. So far I haven't seen the ECONNRESET error again. Will continue to run the tests.
But I'm still getting ECONNREFUSED errors from these two links:
7:22 error https://tools.ietf.org/html/rfc6749 is dead. (request to https://tools.ietf.org/html/rfc6749 failed, reason: connect ECONNREFUSED 64.170.98.42:443) no-dead-link
16:140 error https://tools.ietf.org/html/rfc6749#page-10 is dead. (request to https://tools.ietf.org/html/rfc6749#page-10 failed, reason: connect ECONNREFUSED 64.170.98.42:443) no-dead-link
I figured out that ietf.org requires a Host: header.
curl -I -H 'User-Agent: a' -H 'Host:' -H 'Accept:' -v https://tools.ietf.org/html/rfc6749
> HEAD /html/rfc6749 HTTP/1.1
> User-Agent: a
>
< HTTP/1.1 400 Bad Request
HTTP/1.1 400 Bad Request
Same host
curl -I -H 'User-Agent: a' -H 'Host: tools.ietf.org' -H 'Accept:' -v https://tools.ietf.org/html/rfc6749
> HEAD /html/rfc6749 HTTP/1.1
> Host: tools.ietf.org
> User-Agent: a
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
That's a good finding @azu
I did some further tests based on your findings and, as expected, if I set the User-Agent to what the Chrome browser sends, I get a 200 OK response from this host. For example:
$ curl -I -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36' -H 'Accept:' -v https://tools.ietf.org/html/rfc6749
* Trying 64.170.98.42...
* TCP_NODELAY set
* Connected to tools.ietf.org (64.170.98.42) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
* CAfile: /etc/ssl/cert.pem
CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
* subject: OU=Domain Control Validated; CN=*.tools.ietf.org
* start date: Oct 1 17:24:13 2018 GMT
* expire date: Nov 30 23:34:19 2019 GMT
* subjectAltName: host "tools.ietf.org" matched cert's "tools.ietf.org"
* issuer: C=US; ST=Arizona; L=Scottsdale; O=Starfield Technologies, Inc.; OU=http://certs.starfieldtech.com/repository/; CN=Starfield Secure Certificate Authority - G2
* SSL certificate verify ok.
> HEAD /html/rfc6749 HTTP/1.1
> Host: tools.ietf.org
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36
>
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
...
These also worked:
curl -I -H 'User-Agent: Chrome' -H 'Accept:' -v https://tools.ietf.org/html/rfc6749
curl -I -H 'User-Agent: Firefox' -H 'Accept:' -v https://tools.ietf.org/html/rfc6749
I think we need the User-Agent value to be a configurable option, along with a more sensible default. Right now, after the latest update in 4.4.2, the User-Agent value is set to this rule's name, but it seems some web servers don't like it.
headers: {
'User-Agent': 'textlint-rule-no-dead-link/1.0',
'Accept': '*/*'
},
I am trying to decide whether we need a global User-Agent configuration that applies to all URLs, or a per-domain/per-URL User-Agent configuration.
What are your thoughts?
curl -I -H 'Host:' -H 'User-Agent: Chrome' -H 'Accept:' -v https://tools.ietf.org/html/rfc6749
tools.ietf.org returns 400 if Host is empty. The UA is not related to that 400.
@azu - Is it possible to release the latest fix? I can try it out and see if it fixes all of those scenarios discussed above.
It may be fixed now. Please tell me if you find a broken case.
Thanks
@azu - You haven't released your latest PR #117 yet. If you can release that as version 4.4.3, I can test it out. Or please let me know if there's another way I can test it before you release it.
Oh, sorry. https://github.com/textlint-rule/textlint-rule-no-dead-link/releases/tag/4.4.3
@azu - Thanks for releasing that, but unfortunately it didn't seem to help. I'm still getting ECONNREFUSED or ECONNRESET errors from the same links as before:
61:34 error https://dev.mysql.com/downloads/mysql/ is dead. (request to https://dev.mysql.com/downloads/mysql/ failed, reason: read ECONNRESET) no-dead-link
7:22 error https://tools.ietf.org/html/rfc6749 is dead. (request to https://tools.ietf.org/html/rfc6749 failed, reason: connect ECONNREFUSED 64.170.98.42:443) no-dead-link
16:140 error https://tools.ietf.org/html/rfc6749#page-10 is dead. (request to https://tools.ietf.org/html/rfc6749#page-10 failed, reason: connect ECONNREFUSED 64.170.98.42:443) no-dead-link
On top of that, the latest change in the Host value is causing validation of a lot of other external sites to fail with a Hostname/IP does not match certificate's altnames error. Below are a couple of examples:
53:545 error http://bugs.sun.com/view_bug.do?bug_id=6570259 is dead. (request to https://bugs.java.com/view_bug.do?bug_id=6570259 failed, reason: Hostname/IP does not match certificate's altnames: Host: bugs.sun.com. is not in the cert's altnames: DNS:bugs.java.com) no-dead-link
26:34 error http://wiki.osgi.org/wiki/Blueprint is dead. (request to https://www.osgi.org/community/wiki/wiki/Blueprint failed, reason: Hostname/IP does not match certificate's altnames: Host: wiki.osgi.org. is not in the cert's altnames: DNS:*.wpengine.com, DNS:wpengine.com) no-dead-link
In addition, I'm getting a lot of maximum redirect reached errors. Here are some examples:
7:51 error http://quartz-scheduler.org/ is dead. (maximum redirect reached at: http://www.quartz-scheduler.org/) no-dead-link
128:76 error http://static.springsource.org/spring/docs/2.0.x/api/org/springframework/scheduling/concurrent/ThreadPoolTaskExecutor.html is dead. (maximum redirect reached at: http://docs.spring.io/spring/docs/2.0.x/api/org/springframework/scheduling/concurrent/ThreadPoolTaskExecutor.html) no-dead-link
The latest change in Host is definitely not fixing the original issue and is causing a new issue with certificate validation. I'm not sure yet why the maximum redirect issues are being reported now. I suggest you revert the changes in the last two releases, as 4.4.1 is still the more stable release.
In the meantime, I think we need to continue to investigate this issue. Please re-open this ticket.
I have been looking into this in more detail. I think the latest release (4.4.3) is trying to do the right thing and the Hostname/IP does not match certificate's altnames error is valid. It is caused by really old links that should be replaced with appropriate valid links.
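For reference, the error text earlier in the thread shows the original host (bugs.sun.com) being checked against the certificate of the redirect target (bugs.java.com). One possible reading is that a fixed Host value is carried across redirects, but the rule's actual code is not shown in this thread, so treat that as an assumption. The sketch below only illustrates the alternative of deriving Host from each hop's URL. resolveRedirects and the injected respond callback are hypothetical, and the lookup is simulated rather than doing real network I/O.

```javascript
// Follow a redirect chain, deriving the Host header from the URL of each hop
// instead of reusing the host of the first URL.
// `respond(url, headers)` is an injected stand-in for a real HTTP request and
// must return { status, location? }. It is an assumption for illustration.
function resolveRedirects(startUrl, respond, maxRedirects = 5) {
  let current = new URL(startUrl);

  for (let hop = 0; hop <= maxRedirects; hop++) {
    const headers = { Host: current.host }; // recomputed on every hop
    const res = respond(current.href, headers);

    const isRedirect = res.status >= 300 && res.status < 400 && Boolean(res.location);
    if (!isRedirect) {
      return { url: current.href, status: res.status, hops: hop };
    }
    // Location may be relative, so resolve it against the current URL.
    current = new URL(res.location, current);
  }

  throw new Error(`maximum redirect reached at: ${current.href}`);
}
```

Most HTTP clients already derive Host from the request URL when it is not overridden, so in practice this mostly means not forcing a Host value that was computed from the original link.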
Unfortunately, I can't yet see why the following errors are appearing even though the ignoreRedirect option is set to true in the rule's configuration:
503:78 error http://java.sun.com/javaee/5/docs/api/index.html?javax/persistence/EntityManager.html is dead. (301 Moved Permanently) no-dead-link
30:40 error http://static.springsource.org/spring/docs/3.0.0.RC3/reference/html/ch05s07.html is dead. (maximum redirect reached at: http://docs.spring.io/spring/docs/3.0.0.RC3/reference/html/ch05s07.html) no-dead-link
36:273 error http://www.eaipatterns.com/StoreInLibrary.html is dead. (302 Found) no-dead-link
All of these links are automatically redirected to a newer link, but ignoreRedirect does not seem to work in these cases.
I will open a separate issue regarding the ECONNREFUSED error.
ignoreRedirect**s**?
You're right, the configuration option is plural (ignoreRedirects). I just double-checked my .textlintrc.json and verified that I am using the correct option name. Those redirect-related errors are still appearing.
I have the following snippet in my markdown file:
This rule keeps failing with the following error message:
The link, https://dev.mysql.com/downloads/mysql/, is valid, but I'm not sure why the rule is unable to verify it. When I try curl -v https://dev.mysql.com/downloads/mysql/ from my terminal, I get the following headers plus the site's HTML response:
I am getting a 200 OK response from curl, and obviously I can access the same link from my browser, but I can't actually ping that domain. For example, ping dev.mysql.com comes back with 0 successes. I think the server is configured not to respond to pings, because I tried this external ping utility site, https://www.ipaddressguide.com/ping, and that also comes back with 100% failures.
Not sure if this is a bug within the linter or if there's room for improvement. Right now my only option is to add that domain to the ignore option of this rule.
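For anyone landing here, the workaround could look roughly like the sketch below. It is written as a JavaScript object that mirrors what would go into .textlintrc.json. The ignore and ignoreRedirects option names are the ones mentioned in this thread. Listing the exact URL rather than a domain pattern is an assumption made to stay on the safe side, since pattern support is not discussed here.

```javascript
// Sketch of a textlint configuration, mirroring the contents of .textlintrc.json.
// The variable name `textlintConfig` is hypothetical.
const textlintConfig = {
  rules: {
    "no-dead-link": {
      // Skip the link that cannot be verified reliably.
      ignore: ["https://dev.mysql.com/downloads/mysql/"],
      // Note the plural option name, as pointed out in this thread.
      ignoreRedirects: true,
    },
  },
};

// Export only when a CommonJS `module` object is available.
if (typeof module !== "undefined") {
  module.exports = textlintConfig;
}
```

The same keys and values can be copied into the "rules" section of a .textlintrc.json file.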