twitter / hadoop-lzo

Refactored version of code.google.com/hadoop-gpl-compression for hadoop 0.20
GNU General Public License v3.0

maven.twttr.com outage - 503 errors - breaks builds of downstream projects #148

Open mmisiewicz-yext opened 3 years ago

mmisiewicz-yext commented 3 years ago

It seems that maven.twttr.com is currently returning 503 when attempting to retrieve artifacts. This prevents any builds which depend on this artifact from succeeding.

Describe the bug: maven.twttr.com returns 503 Service Temporarily Unavailable.

To Reproduce: curl -vvv https://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/hadoop-lzo-0.4.19.pom

Expected behavior: Twitter Maven runs correctly.

Environment: I have been able to reproduce from multiple different ISPs.

Additional context: This outage has been ongoing for at least 24h now.

mattburgess commented 3 years ago

A recurrence of #138 by the looks of it. As mentioned there, it would be really handy if these artifacts were made available on maven central as well as, or instead of, maven.twttr.com.

willnorris commented 3 years ago

Things seem to be back up now. I'll follow up internally to make sure we've got appropriate monitoring on this service as well as see if there's a reason these artifacts aren't also mirrored to maven central.

mattburgess commented 3 years ago

Thanks @willnorris - it appears to be down again this morning.

nenad-spuzic commented 3 years ago

@willnorris It still appears to be down

willnorris commented 3 years ago

hmm... so far I've been unable to catch it while it is down, and I don't see anything in our monitoring about 500s (though I suspect I'm just looking at the wrong monitoring). Are you seeing it down consistently or just intermittently?

mattburgess commented 3 years ago

It's down for me right now. I wonder whether it's geographical? Just in case it is, I happen to be in the UK. I'm not polling it continuously, but when I do hit it it's always been down.
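
The "consistently or just intermittently" question can be answered with data rather than spot checks. Below is a minimal sketch of a poller, not anything used by the people in this thread: the URL is the one from the original report, while the function names, the sample count, and the interval are all illustrative assumptions. A status of 0 is this sketch's own convention for "no HTTP response at all".

```python
import time
import urllib.error
import urllib.request

# URL from the original report; every other name and value here is illustrative.
URL = ("https://maven.twttr.com/com/hadoop/gplcompression/"
       "hadoop-lzo/0.4.19/hadoop-lzo-0.4.19.pom")


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or 0 if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered with an error status, e.g. 503
    except (urllib.error.URLError, OSError):
        return 0  # DNS failure, refused connection, timeout, ...


def classify(statuses):
    """Summarise samples as 'up', 'consistently down' or 'intermittent'."""
    if not statuses:
        raise ValueError("no samples")
    ok = sum(1 for s in statuses if 200 <= s < 300)
    if ok == len(statuses):
        return "up"
    if ok == 0:
        return "consistently down"
    return "intermittent"


def poll(url, samples=30, interval=60.0, fetch=fetch_status, sleep=time.sleep):
    """Collect `samples` status codes, waiting `interval` seconds between requests."""
    statuses = []
    for i in range(samples):
        statuses.append(fetch(url))
        if i < samples - 1:
            sleep(interval)
    return statuses
```

Calling `classify(poll(URL))` from each affected location (UK, Serbia, US) would give comparable summaries. The `fetch` and `sleep` parameters exist only so the loop can be exercised without touching the network.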

nenad-spuzic commented 3 years ago

@willnorris Consistently down. Whenever I load https://maven.twttr.com/ I get a 503 error. I've loaded it at least 30 times now, general time frames:

[Screenshot: Screen Shot 2020-12-04 at 11 17 25]

Additionally, I have 9 different apps in the cloud whose builds are failing because they depend on this repository being online, and it is returning 503 for different artifacts during the build process.

It doesn't seem to be a geographical problem: it does not work for me in Serbia, it does not work over a US proxy, and builds hosted in France that depend on this repository are failing.

I need this repository back online as soon as possible. If you need any additional information from me to start troubleshooting this, please let me know.

mmisiewicz-yext commented 3 years ago

Consistently down here from a US ISP. Here's the output from curl -vvv if that's any help:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 199.16.156.89...
* TCP_NODELAY set
* Connected to maven.twttr.com (199.16.156.89) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [229 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [66 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [2861 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [333 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Twitter, Inc.; OU=atla; CN=maven.twttr.com
*  start date: Feb  6 00:00:00 2020 GMT
*  expire date: Feb  5 12:00:00 2021 GMT
*  subjectAltName: host "maven.twttr.com" matched cert's "maven.twttr.com"
*  issuer: C=US; O=DigiCert Inc; OU=www.digicert.com; CN=DigiCert SHA2 High Assurance Server CA
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fd044808200)
> GET /com/hadoop/gplcompression/hadoop-lzo/0.4.19/hadoop-lzo-0.4.19.pom HTTP/2
> Host: maven.twttr.com
> User-Agent: curl/7.64.1
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS == 4294967295)!
< HTTP/2 503
< cache-control: no-cache, no-store, max-age=0
< content-length: 5670
< content-security-policy: default-src 'none'; img-src https://abs.twimg.com https://ssl.google-analytics.com http://www.google-analytics.com; script-src https://abs.twimg.com https://ssl.google-analytics.com https://ajax.googleapis.com http://www.google-analytics.com about:; style-src https://abs.twimg.com https://fonts.googleapis.com 'unsafe-inline'; font-src https://abs.twimg.com https://twitter.com; connect-src 'none'; object-src 'none'; media-src 'none'; frame-src 'none'; report-uri https://twitter.com/i/csp_report?a=ORTGK%3D%3D%3D&ro=false
< content-type: text/html;charset=utf-8
< date: Fri, 04 Dec 2020 17:05:23 GMT
< server: tsa_b
< strict-transport-security: max-age=631138519
< x-connection-hash: 630d602d603279d7c161aa24f7686abf
< x-response-time: 74
< x-xss-protection: 0
<
{ [4087 bytes data]
100  5670  100  5670    0     0  14575      0 --:--:-- --:--:-- --:--:-- 14575
* Connection #0 to host maven.twttr.com left intact
* Closing connection 0

dig output:

➜  dig maven.twttr.com

; <<>> DiG 9.10.6 <<>> maven.twttr.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29943
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;maven.twttr.com.       IN  A

;; ANSWER SECTION:
maven.twttr.com.    203 IN  CNAME   maven.twitter.com.
maven.twitter.com.  203 IN  A   199.16.156.89

brahma19 commented 3 years ago

This seems to be working from IND / US ISP as of 06/12, 19:05 IST.

willnorris commented 3 years ago

as @brahma19 noted, the immediate problem should be resolved. I'm still working on the longer term solution to try to make sure this doesn't happen again.

pjfanning commented 2 years ago

@willnorris this is happening again today

Would it be possible to publish hadoop-lzo to oss.sonatype.org like most companies do? Jars published to oss.sonatype.org end up being duplicated around many maven repos, most notably https://repo1.maven.org/
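
Checking whether any other repository carries the artifact comes down to requesting the same path under a different base URL, because Maven repositories share one directory layout: group ID with dots turned into slashes, then artifact ID, version, and file name. The sketch below only builds those URLs. The function name is hypothetical, classifiers and snapshot versions are not handled, and nothing here implies that hadoop-lzo is actually present on any mirror.

```python
def artifact_url(repo_base, group_id, artifact_id, version, extension="jar"):
    """Build the URL of an artifact file under the standard Maven repository layout.

    Hypothetical helper for illustration; it ignores classifiers and
    timestamped snapshot file names.
    """
    group_path = group_id.replace(".", "/")
    filename = f"{artifact_id}-{version}.{extension}"
    return f"{repo_base.rstrip('/')}/{group_path}/{artifact_id}/{version}/{filename}"


# Coordinates taken from the build error quoted in this thread.
COORDS = ("com.hadoop.gplcompression", "hadoop-lzo", "0.4.19")

# The repository that is failing, and Maven Central's base URL for comparison.
twitter_url = artifact_url("https://maven.twttr.com", *COORDS)
central_url = artifact_url("https://repo1.maven.org/maven2", *COORDS)
```

Here `twitter_url` reproduces the exact URL from the failed transfer, and `central_url` is where the same coordinates would live on Maven Central if they were ever published there.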

2022-03-23T12:57:08.1776904Z [ERROR] Failed to execute goal on project zeppelin-scalding_2.10: Could not resolve dependencies for project org.apache.zeppelin:zeppelin-scalding_2.10:jar:0.11.0-SNAPSHOT: Could not transfer artifact com.hadoop.gplcompression:hadoop-lzo:jar:0.4.19 from/to twitter (https://maven.twttr.com): transfer failed for https://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.19/hadoop-lzo-0.4.19.jar, status: 503 Service Unavailable -> [Help 1]

willnorris commented 2 years ago

Gah! Thanks for letting me know. I'll get into it today.

willnorris commented 2 years ago

okay, it's back up. Immediate fire extinguished, but I'll be continuing to work on removing the reliance on maven.twttr.com and to just get this (and others) published into maven central.

willnorris commented 2 years ago

Filed an issue with Hadoop project to track discussion of publishing on maven central with the current groupId: https://issues.apache.org/jira/browse/HADOOP-18223

pedroslopez commented 1 year ago

This has been happening once again - maven.twttr.com is down with a 503

curl -vvv https://maven.twttr.com/com/hadoop/gplcompression/hadoop-lzo/0.4.20/hadoop-lzo-0.4.20.pom
*   Trying 199.59.149.208:443...
* Connected to maven.twttr.com (199.59.149.208) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=Twitter, Inc.; CN=maven.twttr.com
*  start date: Nov 14 00:00:00 2022 GMT
*  expire date: Nov 14 23:59:59 2023 GMT
*  subjectAltName: host "maven.twttr.com" matched cert's "maven.twttr.com"
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert TLS RSA SHA256 2020 CA1
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x15b011a00)
> GET /com/hadoop/gplcompression/hadoop-lzo/0.4.20/hadoop-lzo-0.4.20.pom HTTP/2
> Host: maven.twttr.com
> user-agent: curl/7.77.0
> accept: */*
> 
< HTTP/2 503 
< content-length: 91
< content-type: text/plain
< x-connection-hash: 282072b4961e6f43bb64596f79d8f79f8f4e3784c228582ad8222bc439c34a87
< date: Thu, 05 Jan 2023 18:56:29 GMT
< server: tsa_a
< 
* Connection #0 to host maven.twttr.com left intact
upstream connect error or disconnect/reset before headers. reset reason: connection failure

@willnorris was this issue ever resolved?

mmisiewicz-yext commented 1 year ago

Maybe Elon auctioned off the server.

willnorris commented 1 year ago

The general response from the Hadoop project was that this shouldn't be published in Maven Central with the current artifact ID in the com.hadoop namespace. We had begun the work on moving it to a new com.twitter ID and publishing it under that, but it wasn't completed before all of the involved folks left Twitter. I am not optimistic that this would rise to the attention of anyone who is still at the company.

My recommendation would be:

As for getting maven.twttr.com back online... I'm honestly not sure what to tell you. Every single one of the people I would have reached out to was laid off. I don't even know who is left at Twitter or how to get in touch with them. If I recall, it wasn't super hard to get it back online last time (restarting an Apache server maybe? perhaps something related to an SSL cert).

I guess my biggest regret is that I didn't do more to shore up some of this stuff before I left. 😕 🤷🏻

davinchia commented 1 year ago

We managed to compile a version of this jar internally and are considering making this public in our own hosted repository if there is enough interest. Would it be helpful to folks?

willnorris commented 1 year ago

Actually, now that I think more about it... I believe maven.twttr.com was likely only running in the SMF1 data center in Sacramento (that's where a lot of internal tools ran). If reports of Twitter shutting down that data center are true, then it's very possible that's why this went offline, and the chances of getting it back online are rather slim.

saurabh-deochake commented 1 year ago

Hi all, apologies for the inconvenience. We did a configuration change and the site should be up now. I came across this thread and was able to find folks who could help. Please note that I am not active support for this site. But, happy to help check internally. Thanks!

zman0900 commented 6 months ago

Looks like it is dead again with 503 error. Plus the status page https://status.twitterstat.us/ that is linked from that error page has an expired certificate.

vxrl-esimplicity commented 6 months ago

@zman0900 Hello. Any updates on this error? TIA.

rohitdogra99 commented 6 months ago

Hi team, any updates?