Closed martisch closed 6 years ago
Last time this happened there was a stack trace in the maintner logs (which I think is powering a lot of the data sources here) but I don't have access to the production logs for that service. @adams-sarah ?
+andy
Probably pubsubhelper:
2017/09/12 18:26:17 Client error: read error: EOF
2017/09/12 18:26:18 Client: "AUTH LOGIN\r\n", verhb: "AUTH"
2017/09/12 18:26:18 Client error: read error: EOF
2017/09/12 18:26:19 Client: "AUTH LOGIN\r\n", verhb: "AUTH"
2017/09/12 18:26:19 Client error: read error: EOF
2017/09/12 18:26:19 Client: "AUTH LOGIN\r\n", verhb: "AUTH"
2017/09/12 18:26:19 Client error: read error: EOF
2017/09/12 18:26:20 Client: "AUTH LOGIN\r\n", verhb: "AUTH"
2017/09/12 18:26:20 Client error: read error: EOF
2017/09/12 18:28:15 Client: "AUTH LOGIN\r\n", verhb: "AUTH"
Hm. https://pubsubhelper.golang.org/recent is indicating that new GitHub events are being processed, but not new Gerrit events...
Is the "AUTH LOGIN" coming from Google's MTA or just random spammers from the Internet finding a port 25 open? If the latter, it's a red herring.
But if Google has changed its delivery to now attempt AUTH, that's weird.
FWIW, our SMTP greeting is:
$ telnet pubsubhelper.golang.org 25
Trying 35.184.237.80...
Connected to pubsubhelper.golang.org.
Escape character is '^]'.
220 ESMTP gosmtpd
^]q
Maybe it needs to say something more or less to make clients not try to authenticate?
Or maybe we should just reply to any AUTH request with a success message.
But first I'd like to see if that's actually Google authenticating.
(Btw, I'm not debugging this. I still only have sporadic few-minute chunks of time.)
Change https://golang.org/cl/63350 mentions this issue: cmd/pubsubhelper: log new smtp connections with addr
subrepo commits are appearing in datastore cache, but not go main repo commits since 2d69e9e259ec0f5d5fbeb3498fbd9fed135fe869 yesterday
Found the issue. The bot that is set up to receive the emails had all of its watches deleted from its Gerrit account. I re-added a watch on All-Projects and will follow up with the Gerrit team. Let's see if this fixes the underlying problem.
:tada: :tada: thanks andy
Pubsubhelper can't be the only problem. It's only a wake-up for maintner. Maintner polls Gerrit regularly as a fallback without any pubsubhelper.
So what's the deeper problem?
maintner is seeing the new commits over and over... eg.
EDIT: pasted wrong log entry. will update in ~hr
could it be that something is off with processGerritMutation() (maintner/gerrit.go:483)?
have to run to a mtg but will keep looking at this after.
ok they're back up. i still have no idea what happened. but this seemed related to https://github.com/golang/go/issues/21555 so tried restarting gitmirror b/c i have to leave work now. that worked...
... will look at this more tomorrow.
Copy-pasting my update from #21555 : [The root cause of this issue is that] the builders are set up such that if a commit is pushed to the dashboard, and the dashboard does not recognize the new commit's parent commit SHA, it will reject the new commit. This puts us in a place where, if we lose one commit, we can never recover (until we restart gitmirror).
http://build.golang.org does not show builder runs for the last few CLs that were submitted (over the last ~24 hours), and the GitHub repo does not seem to have caught up on the last few commits that were submitted.