snarfed closed this 2 months ago
I think this will be based on MST diff, which I think we have fully built out in arroba, and maybe tested, but it's definitely not actually used yet, much less mature. Hrm.
I implemented this a simpler way, but it's not working. 😕 We're now serving most of these requests ok, but the relay doesn't seem to like the CARs we're serving. We're (theoretically) not including blocks that were originally generated in other repos before `since` (background in https://github.com/snarfed/bridgy-fed/issues/1016#issuecomment-2118374522); maybe that's why? I suspect that's extremely rare, and I kind of doubt we actually have many of those, if any, but I don't know for sure.
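To illustrate the suspected failure mode, here's a minimal sketch with made-up data structures (not arroba's actual API): a naive `since` filter that only emits blocks whose seq is greater than the `since` commit's seq will drop a block that was first written earlier, e.g. one copied in from another repo, even when a newer commit still references it, so the served CAR comes out incomplete.

```python
# Sketch of the suspected bug, with hypothetical data structures
# (not arroba's real storage API): each block has a sequence number,
# and each commit references a set of block CIDs.

def naive_car_blocks(blocks, since_seq):
    """The buggy filter: emit only blocks written after since_seq."""
    return {cid for cid, seq in blocks.items() if seq > since_seq}

def referenced_blocks(commits, since_seq):
    """All blocks that commits after since_seq actually reference."""
    refs = set()
    for commit_seq, cids in commits:
        if commit_seq > since_seq:
            refs |= cids
    return refs

# block 'X' was originally written at seq 3, eg copied from another repo,
# but a newer commit at seq 10 still references it
blocks = {'X': 3, 'Y': 10}
commits = [(10, {'X', 'Y'})]

served = naive_car_blocks(blocks, since_seq=5)
needed = referenced_blocks(commits, since_seq=5)
missing = needed - served
print(missing)  # {'X'}: the relay never sees block X
```

If that's what's happening, the relay would reject the CAR as incomplete even though every post-`since` commit was included.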
https://atproto.tools/records isn't showing any recent records for these repos, which means it's the relay that's getting stuck on them, not the appview.
I'll try to grab someone from the Bluesky team and debug more with them.
Not much luck there yet. A next idea here would be to try serving full `getRepo` responses for these repos specifically, maybe from the router service so they don't get deadlined. If that works, it'll be clear evidence that the `since` implementation is the problem.
Going to try a different tack here and start tracking down invalid blocks and MST nodes I'm emitting. First up, reposts without subject, e.g. bafyreiaefu2zqyyj6zjpdy3wlnor7lujyugf4prirhegnevpxmk5f425sa:

```json
{
  "$type": "app.bsky.feed.repost",
  "createdAt": "2024-07-07T02:41:07.924Z",
  "subject": {
    "cid": "",
    "uri": ""
  }
}
```
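A quick check for records like the one above can be written as a small predicate. This is a hypothetical helper for illustration, not the actual validation in bridgy-fed:

```python
def is_invalid_repost(record):
    """True if an app.bsky.feed.repost record has a blank or missing subject."""
    if record.get('$type') != 'app.bsky.feed.repost':
        return False
    subject = record.get('subject') or {}
    return not subject.get('cid') or not subject.get('uri')

# the broken record above
bad = {
    '$type': 'app.bsky.feed.repost',
    'createdAt': '2024-07-07T02:41:07.924Z',
    'subject': {'cid': '', 'uri': ''},
}
# a well-formed repost, with made-up subject values
good = {
    '$type': 'app.bsky.feed.repost',
    'createdAt': '2024-07-07T02:41:07.924Z',
    'subject': {'cid': 'bafyreib...', 'uri': 'at://did:plc:abc/app.bsky.feed.post/xyz'},
}
print(is_invalid_repost(bad), is_invalid_repost(good))  # True False
```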
714c317dd473489b21ce38ea2117a960a791c4b7 is looking promising, haven't seen any more of those blank subject reposts since that was deployed. 🤞
Trying yet another new `since` implementation in snarfed/arroba@57210738a78b9bb08cf29f3f1a23607cc3168e1f. Looking good so far, serving somewhat faster and cheaper. Still hasn't unstuck any of the handful of stuck repos I'm watching here, though.
The new `since` implementation looks good. Not a huge win, but still a win, and more importantly it fixed a number of the smaller stuck repos. As for the small handful of bigger stuck repos, e.g. https://bsky.app/profile/breakingnews.newsmast.community.ap.brid.gy, I had to reset them manually, but they're up and running again now too.
Here are my raw notes from the two ways I tried to fix the bigger stuck repos:
```python
# first, delete DNS record. then:
a = ActivityPub.get_by_id('https://newsmast.community/users/uspolitics')
a.enabled_protocols = []
a.copies = []
a.put()
a.obj.copies = []
a.obj.put()
a.enable_protocol(ATProto)

import arroba.server
arroba.server.storage.tombstone_repo(arroba.server.storage.load_repo('did:plc:...'))
```
```python
from arroba.repo import Write
from arroba.storage import Action

did = 'did:plc:...'
repo = AtpRepo.get_by_id(did)

AtpBlock.query(AtpBlock.repo == repo.key, AtpBlock.ops.action == 'create').count()
blocks = AtpBlock.query(AtpBlock.repo == repo.key, AtpBlock.ops.action == 'create').fetch()
start = AtpBlock.query(AtpBlock.repo == repo.key, AtpBlock.ops.action == 'create',
                       AtpBlock.ops.path == 'app.bsky.feed.post/3kwi6acz2q7a2').get()
start.seq
AtpBlock.query(AtpBlock.repo == repo.key, AtpBlock.seq > start.seq).count()
blocks = AtpBlock.query(AtpBlock.repo == repo.key, AtpBlock.seq > start.seq).fetch()

# collect rkeys of reposts that were created and not later deleted
rkeys = set()
for block in sorted(blocks, key=lambda b: b.seq):
    for op in block.ops:
        coll, rkey = op.path.split('/')
        if coll == 'app.bsky.feed.repost':
            if op.action == 'create':
                rkeys.add(rkey)
            elif op.action == 'delete':
                rkeys.discard(rkey)

import arroba.server
r = arroba.server.storage.load_repo(did)

# takes minutes or longer on big repos
contents = r.get_contents()
# r.mst.list(after='app.bsky.feed.repost', before='app.bsky.feed.reposu')

rkeys = [rkey for rkey, record in contents['app.bsky.feed.repost'].items()
         if not record['subject'].get('cid') or not record['subject'].get('uri')]
writes = [Write(action=Action.DELETE, collection='app.bsky.feed.repost', rkey=rkey)
          for rkey in rkeys]

# batched() polyfill; itertools.batched was only added in Python 3.12
from itertools import islice

def batched(iterable, n):
    if n < 1:
        raise ValueError('n must be at least one')
    iterator = iter(iterable)
    while batch := tuple(islice(iterator, n)):
        yield batch

# takes minutes or longer
for batch in batched(writes, 500):
    r.apply_writes(batch)
```
I also reset:
I'd like to do these too, but they actually have followers, and I can't switch those on my own since they're in the followers' repos, which I can't write to. 😕
Right now we have a few big repos that `getRepo` runs out of memory on and crashes, because it ignores `since` and tries to load them fully into memory first. Should be straightforward. (For now, snarfed/arroba@8d043690c0315e59d4b301fdf3763b2ccd3c8268 stopped the bleeding.)
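One way to avoid loading the whole repo is to stream blocks in seq order through a generator, fetching a bounded batch at a time. This is a sketch against a hypothetical storage interface (`read_blocks` and the `Block` type are made up for illustration, not arroba's actual API):

```python
from dataclasses import dataclass

@dataclass
class Block:
    seq: int
    data: bytes

class FakeStorage:
    """Stand-in for a datastore-backed block store (hypothetical interface)."""
    def __init__(self, blocks):
        self.blocks = sorted(blocks, key=lambda b: b.seq)

    def read_blocks(self, did, after_seq, limit):
        matches = [b for b in self.blocks if b.seq > after_seq]
        return matches[:limit]

def read_blocks_since(storage, did, since_seq, batch_size=500):
    """Yield blocks with seq > since_seq in fixed-size batches, so a
    getRepo response can stream without holding the whole repo in memory."""
    cursor = since_seq
    while True:
        batch = storage.read_blocks(did, after_seq=cursor, limit=batch_size)
        if not batch:
            return
        yield from batch
        cursor = batch[-1].seq

storage = FakeStorage([Block(seq=i, data=b'') for i in range(1, 11)])
seqs = [b.seq for b in
        read_blocks_since(storage, 'did:plc:example', since_seq=4, batch_size=3)]
print(seqs)  # [5, 6, 7, 8, 9, 10]
```

Peak memory then scales with `batch_size` rather than repo size, which is the kind of bleeding-stopper a commit like 8d04369 would aim for.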