nanos / FediFetcher

FediFetcher is a tool for Mastodon that automatically fetches missing replies and posts from other fediverse instances, and adds them to your own Mastodon instance.
https://blog.thms.uk/fedifetcher?utm_source=github
MIT License

Error while running fedifetcher: "Error getting user ID" on certain URLs and "KeyError: 'url'" #44

Closed b2cc closed 1 year ago

b2cc commented 1 year ago

Hello!

Lately I have observed the following errors while running fedifetcher:

The URLs it complains about work for me in the browser.

2023-04-29 19:30:52.973901 CEST: Error parsing toot URL https://digitalcourage.video/videos/watch/80110bb9-2048-44cc-8a77-0f0b42da900e
2023-04-29 19:30:54.483262 CEST: Error parsing toot URL https://digitalcourage.video/videos/watch/d6a27ca1-9fde-4821-9b8a-ae25c8515609
2023-04-29 19:30:59.137947 CEST: Error getting context for toot None. Status code: 401
2023-04-29 19:32:11.349357 CEST: Error parsing toot URL https://anonsys.net/display/bf69967c-6664-495d-6e3a-a16077737418
2023-04-29 19:32:56.704193 CEST: Error getting user ID for user digitalcourage.de@digitalcourage.video: User digitalcourage.de was not found on server digitalcourage.video/accounts.
2023-04-29 19:32:56.858679 CEST: Error getting user ID for user bba@digitalcourage.video: User bba was not found on server digitalcourage.video/video-channels.
2023-04-29 19:32:56.995373 CEST: Error getting user ID for user digitalcourage.de@digitalcourage.video: User digitalcourage.de was not found on server digitalcourage.video/accounts.
2023-04-29 19:32:57.550275 CEST: Error getting user ID for user amy@types.pl: Error getting URL https://types.pl/api/v1/accounts/lookup?acct=amy. Status code: 401
2023-04-29 19:32:58.050662 CEST: Error getting user ID for user hypolite@friendica.mrpetovan.com: Expecting value: line 1 column 1 (char 0)
2023-04-29 19:32:58.311605 CEST: Error getting user ID for user transitionhausbt@venera.social: Expecting value: line 1 column 1 (char 0)
2023-04-29 19:50:20.165598 CEST: Job failed after 0:02:20.695650.
Traceback (most recent call last):
  File "/app/find_posts.py", line 878, in <module>
    add_user_posts(arguments.server, token, filter_known_users(mentioned_users, all_known_users), recently_checked_users, all_known_users, seen_urls)
  File "/app/find_posts.py", line 75, in add_user_posts
    if post['reblog'] == None and post['url'] != None and post['url'] not in seen_urls:
                                  ~~~~^^^^^^^
KeyError: 'url'

Could you take a look?

Thanks as always for your work!

nanos commented 1 year ago

Hi @b2cc

Thanks for that. The 'Error parsing toot URL', 'Error getting context for toot None', and 'Error getting user ID' errors all occur because these toots and users come from unsupported software such as PeerTube.

I'd be happy to accept PRs that implement support for further software, but at the moment these toots and users are simply skipped.
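To illustrate what "simply skipped" means here, the following is a rough sketch and not FediFetcher's actual code. The pattern `MASTODON_URL` and the helper `parse_toot_url` are hypothetical names made up for this example, and the sketch assumes only Mastodon-style `https://<server>/@<user>/<id>` URLs are recognised. Anything else, such as a PeerTube video URL, yields `None`, and the caller logs it and moves on.

```python
import re

# Hypothetical pattern for Mastodon-style status URLs:
# https://<server>/@<user>/<numeric id>
MASTODON_URL = re.compile(
    r"^https://(?P<server>[^/]+)/@(?P<user>[^/]+)/(?P<toot_id>\d+)$"
)


def parse_toot_url(url):
    """Return (server, toot_id) for a recognised URL, or None otherwise."""
    match = MASTODON_URL.match(url)
    if match is None:
        return None
    return match.group("server"), match.group("toot_id")


urls = [
    "https://hachyderm.io/@nova/110269220095556570",
    "https://digitalcourage.video/videos/watch/80110bb9-2048-44cc-8a77-0f0b42da900e",
]

for url in urls:
    parsed = parse_toot_url(url)
    if parsed is None:
        # Unsupported software: report it and skip instead of failing the run.
        print(f"Error parsing toot URL {url}")
        continue
    print(f"Parsed {parsed}")
```

Supporting further software would then roughly amount to adding more patterns of this kind, one per URL scheme.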

This is more unexpected though:

2023-04-29 19:50:20.165598 CEST: Job failed after 0:02:20.695650.

I do feel that there is crucial information missing, though. Are you able to upload a full log, please? I can then look into it.

b2cc commented 1 year ago

Hi,

I understand. Let's ignore the PeerTube errors then.

Here is a failed log from a current run:

2023-04-30 14:45:10.224214 CEST: Starting FediFetcher
2023-04-30 14:45:12.210999 CEST: Found active user: esperantocollective_vienna
2023-04-30 14:45:12.842749 CEST: Found active user: roland_w
2023-04-30 14:45:13.407944 CEST: Found active user: gunklbot
2023-04-30 14:45:14.222352 CEST: Found 0 reply toots
2023-04-30 14:45:14.222447 CEST: Found 0 known context toots
2023-04-30 14:45:14.222534 CEST: Added 0 new context toots (with 0 failures)
2023-04-30 14:45:25.230835 CEST: Found 260 toots in timeline
2023-04-30 14:45:25.535041 CEST: Got context for toot None
2023-04-30 14:45:25.824994 CEST: Got context for toot None
2023-04-30 14:45:26.006872 CEST: Got context for toot None
2023-04-30 14:45:26.236387 CEST: Got context for toot https://hachyderm.io/@Martindotnet/110287584615058310
2023-04-30 14:45:26.515591 CEST: Got context for toot None
2023-04-30 14:45:26.673573 CEST: Got context for toot https://aut.social/@bernhard/110287413319610987
2023-04-30 14:45:26.875856 CEST: Got context for toot https://die-partei.social/@rockpapershotgun/110287297074604228
2023-04-30 14:45:27.092613 CEST: Got context for toot https://mastodon.social/@lenzgr/110287267707069042
2023-04-30 14:45:27.765308 CEST: Got context for toot None
2023-04-30 14:45:28.299109 CEST: Got context for toot None
2023-04-30 14:45:28.656461 CEST: Got context for toot None
2023-04-30 14:45:28.914532 CEST: Got context for toot None
2023-04-30 14:45:29.161544 CEST: Got context for toot None
2023-04-30 14:45:29.393401 CEST: Got context for toot None
2023-04-30 14:45:29.616969 CEST: Got context for toot None
2023-04-30 14:45:29.879347 CEST: Got context for toot None
2023-04-30 14:45:30.134010 CEST: Got context for toot None
2023-04-30 14:45:30.303378 CEST: Got context for toot None
2023-04-30 14:45:30.693247 CEST: Got context for toot https://cloud-native.social/@saiyam/110287136504862220
2023-04-30 14:45:30.869777 CEST: Got context for toot https://hachyderm.io/@zatomas/110287029126645805
2023-04-30 14:45:30.986809 CEST: Got context for toot https://aut.social/@bernhard/110286907898220214
2023-04-30 14:45:31.815068 CEST: Discovered redirect for URL https://social.kernel.org/objects/a12b8afa-e6d4-4728-a883-29e27bfe1515
2023-04-30 14:45:32.630134 CEST: Got context for toot None
2023-04-30 14:45:33.178963 CEST: Got context for toot None
2023-04-30 14:45:33.352700 CEST: Got context for toot None
2023-04-30 14:45:33.576993 CEST: Got context for toot https://infosec.exchange/@letsencrypt/110284856209495778
2023-04-30 14:45:33.761074 CEST: Got context for toot https://mastodon.social/@effinbirds/110284851305168069
2023-04-30 14:45:33.926131 CEST: Got context for toot https://die-partei.social/@rockpapershotgun/110284289013744298
2023-04-30 14:45:34.228388 CEST: Got context for toot https://mastodon.social/@effinbirds/110284001984000301
2023-04-30 14:45:35.019812 CEST: Got context for toot None
2023-04-30 14:45:35.218117 CEST: Got context for toot https://mastodon.social/@effinbirds/110283219579638181
2023-04-30 14:45:35.488838 CEST: Got context for toot None
2023-04-30 14:45:35.952581 CEST: Got context for toot https://mstdn.social/@feditips/110282867010964061
2023-04-30 14:45:36.115800 CEST: Got context for toot https://mstdn.social/@tanweerdar/110282833666725775
2023-04-30 14:45:36.423547 CEST: Got context for toot None
2023-04-30 14:45:36.623051 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110282720076702784
2023-04-30 14:45:36.861597 CEST: Got context for toot https://social.tchncs.de/@kuketzblog/110282653403169476
2023-04-30 14:45:37.146303 CEST: Got context for toot None
2023-04-30 14:45:37.146934 CEST: Error parsing toot URL https://digitalcourage.video/videos/watch/80110bb9-2048-44cc-8a77-0f0b42da900e
2023-04-30 14:45:37.358651 CEST: Got context for toot https://mastodon.social/@effinbirds/110282492031152082
2023-04-30 14:45:37.594456 CEST: Got context for toot None
2023-04-30 14:45:38.198146 CEST: Got context for toot None
2023-04-30 14:45:38.439423 CEST: Got context for toot https://fosstodon.org/@kernellogger/110282350145400526
2023-04-30 14:45:38.687111 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110282030390617805
2023-04-30 14:45:38.687472 CEST: Error parsing toot URL https://digitalcourage.video/videos/watch/d6a27ca1-9fde-4821-9b8a-ae25c8515609
2023-04-30 14:45:38.942176 CEST: Got context for toot https://ublog.tech/@IPngNetworks/110281893500422871
2023-04-30 14:45:39.118184 CEST: Got context for toot https://mastodon.derstandard.at/@web/110281536462189131
2023-04-30 14:45:39.553859 CEST: Got context for toot None
2023-04-30 14:45:39.768080 CEST: Got context for toot https://mastodon.social/@nicorola/110281446310399347
2023-04-30 14:45:39.988256 CEST: Got context for toot https://mas.to/@marcopogo/110281438099694447
2023-04-30 14:45:40.125731 CEST: Got context for toot https://aut.social/@bernhard/110281417844848973
2023-04-30 14:45:40.315102 CEST: Got context for toot https://die-partei.social/@rockpapershotgun/110281025295534594
2023-04-30 14:45:40.618086 CEST: Got context for toot None
2023-04-30 14:45:40.896360 CEST: Got context for toot None
2023-04-30 14:45:41.050583 CEST: Got context for toot https://chaos.social/@epicenter_works/110280985990730388
2023-04-30 14:45:41.231799 CEST: Got context for toot https://mastodon.social/@nicorola/110280812555397064
2023-04-30 14:45:41.440239 CEST: Got context for toot https://hachyderm.io/@molly0xfff/110280728782230553
2023-04-30 14:45:41.840453 CEST: Got context for toot None
2023-04-30 14:45:42.064359 CEST: Got context for toot https://social.opendesktop.org/@thisweekinkde/110280415916402820
2023-04-30 14:45:42.399262 CEST: Got context for toot https://hachyderm.io/@volkan/110280154384709385
2023-04-30 14:45:43.176930 CEST: Got context for toot None
2023-04-30 14:45:43.733392 CEST: Error getting context for toot None. Status code: 401
2023-04-30 14:45:44.690163 CEST: Got context for toot None
2023-04-30 14:45:45.724906 CEST: Got context for toot https://aus.social/@dgar/110279274329093158
2023-04-30 14:45:45.967176 CEST: Got context for toot https://mastodon.social/@effinbirds/110279189011064254
2023-04-30 14:45:46.356107 CEST: Got context for toot https://mastodon.social/@KydiaMusic/110279017066632838
2023-04-30 14:45:46.878857 CEST: Got context for toot None
2023-04-30 14:45:47.079018 CEST: Got context for toot https://mastodon.xyz/@xkcd/110278690633039540
2023-04-30 14:45:47.451374 CEST: Got context for toot None
2023-04-30 14:45:48.857660 CEST: Got context for toot None
2023-04-30 14:45:49.849319 CEST: Got context for toot None
2023-04-30 14:45:50.303190 CEST: Got context for toot None
2023-04-30 14:45:50.836012 CEST: Got context for toot None
2023-04-30 14:45:51.106391 CEST: Got context for toot None
2023-04-30 14:45:51.476231 CEST: Got context for toot None
2023-04-30 14:45:51.743183 CEST: Got context for toot None
2023-04-30 14:45:51.975188 CEST: Got context for toot None
2023-04-30 14:45:52.782476 CEST: Got context for toot None
2023-04-30 14:45:53.180889 CEST: Got context for toot None
2023-04-30 14:45:53.349333 CEST: Got context for toot None
2023-04-30 14:45:54.194180 CEST: Got context for toot https://social.tchncs.de/@kuketzblog/110278235794716210
2023-04-30 14:45:54.415474 CEST: Got context for toot https://mastodon.social/@effinbirds/110278111612141534
2023-04-30 14:45:55.374264 CEST: Got context for toot None
2023-04-30 14:45:55.609112 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277834076718028
2023-04-30 14:45:56.120584 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277817823007086
2023-04-30 14:45:56.338411 CEST: Got context for toot None
2023-04-30 14:45:56.657654 CEST: Got context for toot None
2023-04-30 14:45:57.184152 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277760031436221
2023-04-30 14:45:57.570756 CEST: Got context for toot None
2023-04-30 14:45:57.762183 CEST: Got context for toot https://mastodon.social/@sjvn/110277742968854135
2023-04-30 14:45:58.015609 CEST: Got context for toot https://aut.social/@bernhard/110277732860325420
2023-04-30 14:45:58.190438 CEST: Got context for toot None
2023-04-30 14:45:58.379660 CEST: Got context for toot https://social.tchncs.de/@random_musings/110277730784037217
2023-04-30 14:45:58.919294 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277728697557991
2023-04-30 14:45:59.424557 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277710154148747
2023-04-30 14:45:59.934852 CEST: Got context for toot None
2023-04-30 14:46:00.147151 CEST: Got context for toot None
2023-04-30 14:46:00.382848 CEST: Got context for toot None
2023-04-30 14:46:00.769411 CEST: Got context for toot https://ublog.tech/@IPngNetworks/110277679390282774
2023-04-30 14:46:01.155468 CEST: Got context for toot https://ublog.tech/@IPngNetworks/110277667478587903
2023-04-30 14:46:01.682433 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277660931499636
2023-04-30 14:46:01.868417 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277648006143960
2023-04-30 14:46:02.179747 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277642786072636
2023-04-30 14:46:02.460889 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277620605502755
2023-04-30 14:46:02.860692 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277618831175533
2023-04-30 14:46:03.236261 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277588792707438
2023-04-30 14:46:03.468280 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277577473685787
2023-04-30 14:46:03.818019 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277548810949775
2023-04-30 14:46:04.164807 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277520609798956
2023-04-30 14:46:04.938268 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277500588388951
2023-04-30 14:46:05.190166 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277492677368574
2023-04-30 14:46:05.381009 CEST: Got context for toot None
2023-04-30 14:46:05.569261 CEST: Got context for toot None
2023-04-30 14:46:05.806285 CEST: Got context for toot None
2023-04-30 14:46:06.071147 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277427017479396
2023-04-30 14:46:06.385175 CEST: Got context for toot https://hachyderm.io/@molly0xfff/110277406212701717
2023-04-30 14:46:06.618355 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277403042140890
2023-04-30 14:46:07.087143 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277372536710781
2023-04-30 14:46:07.280647 CEST: Got context for toot https://infosec.exchange/@letsencrypt/110277330853487925
2023-04-30 14:46:07.551330 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277313794402700
2023-04-30 14:46:07.858825 CEST: Got context for toot None
2023-04-30 14:46:08.042705 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277288237647630
2023-04-30 14:46:08.291210 CEST: Got context for toot None
2023-04-30 14:46:08.576282 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277236386461189
2023-04-30 14:46:08.844138 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277233622740975
2023-04-30 14:46:09.101976 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277212021469321
2023-04-30 14:46:09.344972 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277208013230715
2023-04-30 14:46:09.511098 CEST: Got context for toot None
2023-04-30 14:46:09.852875 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277162146720685
2023-04-30 14:46:10.146409 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110277127631853800
2023-04-30 14:46:10.738453 CEST: Got context for toot https://me.dm/@medium/110277098102140256
2023-04-30 14:46:10.996705 CEST: Got context for toot https://miau.le-chat-a-velo.at/@gudroot/110277091978958597
2023-04-30 14:46:11.296933 CEST: Got context for toot https://me.dm/@medium/110277082026859388
2023-04-30 14:46:11.575920 CEST: Got context for toot https://mstdn.social/@feditips/110277066748436943
2023-04-30 14:46:11.783407 CEST: Got context for toot https://die-partei.social/@rockpapershotgun/110277053835706776
2023-04-30 14:46:11.937552 CEST: Got context for toot https://die-partei.social/@rockpapershotgun/110277053818751147
2023-04-30 14:46:12.103205 CEST: Got context for toot https://social.anoxinon.de/@gnulinux/110277041942574433
2023-04-30 14:46:12.525619 CEST: Got context for toot None
2023-04-30 14:46:12.690062 CEST: Got context for toot https://social.anoxinon.de/@gnulinux/110276920047209560
2023-04-30 14:46:13.498565 CEST: Got context for toot None
2023-04-30 14:46:13.719726 CEST: Got context for toot https://mastodon.social/@TheMetalDog/110276879612552316
2023-04-30 14:46:13.872208 CEST: Got context for toot https://social.anoxinon.de/@gnulinux/110276857198094551
2023-04-30 14:46:14.092142 CEST: Got context for toot https://fnordon.de/@dentaku/110276849255710407
2023-04-30 14:46:14.363364 CEST: Got context for toot None
2023-04-30 14:46:14.586919 CEST: Got context for toot https://mastodon.social/@effinbirds/110276829769490527
2023-04-30 14:46:14.746832 CEST: Got context for toot https://social.anoxinon.de/@gnulinux/110276821774749914
2023-04-30 14:46:14.925921 CEST: Got context for toot None
2023-04-30 14:46:15.106759 CEST: Got context for toot https://mstdn.social/@anne_reko/110276664129404024
2023-04-30 14:46:16.098857 CEST: Got context for toot None
2023-04-30 14:46:16.329386 CEST: Got context for toot https://wien.rocks/@b2c/110276249144949514
2023-04-30 14:46:16.463010 CEST: Error getting context for toot https://wien.rocks/@b2c/110276227552133452. Status code: 404
2023-04-30 14:46:16.743699 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110276077745346285
2023-04-30 14:46:16.860116 CEST: Got context for toot https://aut.social/@bernhard/110276052889522079
2023-04-30 14:46:17.104108 CEST: Got context for toot None
2023-04-30 14:46:17.314641 CEST: Got context for toot https://aut.social/@bernhard/110275933579020836
2023-04-30 14:46:17.427206 CEST: Got context for toot https://aut.social/@bernhard/110275736908276618
2023-04-30 14:46:17.669279 CEST: Got context for toot None
2023-04-30 14:46:17.927980 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110275628684933945
2023-04-30 14:46:18.158319 CEST: Got context for toot None
2023-04-30 14:46:18.357435 CEST: Got context for toot https://mastodon.social/@Tzwehn/110275511558546966
2023-04-30 14:46:18.494321 CEST: Got context for toot https://social.anoxinon.de/@gnulinux/110275480880310117
2023-04-30 14:46:18.750618 CEST: Got context for toot https://mastodon.social/@Gargron/110275477450180746
2023-04-30 14:46:18.973046 CEST: Got context for toot https://aut.social/@bernhard/110275469960035857
2023-04-30 14:46:19.642861 CEST: Got context for toot https://social.tchncs.de/@kuketzblog/110275395111987768
2023-04-30 14:46:19.778521 CEST: Got context for toot https://aut.social/@bernhard/110275259603357479
2023-04-30 14:46:19.992562 CEST: Got context for toot None
2023-04-30 14:46:20.171217 CEST: Got context for toot https://social.tchncs.de/@random_musings/110275116769177577
2023-04-30 14:46:20.334593 CEST: Got context for toot https://social.tchncs.de/@random_musings/110274990756972594
2023-04-30 14:46:21.169074 CEST: Discovered redirect for URL https://social.kernel.org/objects/243003ef-27ba-437f-ac03-817ce5ce5d44
2023-04-30 14:46:21.948131 CEST: Got context for toot None
2023-04-30 14:46:22.730388 CEST: Got context for toot None
2023-04-30 14:46:24.509577 CEST: Got context for toot None
2023-04-30 14:46:24.731111 CEST: Got context for toot https://mastodon.social/@effinbirds/110273526704764993
2023-04-30 14:46:25.778606 CEST: Got context for toot https://mastodon.social/@Gargron/110273171231251728
2023-04-30 14:46:26.233568 CEST: Got context for toot None
2023-04-30 14:46:26.570518 CEST: Got context for toot https://mstdn.social/@feditips/110272968407733255
2023-04-30 14:46:26.977867 CEST: Got context for toot https://vis.social/@kristinHenry/110272751456172309
2023-04-30 14:46:27.175762 CEST: Got context for toot None
2023-04-30 14:46:27.376648 CEST: Got context for toot None
2023-04-30 14:46:29.218148 CEST: Got context for toot None
2023-04-30 14:46:29.913603 CEST: Got context for toot https://mastodon.social/@Gargron/110272552093771675
2023-04-30 14:46:30.211411 CEST: Got context for toot https://mastodon.social/@effinbirds/110272437511297642
2023-04-30 14:46:31.960662 CEST: Got context for toot None
2023-04-30 14:46:32.410376 CEST: Got context for toot https://hachyderm.io/@molly0xfff/110272369668582032
2023-04-30 14:46:32.608969 CEST: Got context for toot None
2023-04-30 14:46:32.941548 CEST: Got context for toot https://ublog.tech/@IPngNetworks/110272337565726554
2023-04-30 14:46:33.115736 CEST: Got context for toot https://social.tchncs.de/@random_musings/110272324702447476
2023-04-30 14:46:33.307300 CEST: Got context for toot https://troet.cafe/@MatsurikaGaming/110272237499466919
2023-04-30 14:46:33.575575 CEST: Got context for toot https://digitalcourage.social/@digitalcourage/110272104795912732
2023-04-30 14:46:33.944883 CEST: Got context for toot https://ublog.tech/@IPngNetworks/110272087920764263
2023-04-30 14:46:34.234323 CEST: Got context for toot https://ublog.tech/@IPngNetworks/110272080576874786
2023-04-30 14:46:34.819889 CEST: Got context for toot None
2023-04-30 14:46:34.983597 CEST: Got context for toot https://die-partei.social/@rockpapershotgun/110271922383169617
2023-04-30 14:46:35.201372 CEST: Got context for toot https://fosstodon.org/@kernellogger/110271840371490826
2023-04-30 14:46:35.504862 CEST: Got context for toot None
2023-04-30 14:46:35.663660 CEST: Got context for toot https://die-partei.social/@rockpapershotgun/110271568500694100
2023-04-30 14:46:35.820605 CEST: Got context for toot https://die-partei.social/@rockpapershotgun/110271568474404934
2023-04-30 14:46:35.990844 CEST: Got context for toot https://chaos.social/@epicenter_works/110271503091349499
2023-04-30 14:46:36.225003 CEST: Got context for toot None
2023-04-30 14:46:36.827751 CEST: Got context for toot None
2023-04-30 14:46:37.160753 CEST: Got context for toot None
2023-04-30 14:46:37.710992 CEST: Got context for toot None
2023-04-30 14:46:38.168658 CEST: Got context for toot None
2023-04-30 14:46:39.177043 CEST: Got context for toot None
2023-04-30 14:46:39.370396 CEST: Got context for toot https://mastodon.social/@effinbirds/110271167468759563
2023-04-30 14:46:39.567885 CEST: Got context for toot https://infosec.exchange/@letsencrypt/110270999391462646
2023-04-30 14:46:40.125687 CEST: Got context for toot None
2023-04-30 14:46:40.344928 CEST: Got context for toot https://wien.rocks/users/JohannesWidi/statuses/110270944343541422/activity
2023-04-30 14:46:40.502230 CEST: Got context for toot None
2023-04-30 14:46:40.658200 CEST: Got context for toot https://die-partei.social/@rockpapershotgun/110270605071395699
2023-04-30 14:46:40.838678 CEST: Got context for toot None
2023-04-30 14:46:41.138410 CEST: Got context for toot https://ublog.tech/@IPngNetworks/110270337707336834
2023-04-30 14:46:41.425272 CEST: Got context for toot https://cloud-native.social/@saiyam/110270279065933045
2023-04-30 14:46:41.590241 CEST: Got context for toot https://social.anoxinon.de/@gnulinux/110270219670504945
2023-04-30 14:46:41.828350 CEST: Got context for toot None
2023-04-30 14:46:42.131893 CEST: Got context for toot None
2023-04-30 14:46:43.182835 CEST: Got context for toot https://mastodon.sdf.org/@stokesauce/110269800784497124
2023-04-30 14:46:43.516417 CEST: Got context for toot None
2023-04-30 14:46:43.779105 CEST: Got context for toot https://fosstodon.org/@PCzanik/110269557855094446
2023-04-30 14:46:43.932177 CEST: Got context for toot https://chaos.social/@epicenter_works/110269527900074364
2023-04-30 14:46:44.363302 CEST: Got context for toot https://toot.community/@chewyadupp/110269515436485879
2023-04-30 14:46:44.512277 CEST: Got context for toot https://social.anoxinon.de/@gnulinux/110269488322074211
2023-04-30 14:46:44.690105 CEST: Got context for toot None
2023-04-30 14:46:45.371004 CEST: Got context for toot https://hachyderm.io/@nova/110269367617217766
2023-04-30 14:46:45.566677 CEST: Got context for toot https://mastodon.social/@Gargron/110269333177557613
2023-04-30 14:46:46.183055 CEST: Got context for toot https://hachyderm.io/@nova/110269313981320765
2023-04-30 14:46:46.689686 CEST: Got context for toot https://hachyderm.io/@nova/110269297502294936
2023-04-30 14:46:47.175635 CEST: Got context for toot https://hachyderm.io/@nova/110269282105849043
2023-04-30 14:46:47.513118 CEST: Got context for toot https://inkdrop.space/@erosdiscordia/110269280229568362
2023-04-30 14:46:47.824437 CEST: Got context for toot https://inkdrop.space/@erosdiscordia/110269243245040071
2023-04-30 14:46:48.378198 CEST: Got context for toot https://hachyderm.io/@nova/110269220095556570
2023-04-30 14:46:48.606689 CEST: Got context for toot https://hachyderm.io/@molly0xfff/110268968386952160
2023-04-30 14:46:48.786237 CEST: Got context for toot https://fosstodon.org/@kernellogger/110268967819554020
2023-04-30 14:46:49.318070 CEST: Got context for toot https://hachyderm.io/@nova/110268918980340570
2023-04-30 14:46:50.440705 CEST: Got context for toot https://hachyderm.io/@nova/110268918029141405
2023-04-30 14:46:51.465957 CEST: Got context for toot https://aus.social/@dgar/110268863255638200
2023-04-30 14:46:52.182279 CEST: Got context for toot None
2023-04-30 14:46:52.763132 CEST: Discovered redirect for URL https://mu.zaitcev.nu/objects/88c7f0cf-8490-42f6-98cb-357ccaf3cb99
2023-04-30 14:46:53.189954 CEST: Got context for toot None
2023-04-30 14:46:53.402683 CEST: Got context for toot https://mastodon.xyz/@xkcd/110268722552200407
2023-04-30 14:46:53.693705 CEST: Got context for toot None
2023-04-30 14:46:54.041336 CEST: Got context for toot None
2023-04-30 14:46:54.634407 CEST: Got context for toot None
2023-04-30 14:46:55.115542 CEST: Got context for toot https://mstdn.social/@feditips/110268251644500447
2023-04-30 14:46:55.447154 CEST: Got context for toot https://hachyderm.io/@molly0xfff/110268095363254922
2023-04-30 14:46:55.841017 CEST: Got context for toot https://hachyderm.io/@molly0xfff/110267962281429565
2023-04-30 14:46:56.283984 CEST: Got context for toot https://hachyderm.io/@molly0xfff/110267960470755209
2023-04-30 14:46:57.152408 CEST: Got context for toot https://hachyderm.io/@molly0xfff/110267958305946394
2023-04-30 14:46:57.755944 CEST: Got context for toot https://hachyderm.io/@molly0xfff/110267954441141833
2023-04-30 14:46:57.974467 CEST: Got context for toot https://mastodon.social/@effinbirds/110267864444585741
2023-04-30 14:46:59.122910 CEST: Got context for toot None
2023-04-30 14:47:00.339611 CEST: Got context for toot None
2023-04-30 14:47:00.751785 CEST: Got context for toot https://vis.social/@kristinHenry/110267385626309088
2023-04-30 14:47:01.039472 CEST: Got context for toot None
2023-04-30 14:47:01.256044 CEST: Got context for toot None
2023-04-30 14:47:02.127871 CEST: Got context for toot None
2023-04-30 14:47:03.023798 CEST: Got context for toot None
2023-04-30 14:47:03.237688 CEST: Got context for toot None
2023-04-30 14:47:03.237963 CEST: Error parsing toot URL https://anonsys.net/display/bf69967c-6664-495d-6e3a-a16077737418
2023-04-30 14:47:03.464355 CEST: Got context for toot None
2023-04-30 14:47:03.759425 CEST: Got context for toot None
2023-04-30 14:47:04.571270 CEST: Got context for toot None
2023-04-30 14:47:04.768953 CEST: Got context for toot None
2023-04-30 14:47:04.769156 CEST: Found 1859 known context toots
2023-04-30 14:47:05.253623 CEST: Added context url https://hachyderm.io/@stonebear/110269174273720816
2023-04-30 14:47:06.171568 CEST: Added context url https://mas.to/@franktaber/110268962862261293
2023-04-30 14:47:06.565568 CEST: Added context url https://mastodon.social/@pyperkub/110269038468994433
2023-04-30 14:47:09.080065 CEST: Added context url https://social.lol/@phils/110269234773434347
2023-04-30 14:47:09.548012 CEST: Added context url https://mastodon.art/@uddelhexe_/110269033565678512
2023-04-30 14:47:10.527234 CEST: Added context url https://hachyderm.io/@draNgNon/110269301944296433
2023-04-30 14:47:11.499143 CEST: Added context url https://infosec.exchange/@pauliehedron/110269181824790212
2023-04-30 14:47:11.911545 CEST: Added context url https://hachyderm.io/@nova/110269220095556570
2023-04-30 14:47:12.354845 CEST: Added context url https://hachyderm.io/@NireBryce/110269005285559581
2023-04-30 14:47:12.714733 CEST: Added context url https://chaos.social/@D_70WN/110269184839438600
2023-04-30 14:47:13.587585 CEST: Added context url https://social.anoxinon.de/@hias1234/110269099149667792
2023-04-30 14:47:14.650278 CEST: Added context url https://hachyderm.io/@pierrenick/110269269275955676
2023-04-30 14:47:15.024992 CEST: Added context url https://hachyderm.io/@nova/110269282105849043
2023-04-30 14:47:15.421080 CEST: Added context url https://hachyderm.io/@nova/110269297502294936
2023-04-30 14:47:15.957525 CEST: Added context url https://layer8.space/@tunda/110287819018434926
2023-04-30 14:47:17.046601 CEST: Added context url https://scicomm.xyz/@unchartedworlds/110269264471517167
2023-04-30 14:47:17.378024 CEST: Added context url https://chaos.social/@uschebit/110287732721818262
2023-04-30 14:47:17.934715 CEST: Added context url https://hachyderm.io/@thisismissem/110268958766590663
2023-04-30 14:47:18.890912 CEST: Added context url https://hachyderm.io/@log1kal/110269080698986186
2023-04-30 14:47:18.891350 CEST: Added 19 new context toots (with 0 failures)
2023-04-30 14:47:19.141591 CEST: Error getting user ID for user bba@digitalcourage.video: User bba was not found on server digitalcourage.video/video-channels.
2023-04-30 14:47:19.301204 CEST: Error getting user ID for user digitalcourage.de@digitalcourage.video: User digitalcourage.de was not found on server digitalcourage.video/accounts.
2023-04-30 14:47:19.436111 CEST: Error getting user ID for user digitalcourage.de@digitalcourage.video: User digitalcourage.de was not found on server digitalcourage.video/accounts.
2023-04-30 14:47:19.989980 CEST: Error getting user ID for user amy@types.pl: Error getting URL https://types.pl/api/v1/accounts/lookup?acct=amy. Status code: 401
2023-04-30 14:47:20.512800 CEST: Error getting user ID for user hypolite@friendica.mrpetovan.com: Expecting value: line 1 column 1 (char 0)
2023-04-30 14:47:20.864293 CEST: Error getting user ID for user transitionhausbt@venera.social: Expecting value: line 1 column 1 (char 0)
2023-04-30 14:47:21.850797 CEST: Error getting user ID for user smj@toobnix.org: User smj was not found on server toobnix.org/accounts.
2023-04-30 14:47:22.451513 CEST: Job failed after 0:02:12.227224.
Traceback (most recent call last):
  File "/app/find_posts.py", line 878, in <module>
    add_user_posts(arguments.server, token, filter_known_users(mentioned_users, all_known_users), recently_checked_users, all_known_users, seen_urls)
  File "/app/find_posts.py", line 75, in add_user_posts
    if post['reblog'] == None and post['url'] != None and post['url'] not in seen_urls:
                                  ~~~~^^^^^^^
KeyError: 'url'

nanos commented 1 year ago

Thanks @b2cc

Can I ask what software you are running on your server? Is yours a Mastodon server, or another Fediverse software?

b2cc commented 1 year ago

@nanos : That was on Mastodon 4.1.2. Sorry, should have mentioned that in the beginning.

nanos commented 1 year ago

Thanks @b2cc

Can you check out the logging branch and re-run this, please? Then please provide another log so I can take a look.

nanos commented 1 year ago

@b2cc Have you had any luck?

b2cc commented 1 year ago

@nanos : Sorry, I'm currently traveling abroad and haven't had much time, and the error is no longer occurring. Also, I'm running fedifetcher from a k8s cronjob, so I first have to create a local setup with the logging branch to get meaningful logs. I'll keep you posted as soon as I have something.

mattlehrer commented 1 year ago

I'm having the same issue. I don't know much Python. Is it possible to put that post['url'] line in a try/except so that the script can continue on error?

I am running Hometown 1.1.1/Mastodon 4.0.4

Edit: I see that's exactly what you have on the logging branch. I will try to use that.

...
2023-06-01 12:14:08.208949 UTC: Added 5 new context toots (with 0 failures)
2023-06-01 12:14:08.716490 UTC: Added context url https://mastodon.ie/@oisin/110435054654347819
2023-06-01 12:14:09.328837 UTC: Got context for toot https://mastodon.ie/@oisin/110435054654347819
2023-06-01 12:14:09.328906 UTC: Found 1 known context toots
2023-06-01 12:14:09.838129 UTC: Added context url https://mastodon.ie/@davey_cakes/110434478837185550
2023-06-01 12:14:09.838434 UTC: Added 1 new context toots (with 0 failures)
2023-06-01 12:14:10.348901 UTC: Added context url https://mastodon.ie/@oisin/110434164792279434
2023-06-01 12:14:10.787037 UTC: Got context for toot https://mastodon.ie/@oisin/110434164792279434
2023-06-01 12:14:10.787106 UTC: Found 1 known context toots
2023-06-01 12:14:11.251085 UTC: Added context url https://wandering.shop/@cstross/110430471006309334
Traceback (most recent call last):
2023-06-01 12:14:11.251149 UTC: Added 1 new context toots (with 0 failures)
  File "/home/runner/work/mastodon_get_replies/mastodon_get_replies/find_posts.py", line 878, in <module>
2023-06-01 12:14:11.762091 UTC: Added context url https://mastodon.ie/@oisin/110434135460840287
    add_user_posts(arguments.server, token, filter_known_users(mentioned_users, all_known_users), recently_checked_users, all_known_users, seen_urls)
2023-06-01 12:14:12.309015 UTC: Got context for toot https://mastodon.ie/@oisin/110434135460840287
2023-06-01 12:14:12.309086 UTC: Found 2 known context toots
  File "/home/runner/work/mastodon_get_replies/mastodon_get_replies/find_posts.py", line 75, in add_user_posts
2023-06-01 12:14:12.603916 UTC: Added context url https://mastodon.social/@macpsych/110434505925631705
    if post['reblog'] == None and post['url'] != None and post['url'] not in seen_urls:
KeyError: 'url'
2023-06-01 12:14:12.964891 UTC: Added context url https://mastodon.social/@macpsych/110430378845538628
2023-06-01 12:14:12.964957 UTC: Added 2 new context toots (with 0 failures)
2023-06-01 12:14:12.964972 UTC: Added 21 posts for user oisin@mastodon.ie with 0 errors
2023-06-01 12:14:17.831918 UTC: Job failed after 0:04:30.976265.
Error: Process completed with exit code 1.
nanos commented 1 year ago

I'm kinda glad you have been able to reproduce this.

I'd be grateful if you could send in the log from the logging branch. Hopefully that'll tell me more.

mattlehrer commented 1 year ago

Well, I don't have the best news, in some ways. I upgraded to the latest Hometown version right before this started failing, so there's probably some change from 4.0.2 to 4.0.4 that causes this. Every run on GitHub Actions failed after that.

I cloned the repo to my server and ran the script there. Unfortunately, it worked perfectly while I had the output going to stdout. It had a lot to catch up on and ran for a long time: the GH Actions jobs were all failing after around 9 to 11 minutes, whereas this run went on for probably 35 minutes before I killed it, realizing I should be saving the logs.

I then ran it again with the output piped to a file. The line "Error adding post" does not appear in those logs, so I guess I missed that exception. And now the GH Actions runs are working again, so it must have been a rare toot circumstance that caused it.

I do think the try/except block is more likely the thing that fixed the problem vs the problem toot being far enough back in the timeline that it didn't get the FediFetcher treatment. If the GH Actions runs start failing again, I will make sure to catch the logs.
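For readers following along, here is a minimal sketch of the try/except idea discussed above. It is not the code from the logging branch: the function name `collect_new_urls`, its return shape, and the exact log wording are assumptions made for illustration. Only the condition on `post['reblog']` and `post['url']` is taken from the traceback.

```python
def collect_new_urls(posts, seen_urls):
    """Return (new_urls, failed) for a list of post dicts.

    Hypothetical helper: a post that is missing an expected key is
    logged and skipped instead of aborting the whole run.
    """
    new_urls = []
    failed = 0
    for post in posts:
        try:
            # Same condition as line 75 in the traceback, written with `is`.
            if (post['reblog'] is None
                    and post['url'] is not None
                    and post['url'] not in seen_urls):
                new_urls.append(post['url'])
                seen_urls.add(post['url'])
        except KeyError as ex:
            # Assumed log wording; the real branch may print something else.
            failed += 1
            print(f"Error adding post: missing key {ex}")
    return new_urls, failed


posts = [
    {'reblog': None, 'url': 'https://example.social/@a/1'},
    {'reblog': None},  # malformed: no 'url' key
    {'reblog': {'id': '2'}, 'url': 'https://example.social/@a/2'},
]
new_urls, failed = collect_new_urls(posts, set())
```

With this shape, one malformed post costs a single log line and a failure count rather than a failed job.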

nanos commented 1 year ago

Hm. Very strange. It does appear to only happen in a very rare edge case, based on the fact that there have only been very few reports of this so far.

I'd love to track this down!

Teqed commented 1 year ago
    if post['reblog'] == None and post['url'] != None and post['url'] not in seen_urls:
KeyError: 'url'

This particular exception will be solved by e290f2c (#56), which replaces post['url'] with post.get('url'). The latter returns None if the key does not exist in the dictionary, instead of raising a KeyError.
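As a small illustration of the difference (a sketch only, not the code from that commit; the helper name `is_new_original_post` and the sample data are made up for this example):

```python
def is_new_original_post(post, seen_urls):
    """Hypothetical helper: True for a non-reblog post with an unseen URL.

    dict.get returns None for a missing key, so a post without 'url'
    is simply treated as not addable rather than raising KeyError.
    """
    url = post.get('url')
    return post.get('reblog') is None and url is not None and url not in seen_urls


seen_urls = {'https://example.social/@a/1'}
malformed = {'reblog': None}  # no 'url' key at all

# Subscript access raises on the missing key...
try:
    malformed['url']
    raised = False
except KeyError:
    raised = True

# ...while the .get-based check quietly returns False.
skipped = is_new_original_post(malformed, seen_urls)
```

One trade-off of this approach is that posts lacking a `url` are skipped silently, so logging them separately may still be useful when hunting for the underlying cause.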

nanos commented 1 year ago

Thanks so much @Teqed! I'm closing this as fixed with the release of v6.0.0