Closed nemobis closed 2 years ago
I think it has something to do with the threading library. It's waiting for something that never comes.
Processing tweets... |^CTraceback (most recent call last):
File "/home/7/federico/mastodon/lib/python3.7/site-packages/pleroma_bot/cli.py", line 515, in main
tweets_to_post = process_parallel(tweets, user, threads)
File "/home/7/federico/mastodon/lib/python3.7/site-packages/pleroma_bot/_utils.py", line 111, in process_parallel
ret = mp.wait() # get all results
File "/home/7/federico/mastodon/lib/python3.7/site-packages/pleroma_bot/_utils.py", line 83, in wrapper
spinner_thread.join()
File "/usr/lib/python3.7/threading.py", line 1032, in join
self._wait_for_tstate_lock()
File "/usr/lib/python3.7/threading.py", line 1048, in _wait_for_tstate_lock
elif lock.acquire(block, timeout):
KeyboardInterrupt
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/7/federico/mastodon/bin/pleroma-bot", line 10, in <module>
sys.exit(init())
File "/home/7/federico/mastodon/lib/python3.7/site-packages/pleroma_bot/cli.py", line 594, in init
sys.exit(main())
File "/home/7/federico/mastodon/lib/python3.7/site-packages/pleroma_bot/cli.py", line 515, in main
tweets_to_post = process_parallel(tweets, user, threads)
KeyboardInterrupt
Now run in verbose mode... wth?
DEBUG:urllib3.connectionpool:https://twitter.com:443 "HEAD /Mariia_Zolkina/status/1510323171815264258/photo/1 HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:urllib3.connectionpool:https://twitter.com:443 "HEAD /DefenceU/status/1510372750665433089 HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:https://twitter.com:443 "HEAD /IAPonomarenko/status/1510379221679542273/photo/1 HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:urllib3.connectionpool:https://twitter.com:443 "HEAD /JulianRoepcke/status/1510345706250416136/photo/1 HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:urllib3.connectionpool:https://twitter.com:443 "HEAD /GissiSim/status/1510381911302098944/video/1 HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:urllib3.connectionpool:https://twitter.com:443 "HEAD /visegrad24/status/1510621810936721415 HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1510344076838715399?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1510339436617342984?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1510370057855156229?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1510372750665433089?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1510323171815264258?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1510319120935440390?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1510659351261892614?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1510379221679542273?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1510345706250416136?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1510373338778066946?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1510651739942363138?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 429 94
I guess it doesn't help that my machine has 48 cores, given:
cli.py
514: threads = round(cores / 2 if cores > 4 else 4)
515: tweets_to_post = process_parallel(tweets, user, threads)
Perhaps some limit here would be in order. Or parallelism could be reduced when hitting HTTP 429?
Yeah, I've only tested the bot on more modest machines so far 😅 Should we close this one as a duplicate of #72 ? The root cause of this basically is the unhandled HTTP 429 responses, if those are handled, the higher number of cores or additional parallel requests should not be an issue.
Il 05/04/22 19:54, robertoszek ha scritto:
Should we close this one as a duplicate of #72 ?
Yes, I think that's reasonable. I managed to work around this by tweaking other configs to reduce the number of posts.
Sometimes it seems to get stuck on SSL handshake or something, but I'm not sure:
INFO:pleroma_bot:tweets gathered: 6 DEBUG:pleroma_bot:1510541472306712579 Processing tweets... |DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443 DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /1.1/statuses/show.json?id=1510541472306712579 HTTP/1.1" 200 1117 DEBUG:pleroma_bot:1510603710950522885 DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443 DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1510518074319024129?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 200 1300 DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): t.co:443
Ok, closing for now as a duplicate of #72 If after handling 429's still gets stuck I'll investigate further and reopen this issue. Thanks!
Yes. If this still happens I think an easy first step is to change this line https://github.com/robertoszek/pleroma-bot/blob/a26b5cfa567f92bab1aa0f63cb985c63f0dfe72d/pleroma_bot/_processing.py#L434 from session.head(group, allow_redirects=True)
to something like session.head(group, allow_redirects=True, timeout=(7, 10))
to see if a stricter timeout helps with cases like this:
ℹ 2022-04-05 21:15:54,606 - pleroma_bot - INFO - tweets gathered: 2
INFO:pleroma_bot:tweets gathered: 2
DEBUG:pleroma_bot:1511251315871166468
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.twitter.com:443
Processing tweets... |DEBUG:urllib3.connectionpool:https://api.twitter.com:443 "GET /2/tweets/1511246333511938048?poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&media.fields=duration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&expansions=attachments.poll_ids%2Cattachments.media_keys%2Cauthor_id%2Centities.mentions.username%2Cgeo.place_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld HTTP/1.1" 200 1125
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): t.co:443
DEBUG:urllib3.connectionpool:https://t.co:443 "HEAD /GtotlL2HJX HTTP/1.1" 301 0
For mysterious reasons the tweet https://nitter.eu/WorkersEESC/status/1511246333511938048#m failed multiple times, seemingly on the expansion of https://t.co/GtotlL2HJX to http://eutoday.net/news/security-defence/2022/extreme-political-groups-endangering-the-health-of-our-democracy-says-jos%C3%A9-antonio-moreno-d%C3%ADaz
For mysterious reasons the tweet https://nitter.eu/WorkersEESC/status/1511246333511938048#m failed multiple times, seemingly on the expansion of https://t.co/GtotlL2HJX to http://eutoday.net/news/security-defence/2022/extreme-political-groups-endangering-the-health-of-our-democracy-says-jos%C3%A9-antonio-moreno-d%C3%ADaz
@nemobis Looks like it gets a bit confused about the encoding: https://github.com/psf/requests/blob/2d5517682b3b38547634d153cea43d48fbc8cdb5/requests/sessions.py#L95
I implemented a best effort exception handler on 1.0.3rc10
in case it runs into latin1-encoded redirect headers with non-ASCII characters :
00e75d4b124aca686fed51b5e362b176d6477e90
pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pleroma-bot==1.0.3rc10
At least it seems to work with the example you provided, let me know if you run into any others.
I'm playing with this and on 1.1.0 this still seems to be an issue :( This is my first attempt at using pleroma-bot, and after getting all the keys etc. installed, it gets to "processing tweets" and then just stops. In verbose mode, I see the same 429 response from Twitter as others in this thread. By the time it stops it has downloaded about 140 or so media files. I left it overnight...
Then I saw the comments about thread count, and manually set the threads to 4 in _processing.py. After having done this, everything worked fine and all is hunky-dory.
Maybe set the threads thing to a low number by default? At least easier than trying to debug on a 64-way machine like the one I'm running this on, I presume.. :)
Il 05/11/22 01:34, Eirik Øverby ha scritto:
Maybe set the threads thing to a low number by default?
Yes, I think this is a reasonable suggestion. I also still get stuck all
the time, my workaround is to use timeout
to kill the process and
restart it.
Sometimes this happens:
This was stuck for 45 min there. Maybe there's a need of a hard timeout somewhere.
The error seems to be somehow user-dependent, in that I've seen it more than once for some users.