robertoszek / pleroma-bot

Bot for mirroring one or multiple Twitter accounts in Pleroma/Mastodon/Misskey.
https://robertoszek.github.io/pleroma-bot
MIT License
104 stars 18 forks

RSS import: Nitter shows links instead of mentions/hashtags and HTML tags #122

Open edel79 opened 1 year ago

edel79 commented 1 year ago

Hi, I have a problem using pleroma-bot with an RSS feed. Here is the error I encountered:

ERROR:pleroma_bot:Exception occurred for user, skipping...
Traceback (most recent call last):
  File "/home/edelzone/virtualenv/python/pleroma/3.10/lib/python3.10/site-packages/pleroma_bot/cli.py", line 556, in main
    user = User(user_item, config, base_path, posts_ids)
  File "/home/edelzone/virtualenv/python/pleroma/3.10/lib/python3.10/site-packages/pleroma_bot/cli.py", line 289, in __init__
    self.mastodon_enforce_limits()
  File "/home/edelzone/virtualenv/python/pleroma/3.10/lib/python3.10/site-packages/pleroma_bot/_utils.py", line 547, in mastodon_enforce_limits
    if len(self.display_name[t_user]) > 30:
KeyError: 'transportsidf'

With the same user, a different import method works. The problem occurs with at least one other user. I tested RSS feeds from both Nitter and RSS-Bridge, with the same result. Any idea what is wrong? "transportsidf" is the account name of the Twitter user being processed.

edel79 commented 1 year ago

Well, this is fixed thanks to this issue: https://github.com/robertoszek/pleroma-bot/issues/119 Installing the latest version solved the problem. Sorry for the duplicate post.

edel79 commented 1 year ago

Actually, I do still have a problem. I'm using Nitter as the RSS generator. Here is the log:

ℹ 2023-02-03 21:02:01,869 - pleroma_bot - INFO - Processing user: 109801607888254816
ℹ 2023-02-03 21:02:03,941 - pleroma_bot - INFO - Gathering tweets...20
ℹ 2023-02-03 21:02:04,288 - pleroma_bot - INFO - tweets gathered: 1
ℹ 2023-02-03 21:02:04,288 - pleroma_bot - INFO - tweets to post: 1
✖ 2023-02-03 21:02:04,288 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:721)
Traceback (most recent call last):
  File "/home/edelzone/virtualenv/python/pleroma/3.10/lib/python3.10/site-packages/pleroma_bot/cli.py", line 699, in main
    cw=tweet["cw"]
KeyError: 'cw'

I have 3 accounts, and all 3 have this problem. If I switch to API import, it works fine. Do you have any idea what is causing this issue?
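For readers following along: the `KeyError: 'cw'` above means the tweet dictionary built from the RSS item has no `cw` key at the point where `tweet["cw"]` is read. Below is a minimal sketch of how an optional key could be given a default before use. It assumes tweets are plain dicts; the helper name `with_defaults` and the default value `None` are hypothetical and are not taken from pleroma-bot's code or from the actual fix.

```python
def with_defaults(tweet: dict) -> dict:
    """Return a copy of the tweet dict with optional keys filled in."""
    # Hypothetical default: None stands for "no content warning".
    defaults = {"cw": None}
    # Keys already present in the tweet take precedence over the defaults.
    return {**defaults, **tweet}


rss_tweet = {"text": "hello"}      # an item without a "cw" key
safe_tweet = with_defaults(rss_tweet)
print(safe_tweet["cw"])            # no KeyError is raised here
```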

robertoszek commented 1 year ago

Hi! what version are you running now? You can double-check by running:

pleroma-bot --version

edel79 commented 1 year ago

I am using version 1.2.1rc18.

robertoszek commented 1 year ago

> I am using version 1.2.1rc18.

Can you try running 1.2.1rc19 (as in the issue you mentioned, #119) and see if that solves the problem?

pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pleroma-bot==1.2.1rc19

edel79 commented 1 year ago

Update done. I now have to wait for the next update attempt to see if it solves the problem. I'll keep you informed.

edel79 commented 1 year ago

So, it is now working, but not as well as I would like. Compare this toot: https://piaille.fr/@Francois_Jobard@techhub.social/109805144529192489 with the original tweet on Twitter: https://twitter.com/Francois_Jobard/status/1621762492614443008

First, when a Twitter account is mentioned, the mention is replaced with the full URL of that account on Nitter. It would be better to keep the full URL as the link target but display it in Twitter format: @account. The same goes for hashtags: display only #hashtag instead of the full link. Also, in both cases, these links are currently displayed alone on a single line, though they should not be (see the tweet to compare).

Second, it would be great to offer a preview when someone else's status is mentioned. If that is not possible, at least hide the <p> and </p> tags, and in that case display the link to the mentioned status on its own line.

Lastly, I still have the time offset problem I have when using the API, mentioned here: https://github.com/robertoszek/pleroma-bot/issues/121

robertoszek commented 1 year ago

What value are you using in your config as the rss mapping for that user?

I'd like to compare what the RSS feed contents are for that tweet.

edel79 commented 1 year ago

I use this:
rss: https://nitter.inpt.fr/Francois_Jobard/rss

edel79 commented 1 year ago

FYI, I changed the RSS source to RSSHub. It is cleaner, and at least I no longer have a problem with hashtag display. I'm still waiting to see what mentions will look like.

edel79 commented 1 year ago

I have just switched back to the API, and I think I will stay with it for as long as the API is working. RSS import is less reliable: tweets are not always sent to Mastodon. Thread management is also better with the API, where a thread on Twitter remains a thread on Mastodon. With RSS import, a thread on Twitter becomes separate toots on Mastodon.

robertoszek commented 1 year ago

Yeah, after checking the contents of the Nitter RSS, the links you mentioned for hashtags and user mentions appear exactly like that in the body of the item. Perhaps we can process them with some regular expressions to replace them with proper mentions and hashtags. Same thing for <p> and </p>: we definitely need to transform those into newlines or something.
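As a rough illustration of the regular-expression idea, here is a minimal sketch, not the bot's actual implementation. It assumes that Nitter writes mentions as `https://<host>/<username>` and hashtags as `https://<host>/search?q=%23<tag>` in the item body, and that paragraphs are wrapped in plain <p> and </p> tags. The function name `clean_nitter_body` and the `NITTER_HOST` constant are hypothetical; the host is taken from the feed URL quoted in this thread.

```python
import re

# Assumption: the Nitter instance that generated the feed.
NITTER_HOST = "nitter.inpt.fr"


def clean_nitter_body(body: str, host: str = NITTER_HOST) -> str:
    """Turn Nitter mention/hashtag URLs into @user / #tag and drop <p> tags."""
    h = re.escape(host)
    # Adjacent paragraphs become a blank line; leftover <p>/</p> are dropped.
    text = re.sub(r"</p>\s*<p>", "\n\n", body)
    text = re.sub(r"</?p>", "", text)
    # Hashtag search links (assumed URL shape) -> #tag
    text = re.sub(rf"https?://{h}/search\?q=%23(\w+)", r"#\1", text)
    # Profile links -> @user. The lookahead skips longer paths such as
    # /user/status/<id>, so links to individual statuses are left alone.
    text = re.sub(rf"https?://{h}/(\w{{1,15}})(?![\w/?])", r"@\1", text)
    return text.strip()


sample = (
    "<p>Hello https://nitter.inpt.fr/Francois_Jobard "
    "see https://nitter.inpt.fr/search?q=%23greve</p>"
    "<p>https://nitter.inpt.fr/Francois_Jobard/status/1621762492614443008#m</p>"
)
print(clean_nitter_body(sample))
```

This sketch only handles ASCII hashtags (percent-encoded characters in the tag are left untouched) and converts to plain text; turning the results into real links or mentions on the target instance would need extra handling.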

Regarding thread management, there's not much we can do on that side. RSS feeds don't include enough info to know which tweet a given tweet was replying to, so threads are simply a no-go.

I'll change the title of this issue to reflect the improvements needed for Nitter RSS feeds.

dawnerd commented 1 year ago

With the API officially dead, can we look at this again? I switched over to Nitter RSS and I am having this issue too, but my Nitter instance is behind an IP whitelist, so the links are broken.

I can of course provide some funding if that helps.