robertoszek / pleroma-bot

Bot for mirroring one or multiple Twitter accounts in Pleroma/Mastodon/Misskey.
https://robertoszek.github.io/pleroma-bot
MIT License
103 stars, 19 forks

Add RSS feeds as a source #74

Status: Closed (Ikun886-cxk closed 1 year ago)

Ikun886-cxk commented 2 years ago

Twitter API access is too difficult to apply for, so I wonder if I can migrate tweets through RSS instead. I found a project that could help: https://github.com/DIYgod/RSSHub. It would be much easier if tweets could be migrated through RSS.

robertoszek commented 2 years ago

I don't know if I'd agree that the Twitter API application process is too difficult (it was painless in my case), but I see how it would be very convenient to point the bot to an RSS feed as a source.

However, I'd argue that converting tweets to an RSS feed is out of scope for this project. As an alternative, to get an RSS feed for a specific Twitter account you could try using a nitter instance. Example feed: https://nitter.42l.fr/Twitter/rss

It seems to be capped at the last 20 entries though, and of course you would have to take into account the rate limits when hitting the nitter instance.
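
For anyone who wants to inspect such a feed before pointing a bot at it, here is a minimal sketch that reads items from an RSS 2.0 document with Python's standard library and keeps at most `limit` of them. The sample feed, its example.invalid links, and the `read_items` name are made up for illustration. Fetching the real feed (and respecting the instance's rate limits) is deliberately left out.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for a feed document; a real one would be fetched over HTTP.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example / @example</title>
    <item><title>first post</title><link>https://example.invalid/1</link></item>
    <item><title>second post</title><link>https://example.invalid/2</link></item>
    <item><title>third post</title><link>https://example.invalid/3</link></item>
  </channel>
</rss>"""


def read_items(feed_xml, limit=20):
    """Return up to `limit` (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        # findtext returns the default when the child element is missing.
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        items.append((title, link))
    # Items are kept in document order; only the first `limit` are returned.
    return items[:limit]
```

The default of 20 simply mirrors the cap observed on the nitter feed above; it is not a value taken from any project.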

I think it makes sense to add RSS feeds as a possible source for the bot but we'll see from there.

Ikun886-cxk commented 2 years ago

Thank you very much. I've found another project to use instead: https://github.com/mashirozx/tweet2toot

robertoszek commented 2 years ago

@SurpriseLon Cool, nice find! 👍

Regarding pleroma-bot, I think it still is a good idea to support RSS feeds, so we'll implement it in future releases anyhow.

nn81 commented 2 years ago

Hello, I'm also having difficulty getting a Twitter token. My account is really old, and there is some problem with older apps that prevents them from generating tokens.

It would simplify things a lot to just use RSS feeds from nitter. Thanks.

robertoszek commented 2 years ago

Hey, just wanted to point out there has been some progress on this feature.

I published an experimental release candidate version (1.1.1rc7) that implements preliminary support for RSS feeds. Feel free to test it and provide feedback if it fails in unexpected ways. You can install it by running:

pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pleroma-bot==1.1.1rc7

It's a bit rough around the edges; so far I've only tested it with nitter RSS feeds and RSSHub feeds.

A pleroma-bot config with an RSS feed user would look like this:

pleroma_base_url: https://fedi.instance
users:
- twitter_username: github
  pleroma_username: test_user
  rss: https://rsshub.app/twitter/user/github/count=100
  max_tweets: 40
  pleroma_token: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
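
To make the role of the rss key concrete, here is a rough sketch of how a per-user mapping like the one above could decide where posts come from. It is not pleroma-bot's actual code: `source_for` and `describe` are hypothetical helpers, the config is written as a plain dict instead of being parsed from YAML, and the token is a placeholder.

```python
# Mirror of the YAML above as a plain dict; the token is a placeholder.
config = {
    "pleroma_base_url": "https://fedi.instance",
    "users": [
        {
            "twitter_username": "github",
            "pleroma_username": "test_user",
            "rss": "https://rsshub.app/twitter/user/github/count=100",
            "max_tweets": 40,
            "pleroma_token": "XXXX",
        }
    ],
}


def source_for(user):
    """Hypothetical rule: use RSS when the user has a non-empty rss value."""
    # A missing key, None, or an empty string all count as "no RSS feed".
    return "rss" if user.get("rss") else "twitter"


def describe(cfg):
    """Return one (pleroma_username, source) pair per configured user."""
    return [(u["pleroma_username"], source_for(u)) for u in cfg["users"]]
```

With this rule, a user whose rss value is missing or empty would fall back to the Twitter API path, which is the one that needs a token.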

nn81 commented 2 years ago

Thanks, this is fantastic.

I've installed the mentioned version per your instructions and configured the yaml file accordingly.

There is this error coming up:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/pleroma_bot/cli.py", line 454, in main
    user = User(user_item, config, base_path)
  File "/usr/local/lib/python3.8/dist-packages/pleroma_bot/cli.py", line 204, in __init__
    self.pinned_tweet_id = self._get_pinned_tweet_id()
  File "/usr/local/lib/python3.8/dist-packages/pleroma_bot/_pin.py", line 253, in _get_pinned_tweet_id
    response.raise_for_status()
  File "/usr/local/lib/python3.8/dist-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://api.twitter.com/2/users/by/username/brito_fdx?user.fields=pinned_tweet_id&expansions=pinned_tweet_id&tweet.fields=entities

Is it still trying to access the twitter.com API for the pinned tweet?

It wouldn't really matter to pin the tweet on both platforms, just basic sync of new original tweets would suffice.

Many thanks.

robertoszek commented 2 years ago

That's strange, it shouldn't try to get the pinned tweet if the user has the rss mapping. Can you run:

pleroma-bot --version

And verify that it returns 1.1.1rc7? I can't seem to replicate the issue, and I'm finding the error message a bit confusing: on 1.1.1rc7, line 454 in cli.py shouldn't be user = User(user_item, config, base_path), but the traceback says otherwise 😅

Depending on how your system is set up, you may need to run pip3 instead of pip when installing Python packages:

pip3 install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pleroma-bot==1.1.1rc7

Oh, and make sure to activate your virtualenv if you're using one.
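
If it's unclear which interpreter and environment a given pip ends up installing into, a small check like this can help. It is only a sketch: `installed_version` and `in_virtualenv` are illustrative names, it assumes Python 3.8+ (for `importlib.metadata`), and the prefix comparison only detects environments created by venv or recent virtualenv releases.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def in_virtualenv():
    """True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtualenv active:", in_virtualenv())
    print("pleroma-bot:", installed_version("pleroma-bot"))
```

Run it with the same python3 that launches the bot; if the reported version differs from what pip just installed, the package most likely went into a different environment.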

nn81 commented 2 years ago

Thanks for the quick reply and testing.

Indeed, I was running a previous version. I've run the pip3 command and now it shows the expected version. My apologies, I don't interact much with Python and virtualenv is new to me.

I've activated the virtualenv in the folder, but I still get an error message:

ℹ 2022-09-30 22:49:22,545 - pleroma_bot - INFO - config path: /home/brito/pleroma/config.yml 
ℹ 2022-09-30 22:49:22,545 - pleroma_bot - INFO - tweets temp folder: /home/brito/pleroma/tweets 
ℹ 2022-09-30 22:49:22,546 - pleroma_bot - INFO - ====================================== 
ℹ 2022-09-30 22:49:22,546 - pleroma_bot - INFO - Processing user:       Brito 
✖ 2022-09-30 22:49:23,085 - pleroma_bot - ERROR - Exception occurred for user, skipping... (cli.py:592) 
Traceback (most recent call last):
  File "/home/brito/pleroma/pleroma/lib/python3.8/site-packages/pleroma_bot/cli.py", line 460, in main
    user = User(user_item, config, base_path)
  File "/home/brito/pleroma/pleroma/lib/python3.8/site-packages/pleroma_bot/cli.py", line 228, in __init__
    self._get_twitter_info()
  File "/home/brito/pleroma/pleroma/lib/python3.8/site-packages/pleroma_bot/_twitter.py", line 95, in _get_twitter_info
    response.raise_for_status()
  File "/home/brito/pleroma/pleroma/lib/python3.8/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://api.twitter.com/2/users/by/username/brito_fdx?user.fields=created_at%2Cdescription%2Centities%2Cid%2Clocation%2Cname%2Cpinned_tweet_id%2Cprofile_image_url%2Cprotected%2Curl%2Cusername%2Cverified%2Cwithheld&expansions=pinned_tweet_id&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Cpossibly_sensitive%2Creferenced_tweets%2Csource%2Ctext%2Cwithheld

robertoszek commented 2 years ago

No worries! That error message makes me think you're now on 1.1.1rc7 👍 It looks to me like, for some reason, the rss mapping is evaluating to False or None.

Would you mind sharing your config YAML file? (removing any sensitive info like tokens and so on) I think that may be the culprit here.

nn81 commented 2 years ago

Sure. Thanks for looking into this.

Here it is:

pleroma_base_url: https://pleroma.pt
users:
- twitter_username: brito_fdx
  pleroma_username: brito
  rss: https://rsshub.app/twitter/user/brito_fdx/count=100
  max_tweets: 40
  pleroma_token: blablabla

robertoszek commented 2 years ago

Interesting, I've had no luck replicating the issue with that config so far. Could you check if there are any residual folders from previous runs (called users or tweets) and remove them before trying again?
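
A sketch of that cleanup in Python, assuming the leftover users and tweets folders sit next to the config file (as the "tweets temp folder" log line above suggests). `clear_residual` is an illustrative helper, not part of pleroma-bot, and it deletes data, so double-check the path before running it.

```python
import shutil
from pathlib import Path


def clear_residual(base_path, names=("users", "tweets")):
    """Delete the named sub-folders of base_path if present.

    Returns the list of folder names that were actually removed.
    """
    removed = []
    for name in names:
        target = Path(base_path) / name
        # Only directories are touched; anything else is left alone.
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

For the setup in this thread the call would be something like clear_residual("/home/brito/pleroma"), the folder that holds config.yml in the log output.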

And just to confirm, are you passing the path of the config file to the bot as an argument?

pleroma-bot --config <path to config YAML file>

nn81 commented 2 years ago

No dice.

Reading your feedback, I noticed there might be something wrong with the machine itself, so I tried installing and running the bot on my laptop.

There I can confirm it is working. I guess something is already messed up in the Python environment on the first machine.

robertoszek commented 2 years ago

Oh, I see! The plot thickens 😅 Glad you got it working on your laptop, but I'm curious what's wrong with your first machine. Any luck with creating a new virtual environment from scratch on there?

python3 -m venv new
source new/bin/activate
pip3 install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pleroma-bot==1.1.1rc7
pleroma-bot --config /path/to/config.yml

nn81 commented 2 years ago

Magic!

Now it works well on the Raspberry Pi too. I've also deleted the folder completely; very likely something in there was causing the confusion.

This bot is fantastic. Thank you for everything kind sir.

robertoszek commented 2 years ago

Awesome! Good to hear we got there in the end 💪

Now that you got it working, let me know if you run into anything weird or any unintended behavior when posting from RSS. It's still highly experimental after all; I gotta clean up the code and write coverage tests for it before even considering merging this feature into the stable branch. Cheers!

nn81 commented 2 years ago

I've looked at the posts created from RSS on pleroma, and some of them were cut off without the full text. The culprit was https://rsshub.app, the provider recommended in the default configuration. If you look at the tweets section, some of the messages are cut off. You can identify them by the "..." at the end of the message.

So it wasn't an issue in pleroma-bot, but a matter of changing the RSS provider. I've changed this in the default configuration to nitter. For example: https://nitter.net/github/rss

The other part was that retweets and replies were posted by default. They just looked out of context when published on pleroma, but this was relatively fast to disable (good documentation btw). That was the only odd thing, but no biggie.

robertoszek commented 2 years ago

Thanks for the feedback! Ahh, I see where it may have gone wrong there. Were the cut-off posts originally retweets and replies by any chance? Or were they just normal tweets?

In some cases the RSS providers only included the "RT @whoever" or "Re to @someone" text in the title itself and not in the summary body, for some reason. So I had some logic in there to handle those cases, but it looks like I didn't account for title truncation 😅

nn81 commented 2 years ago

I was checking and normal tweets were also cut, unfortunately.

robertoszek commented 2 years ago

Gotcha, I made some changes and published 1.1.1rc8. Give it a try if you have the chance sometime and report back if you encounter any truncated posts again so I can investigate further:

pip3 install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pleroma-bot==1.1.1rc8

Cheers!

nn81 commented 2 years ago

Working well, thanks.

Feels great keeping pleroma alive while also posting content on twitter.
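
The truncation discussed above (entries ending in "...", with the title and summary sometimes carrying different amounts of text) can be illustrated with a small heuristic. This is a sketch under stated assumptions, not the fix that went into 1.1.1rc8: `looks_truncated` and `pick_text` are hypothetical names, and a trailing ellipsis is only a hint, since a tweet can legitimately end with one.

```python
# Both the three-dot form and the single ellipsis character are checked.
ELLIPSES = ("...", "\u2026")


def looks_truncated(text):
    """True if the text ends with an ellipsis, a hint that it was cut off."""
    return text.rstrip().endswith(ELLIPSES)


def pick_text(title, summary):
    """Hypothetical rule: prefer whichever field does not look cut off."""
    if summary and not looks_truncated(summary):
        return summary
    if title and not looks_truncated(title):
        return title
    # Both look cut off (or are empty): keep whichever carries more text.
    return max(title, summary, key=len)
```

For a feed entry whose title was shortened but whose summary is complete, pick_text returns the summary; when both fields end in an ellipsis, the longer one is kept as the best available text.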