Irys-xyz / ARchivers


Archive Weibo content #1

Open fewwwww opened 2 years ago

fewwwww commented 2 years ago

As for archiving Weibo, it looks like Weibo is not as open as Twitter and only offers a small number of APIs. I assume it will be hard to use these APIs or to build a web crawler.

I think it would be a good idea to archive the Twitter accounts that manually repost content from Weibo and WeChat, such as @weibo_read and @TGTM_Official.

Their content may be somewhat "biased", since they manually select posts before reposting them. But archiving them should be very easy: it just needs a change to the config file for the Twitter archiver, adding their user IDs so that everything they post gets archived.
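To illustrate the config change, here is a minimal sketch. It assumes the Twitter archiver reads a config object with a list of user IDs to track. The key name `userIDs`, the helper `add_tracked_users`, and the numeric IDs are all hypothetical placeholders, not taken from the repo, so the real config layout may differ.

```python
import json


def add_tracked_users(config: dict, user_ids: list[str], key: str = "userIDs") -> dict:
    """Return a copy of `config` with `user_ids` appended under `key`, skipping duplicates.

    `key` is an assumed field name; adjust it to whatever the archiver's config uses.
    """
    updated = dict(config)                  # shallow copy, so the input dict is not modified
    existing = list(updated.get(key, []))   # copy the list as well
    for uid in user_ids:
        if uid not in existing:
            existing.append(uid)
    updated[key] = existing
    return updated


# Placeholder IDs only: the real numeric IDs of the repost accounts
# would have to be looked up on Twitter first.
config = {"userIDs": ["1001"]}
new_config = add_tracked_users(config, ["2002", "3003", "1001"])
print(json.dumps(new_config, indent=2))
```

The helper leaves the original config untouched and ignores IDs that are already tracked, so it can be re-run safely when more repost accounts are found.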

JesseTheRobot commented 2 years ago

I ran into similar API issues too (not to mention the effort required to create and verify an account, which took a lot longer than it should've). I've had a look at the accounts you mentioned and others like them, and I agree they're a good fit and will probably be integrated into the currently running archiver. I don't have a say in the bounty itself, though, so I can't comment on that specifically. That being said, I will still try to build the Weibo integration; it just probably won't be as effective as the Twitter one.