aigents / aigents-java

Aigents Java Core Platform
MIT License
30 stars 12 forks

RSS support #5

Closed by akolonin 4 years ago

akolonin commented 4 years ago

We need to support RSS channels, the same way it is already done for Reddit subreddits and user activity logs and will be done for Twitter (#4).

For the entry point, you will need a new class, RSSeer. See:
https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/self/Siter.java#L295
https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/comm/reddit/Reddit.java#L99
https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/comm/reddit/Reddit.java#L169

For file reading and content type checking, look at lines 118-123 of https://github.com/aigents/aigents-java/blob/master/src/main/java/net/webstructor/self/Cacher.java#L118

A) Call reader.allowedForRobots(path).
B) If allowed, use reader.canReadDocContext(path,context) or reader.readDocData(path," ",context), or something like that, to:
1) check whether the file is either RSS or Atom, AND if so
2) process the RSS/Atom items one by one.
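Step B above can be sketched roughly as follows. This is only an illustration using the JDK's built-in DOM parser, not the project's actual code: the class name `FeedSniffer` and its methods are hypothetical, and it assumes the document text has already been fetched as a string (for example by one of the reader calls mentioned above, which are not used here). It decides RSS vs. Atom by the root element and then walks the items one by one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of aigents-java.
class FeedSniffer {
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true); // Atom elements live in a namespace
        f.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true); // apply the parser's secure-processing limits
        return f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    static String localName(Node n) {
        return n.getLocalName() != null ? n.getLocalName() : n.getNodeName();
    }

    // Returns "rss", "atom", or null when the root element is neither.
    static String detect(Document doc) {
        Element root = doc.getDocumentElement();
        if ("rss".equals(localName(root))) return "rss";
        if ("feed".equals(localName(root)) && ATOM_NS.equals(root.getNamespaceURI())) return "atom";
        return null;
    }

    // First direct child element with the given local name, or null.
    static Element child(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling())
            if (n instanceof Element && name.equals(localName(n))) return (Element) n;
        return null;
    }

    // Returns {title, link} pairs; empty when the text is not XML or not a feed.
    static List<String[]> items(String xml) {
        List<String[]> out = new ArrayList<>();
        try {
            Document doc = parse(xml);
            String type = detect(doc);
            if (type == null) return out;
            boolean rss = "rss".equals(type);
            NodeList nodes = rss ? doc.getElementsByTagName("item")
                                 : doc.getElementsByTagNameNS(ATOM_NS, "entry");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                Element title = child(e, "title");
                Element link = child(e, "link"); // Atom may carry several links; this takes the first
                String url = link == null ? ""
                        : rss ? link.getTextContent().trim() // RSS: URL is element text
                              : link.getAttribute("href");    // Atom: URL is the href attribute
                out.add(new String[] { title == null ? "" : title.getTextContent().trim(), url });
            }
        } catch (Exception e) {
            // Not parseable as XML: treat as "not a feed" in this sketch.
        }
        return out;
    }

    public static void main(String[] args) {
        String atom = "<feed xmlns=\"http://www.w3.org/2005/Atom\"><title>Example Feed</title>"
                + "<entry><title>Atom-Powered Robots Run Amok</title>"
                + "<link href=\"http://example.org/2003/12/13/atom03\"/></entry></feed>";
        for (String[] it : items(atom)) System.out.println(it[0] + " -> " + it[1]);
    }
}
```

The main difference between the two formats that matters here is where the link lives: RSS puts the URL in the text of `<link>`, while Atom puts it in the `href` attribute.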

Support both RSS and Atom:
https://sawv.org/2019/11/12/rss-vs-atom-vs-json-feed-vs-hfeed-vs-whatever.html
https://www.saksoft.com/rss-vs-atom/
https://problogger.com/rss-vs-atom-whats-the-big-deal/

RSS Feed Example: https://www.feedforall.com/sample.xml


<?xml version="1.0" encoding="windows-1252"?>
<rss version="2.0">
  <channel>
    <title>FeedForAll Sample Feed</title>
    <description>RSS is a fascinating technology. The uses for RSS are expanding daily. Take a closer look at how various industries are using the benefits of RSS in their businesses.</description>
    <link>http://www.feedforall.com/industry-solutions.htm</link>
    <category domain="www.dmoz.com">Computers/Software/Internet/Site Management/Content Management</category>
    <copyright>Copyright 2004 NotePage, Inc.</copyright>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <language>en-us</language>
    <lastBuildDate>Tue, 19 Oct 2004 13:39:14 -0400</lastBuildDate>
    <managingEditor>marketing@feedforall.com</managingEditor>
    <pubDate>Tue, 19 Oct 2004 13:38:55 -0400</pubDate>
    <webMaster>webmaster@feedforall.com</webMaster>
    <generator>FeedForAll Beta1 (0.0.1.8)</generator>
    <image>
      <url>http://www.feedforall.com/ffalogo48x48.gif</url>
      <title>FeedForAll Sample Feed</title>
      <link>http://www.feedforall.com/industry-solutions.htm</link>
      <description>FeedForAll Sample Feed</description>
      <width>48</width>
      <height>48</height>
    </image>
    <item>

Atom Feed Example: https://validator.w3.org/feed/docs/atom.html

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title>Example Feed</title>
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author>
    <name>John Doe</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>

  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>

</feed>

Use XML parsing with XPath: https://www.viralpatel.net/java-xml-xpath-tutorial-parse-xml/
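As a rough sketch of the XPath approach, the snippet below pulls item titles out of either format with a single expression, using only the JDK's `javax.xml.xpath` API. The class name `FeedXPath` and the method `itemTitles` are made up for illustration and are not part of the codebase. It assumes the feed text is already in memory as a string. The Atom branch matches on `local-name()` so that no namespace context has to be registered.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of aigents-java.
class FeedXPath {
    // Union of the RSS path and a namespace-agnostic Atom path.
    static final String TITLES =
            "/rss/channel/item/title"
            + " | /*[local-name()='feed']/*[local-name()='entry']/*[local-name()='title']";

    // Returns item/entry titles in document order; empty if the text is not a feed.
    static List<String> itemTitles(String xml) {
        List<String> out = new ArrayList<>();
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Document doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xp.evaluate(TITLES, doc, XPathConstants.NODESET);
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent().trim());
            }
        } catch (Exception e) {
            // Malformed XML or evaluation failure: report no titles in this sketch.
        }
        return out;
    }

    public static void main(String[] args) {
        String rss = "<rss version=\"2.0\"><channel><title>FeedForAll Sample Feed</title>"
                + "<item><title>Sample item</title></item></channel></rss>";
        System.out.println(itemTitles(rss));
    }
}
```

Note that the channel-level and feed-level `<title>` elements are skipped on purpose, since the paths only descend through `item` or `entry`.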

RSS test feeds:
http://feeds.reuters.com/reuters/businessNews
http://feeds.reuters.com/reuters/technologyNews
http://feeds.reuters.com/reuters/politicsNews
http://feeds.reuters.com/news/wealth
https://blog.feedspot.com/bitcoin_rss_feeds/
https://blog.feedspot.com/reuters_rss_feeds/
https://gist.github.com/hamzamu/5c2fa2907ec507f4aba3ba6fcce2d21b