astrofrog / robo-ph

#dotastro hack

Ability to choose subtopics #30

Open jason-neal opened 9 years ago

jason-neal commented 9 years ago

It would be great to be able to choose which subtopics to include in (or exclude from) astro-ph when generating personal streams. For example, only articles from [astro-ph.EP] and [astro-ph.SR], or all except [astro-ph.IM].

One step further would be to include only articles that contain specific keywords, e.g. Radial Velocity, asteroseismology or Black Holes. The stream could then be personally tuned to suit the research interests of the listener.

This is a really neat project and I have enjoyed listening so far. Jason
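A rough sketch of what such a keyword filter might look like, assuming the title and abstract of each article are available as plain text (the function name `matches_keywords` is made up for illustration and is not part of robo-ph):

```python
def matches_keywords(text, keywords):
    """Return True if any keyword appears in text, ignoring case."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


# Example keyword list a listener might configure.
keywords = ["radial velocity", "asteroseismology", "black hole"]
```

Plain substring matching like this also catches plurals ("black holes") but would miss spelling variants, so it is only a starting point.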

jegpeek commented 9 years ago

Is this easy? I think it's certainly possible to add this to the repo at some level, by parsing the XML to choose astro-ph.XX or to look for keywords. More fancily, one could try to hook up to some cleverer learning tool, or directly to your voxcharta account or something. But AFAICT, there isn't an obvious way to do this without setting up an entirely different podcast for each person! @jason-neal, would you be interested in trying to follow the instructions on this site and set up your own version of robo-ph, with your own URL for the RSS feed of the podcast? That would give us a sense of how hard it is to have a DIY robo-ph. Then we could think about customizing it.

jason-neal commented 9 years ago

@jegpeek By instructions, do you just mean what is in the Readme? At this stage I am unable to run roboph because I am on Linux, not OS X. I could try to implement it with pyttsx, which is a platform-independent TTS library, but that will take some time. Any suggestions, or am I missing something?

As for selecting different topics, it should be fairly straightforward, I think. I see two possible basic implementations:

  1. Change the ARXIV_URL by appending a subtopic suffix such as .EP or .IM when only a single subtopic is required; for several subtopics, loop over multiple URLs to get the articles from each.
  2. Use the article.subject property (which I don't think is currently obtained) to include or exclude each article based on its subject value, e.g.

    # inside the existing loop over articles
    allowed_topics = ["EP", "IM"]
    if article.subject in allowed_topics:
        ...  # code to do audio file
    else:
        continue

Some issues would be dealing with double-ups when the same article is obtained from multiple streams, and capturing all the subject info for articles that are in more than one topic, e.g. Stars and Planets.
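A minimal sketch of how both options could cope with these issues, assuming each article has already been parsed into a dict with an `id` and a list of `subjects`. Those field names, `BASE_URL` and the three helper functions are illustrative assumptions, not taken from the robo-ph code:

```python
# Hypothetical base URL; the real ARXIV_URL in robo-ph may differ.
BASE_URL = "http://export.arxiv.org/rss/astro-ph"


def feed_urls(subtopics):
    """Option 1: build one feed URL per requested subtopic, e.g. 'EP'."""
    return [f"{BASE_URL}.{topic}" for topic in subtopics]


def merge_unique(streams):
    """Merge articles from several feeds, dropping double-ups by id."""
    seen = set()
    merged = []
    for stream in streams:
        for article in stream:
            if article["id"] not in seen:
                seen.add(article["id"])
                merged.append(article)
    return merged


def filter_by_subject(articles, include=None, exclude=()):
    """Option 2: keep an article if it shares a subject with `include`
    (when given) and has no subject listed in `exclude`."""
    kept = []
    for article in articles:
        subjects = set(article["subjects"])
        if include is not None and not subjects & set(include):
            continue
        if subjects & set(exclude):
            continue
        kept.append(article)
    return kept
```

Storing the subjects as a list means an article cross-listed in both EP and SR is kept when either one is requested, and `merge_unique` ensures it is only read out once.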