ropensci / unconf14

Repo to brainstorm ideas (unconference style) for the rOpenSci hackathon.

Text mining #4

Open sckott opened 10 years ago

sckott commented 10 years ago

We have a number of packages that provide an interface to either metadata or full text of academic articles. And there are other packages outside of ropensci to leverage.

However, we haven't done much work on these packages. They

sckott commented 10 years ago

Seems like there's no interest here. Speak up if you are interested.

cpsievert commented 10 years ago

I'm interested (especially in the full text of academic articles). I've been working on a tool to visualize and help interpret output from a topic model fit via Latent Dirichlet Allocation.

I think I could contribute to the "need use cases to demonstrate what can be done with them" point with something similar to what I did for xkcd. The end result could help us understand what topics are being discussed as well as trends over time. Since this is a hot topic (no pun intended :), it might be cool to make some related rMaps (if we can link geographical data to the articles).

sckott commented 10 years ago

@cpsievert Cool! Yeah, use cases would be great, and during that process surely you'll discover bugs/come up with things we can do better. Maybe a guest blog post on our blog too? What we have:

cpsievert commented 10 years ago

Great! @karthik also mentioned he would like to see work done on arXiv, but fitting a topic model to eLife articles might be more relevant for a blog post, right?

sckott commented 10 years ago

I don't know. Thoughts, @karthik?

karthik commented 10 years ago

> I think I could contribute to the "need use cases to demonstrate what can be done with them" point with something similar to what I did for xkcd. The end result could help us understand what topics are being discussed as well as trends over time. Since this is a hot topic (no pun intended :), it might be cool to make some related rMaps (if we can link geographical data to the articles).

Sounds great. Working on arXiv would be great, time permitting. But if you want to work on the eLife thing, that's great too.

benmarwick commented 10 years ago

@sckott @cpsievert I have a package for working with JSTOR's Data for Research service here: https://github.com/benmarwick/JSTORr

Ironholds commented 10 years ago

Could be fun.

szeitlin commented 10 years ago

Sounds useful. And kind of similar to some things I've used a lot, e.g. iHOP? http://www.ihop-net.org/UniPub/iHOP/