spgroup / groundhog

A framework for crawling GitHub projects and raw data and extracting metrics from them
http://spgroup.github.io/groundhog
GNU General Public License v2.0

Implement persistence functionality with MongoDB #63

rodrigoalvesvieira closed this issue 10 years ago

rodrigoalvesvieira commented 11 years ago

Groundhog currently suffers from memory loss... oh, no kidding. The framework/library does not yet support database persistence, which means that unless we write data to a file every single time we use the tool, we lose all the data we fetch.

This is bad because little may have changed between one search and the next, so performing another full search from scratch is often unnecessary, especially given our limitations with the GitHub API.

Since our data is not expected to follow a rigid format, we will use MongoDB as our data store. The schemaless nature of this database system seems to suit our needs well.
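
To make the idea concrete, here is a rough sketch of the lookup-before-fetch flow described above. It is not Groundhog code: `ProjectStore`, `InMemoryProjectStore`, and `CachingProjectFetcher` are hypothetical names, and the in-memory map only stands in for MongoDB so the example runs without a database. Documents are plain `Map<String, Object>` values to mirror the loose, schemaless shape we expect.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

class PersistenceSketch {

    // Hypothetical storage abstraction; a MongoDB-backed class would implement this.
    interface ProjectStore {
        Optional<Map<String, Object>> find(String fullName);
        void save(String fullName, Map<String, Object> document);
    }

    // In-memory stand-in (assumption: used here only so the sketch is runnable).
    static class InMemoryProjectStore implements ProjectStore {
        private final Map<String, Map<String, Object>> documents = new HashMap<>();

        @Override
        public Optional<Map<String, Object>> find(String fullName) {
            return Optional.ofNullable(documents.get(fullName));
        }

        @Override
        public void save(String fullName, Map<String, Object> document) {
            // Store a copy so later changes by the caller do not alter stored data.
            documents.put(fullName, new HashMap<>(document));
        }
    }

    // Checks the store first and only calls the remote fetcher on a miss.
    static class CachingProjectFetcher {
        private final ProjectStore store;
        private final Function<String, Map<String, Object>> remoteFetcher;
        private int remoteCalls = 0;

        CachingProjectFetcher(ProjectStore store,
                              Function<String, Map<String, Object>> remoteFetcher) {
            this.store = store;
            this.remoteFetcher = remoteFetcher;
        }

        Map<String, Object> fetch(String fullName) {
            Optional<Map<String, Object>> cached = store.find(fullName);
            if (cached.isPresent()) {
                return cached.get();
            }
            remoteCalls++;
            Map<String, Object> fresh = remoteFetcher.apply(fullName);
            store.save(fullName, fresh);
            return fresh;
        }

        int getRemoteCalls() {
            return remoteCalls;
        }
    }

    public static void main(String[] args) {
        ProjectStore store = new InMemoryProjectStore();
        // The lambda is a placeholder for a real GitHub API call.
        CachingProjectFetcher fetcher = new CachingProjectFetcher(store, name -> {
            Map<String, Object> doc = new HashMap<>();
            doc.put("name", name);
            doc.put("language", "Java");
            return doc;
        });

        fetcher.fetch("spgroup/groundhog");
        fetcher.fetch("spgroup/groundhog"); // served from the store, no remote call
        System.out.println("remote calls: " + fetcher.getRemoteCalls());
    }
}
```

The sketch never refreshes a stored document, so a real version would need some staleness rule (for example a timestamp field) to decide when a new search is worth the API cost.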

rodrigoalvesvieira commented 11 years ago

Found a great library to use! http://jongo.org/#jongo

rodrigoalvesvieira commented 10 years ago

The Jongo library will be replaced by another one, Morphia, which turns out to be more reliable (it is maintained by the official MongoDB organization) and easier to use.

rodrigoalvesvieira commented 10 years ago

This was great! Great job everyone!