Do we need MongoDB?

Should we be using MongoDB at the end of our pipeline? @tcrick has some ideas about using it to store pipeline runs. I can definitely see the value in this. Parquet is awesome, but the whole "experiment" / analysis setup I've got in place right now with the curry package is probably going to be inefficient when scaled up to the full Reddit dataset. @tcrick, can you discuss this with Romina while I'm on leave? We can chat about whether, and if so where, to use MongoDB.

mclevey / podlm

Do we need MongoDB? #31