DistrictDataLabs / baleen

An automated ingestion service for blogs to construct a corpus for NLP research.
MIT License

Formalize Mongo Schema #75

Open will2041 opened 7 years ago

will2041 commented 7 years ago

Taking some lessons from Steven Lott's PyData presentation: http://pydata.org/dc2016/schedule/presentation/40/

https://twitter.com/s_lott

https://slott56.github.io/no-sql-doesnt-mean-no-schema/assets/player/KeynoteDHTMLPlayer.html#0

We can formalize the Mongo schemas by defining them in JSON and relying on JSON validation, so that we never even instantiate an object that doesn't match our defined schema.
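A rough sketch of what "validate before instantiating" could look like. Everything here is an assumption for illustration: the `POST_SCHEMA` fields, the `Post` class, and the `validate` helper are hypothetical and are not taken from baleen's models. The hand-rolled checker only covers required keys and a few simple types using the standard library; a real JSON Schema validator would do considerably more.

```python
import json

# Hypothetical schema for a "post" document; field names are illustrative only.
POST_SCHEMA = json.loads("""
{
  "required": ["title", "url"],
  "properties": {
    "title": {"type": "string"},
    "url": {"type": "string"},
    "tags": {"type": "array"}
  }
}
""")

# Map the JSON type names used above to Python types.
JSON_TYPES = {"string": str, "array": list, "object": dict}


class SchemaError(ValueError):
    """Raised when a document does not match its schema."""


def validate(doc, schema):
    """Check required keys and simple types; raise SchemaError on a mismatch."""
    for key in schema.get("required", []):
        if key not in doc:
            raise SchemaError(f"missing required field: {key}")
    for key, rule in schema.get("properties", {}).items():
        if key in doc and not isinstance(doc[key], JSON_TYPES[rule["type"]]):
            raise SchemaError(f"field {key!r} should be of type {rule['type']}")


class Post:
    """Hypothetical model; validation runs before any state is set."""

    def __init__(self, **fields):
        validate(fields, POST_SCHEMA)  # raises before the object is usable
        self.fields = fields
```

With this in place, `Post(title="Hello", url="http://example.com")` succeeds, while `Post(title="Hello")` raises `SchemaError` because `url` is missing, so a non-conforming object never gets constructed.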