Feature Description
Add a function to dump all data stored in a CDP Firestore database to a local SQLite file.

Use Case
A lot more people know how to use SQL than our weird combination of Firestore + ORM in Python. And for tabular data (voting, legislation, people info, etc.), SQL is likely the best choice for quick and easy processing. There are also visualization tools that can read SQLite files directly.
Solution
Add a function to the library (a prototype is fine for now) that takes the name of the CDP instance the user wants to export and the filepath / filename for where to dump the SQLite file, something like:
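A possible shape for the function (every name here is hypothetical, just to illustrate the parameters, not a committed API):

```python
from pathlib import Path
from typing import Union


def export_to_sqlite(
    infrastructure_slug: str,           # hypothetical param: which CDP instance to export
    dest: Union[str, Path] = "cdp.db",  # filepath / filename for the dumped SQLite file
    batch_size: int = 500,              # rows fetched from Firestore / written to SQLite per batch
) -> Path:
    """Dump every collection in the instance's Firestore database to a local SQLite file."""
    ...  # iterate collections, fetch in batches, write in batches
    return Path(dest)
```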
The function should iterate through each collection, request data in batches from Firestore, and write it in batches to the SQLite file.
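A sketch of that batch loop using only the stdlib. The generator in the usage example is a stand-in for a real FireO fetch, and the table/column handling assumes all documents from one model share the same fields (which should hold since they come from the same model):

```python
import sqlite3
from itertools import chain, islice


def batched(iterable, n):
    """Yield lists of up to n items from any iterable."""
    it = iter(iterable)
    while chunk := list(islice(it, n)):
        yield chunk


def dump_collection(conn, table, rows, batch_size=500):
    """Write an iterable of row dicts to a SQLite table in batches.

    Returns the number of rows written. Assumes every row dict
    has the same keys as the first one.
    """
    rows = iter(rows)
    first = next(rows, None)
    if first is None:
        return 0
    cols = list(first)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({", ".join(cols)})')
    sql = f'INSERT INTO "{table}" VALUES ({", ".join("?" for _ in cols)})'
    written = 0
    for chunk in batched(chain([first], rows), batch_size):
        conn.executemany(sql, [tuple(r[c] for c in cols) for r in chunk])
        conn.commit()  # one commit per batch keeps memory use flat
        written += len(chunk)
    return written


# Usage with fake data standing in for a FireO collection fetch
conn = sqlite3.connect(":memory:")  # a file path for a real dump
fake_rows = ({"id": i, "name": f"person-{i}"} for i in range(1200))
dump_collection(conn, "person", fake_rows, batch_size=500)
```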
Notes
I assume the database models themselves should stay the same: schema-diagram & model-docs
Since we use FireO for our "Firestore ORM", their docs on querying data (including batched queries) are likely important: https://octabyte.io/FireO/querying-data. An example of using the FireO models can be seen in this notebook or in our source code.
I say just use the sqlite3 library that ships with Python?
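For reference, the stdlib sqlite3 batch-write pattern is tiny (table and values below are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path for a real dump
conn.execute("CREATE TABLE vote (matter TEXT, decision TEXT)")
# executemany writes a whole batch of rows in one call
conn.executemany(
    "INSERT INTO vote VALUES (?, ?)",
    [("CB 120001", "Approve"), ("CB 120002", "Reject")],
)
conn.commit()
count = conn.execute("SELECT COUNT(*) FROM vote").fetchone()  # → (2,)
```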