nightscout / cgm-remote-monitor

nightscout web monitor
GNU Affero General Public License v3.0
2.39k stars 71.63k forks source link

feature: support for custom storage drivers and SQLite storage implementation #7649

Open k2s opened 1 year ago

k2s commented 1 year ago

I created a prototype of Nightscout using SQLite as storage. SQLite supports JSON documents and also indexes on its properties, so it is not difficult to emulate MongoDB functionality as it is currently used in Nightscout.

I propose to add loader of custom storage into Nightscout with this commit. I am not creating pull request before some discussion on this topic.

I created mono repository to develop nightscout-storage-sqlite module. It is activated by setting STORAGE_CLASS=@nightscout-storage-sqlite.

I also developed CLI tool to import MongoDB dump into SQLite database and Docker image to test that it works. I am personally running it for some time now with imported 2 years of data .

More details and my open questions (mostly about proper way to provide configuration options) are in https://github.com/k2s/nightscout/.

bewest commented 1 year ago

Very creative! There are a few observations to build on your ideas here. JSON support has gotten much better across SQL engines over the last 8 years, so this is very exciting. Historically, Nightscout has found success in what may now be considered to be an excess in permissivity. Which data types are available in which channel, and which properties or fields are available on each data type? The permissivity that comes intrinsic with MongoDB's storage model allows any application interoperating with Nightscout to determine their own idiomatic data model that may or may not match the other applications also using Nightscout. Most applications find a need to implement both AndroidAPS/openaps and Loop-style data models. There have been prior efforts to standardize the data model, although it only went so far as to cover CGM, not all possible treatments. Currently, Nightscout uses idiomatic mongodb, which prevents interoperable usage with CosmosDB. Documenting and standardizing the data that comes through Nightscout will allow us to decouple Nightscouts behavior from a single data storage engine and open up many more deployment options. In particular, I would be very excited about something using https://knexjs.org/ to potentially gain sql compatibility with several sql engines.

I see you've taken an approach to simplify the problem quite a bit more. knex supports a json data type and many sql engines support that as well. I appreciate the way you've identified the storage engine code. Is there a hybrid path that a.) simplifies work to be done while maintaining velocity, b.) opens up deployment scenarios c.) helps migrate towards stable data schema. I'm supportive of this kind of work, and working on compatible tangential ideas.

One consideration is with data storage, there has been some desire to use "capped collections" feature of Mongodb to implement a ring buffer that can run on a designated quota. Implementing a lossy data store is undesirable in some ways, except that some people will be able to run it at lower cost. With idiomatic Mongodb, this may be easier than with sql storage, or it might not make a difference. Just something in the current milieu to be aware of. From my conversations with @sulkaharo, I believe we are both supportive of making the data model stricter to better define interoperability as well as enabling a wider variety of high quality deployments.

k2s commented 1 year ago

I did some work on https://github.com/k2s/nightscout/:

All the code is still experimental.

Booth, sqlite and knex drivers, could be extended with json schema that would define table structure in relational DB and still store any nonstandard data in JSON field.

For cleaning up structure of data, NS could define JSON schema to validate incomming data on the level of REST API and still keep the idea of document or time series database, instead of relational.

Benefit of Knex driver seams to me questionable. I think that writing the own drivers for database engines would use a lot of shared code and could be more efficient than Knex. At this moment, it seems that Knex is not providing universal way how to use JSON datatype in a way we need (I could be wrong).

The proxy driver is a good tool to see how NS uses DB (all operations are logged). My impression is that it uses it too much and that most of the calls could be cached if NS is the only tool using the DB directly and most requests are related to data from short time period that doesn't occupy too much RAM. Cache should fall back to DB storage only for reports.

@bewest Thank you for commenting on my proposal. Please, what do you mean with Nightscout uses idiomatic mongodb, which prevents interoperable usage with CosmosDB. I started also driver for native support to CosmosDB, but then I stopped because it seems there is compatibility mode with MongoDB. There are also some tickets where successful usage of CosmosDB is mentioned.

Lowering the amount of access to DB engine and also providing single Docker image supporting file database could simplify deployment and lower the cost of hosting.

Using JSON to express filter and carry data as MongoDB defines is a good abstraction for communication with any kind of database. In my current understanding of NS operation, the challenge is to handle duplicate data from different sources, which I believe is solvable.

bewest commented 1 year ago

Thanks for all the notes, @k2s! Very impressive work, thank you!

Excellent notes on performance. There are a variety of reasons for what you observe, but I think @sulkaharo and I largely agree when it comes to the big picture. Sulka has also expressed desire for a db engine that can facilitate the needs better, presumably that would include keeping a cache. I hope he'll weigh in here because some of it sounds similar. For what Sulka has planned, we might opt for a fork instead of feature enhancements, for a variety of reasons. Any naming suggestions? We will be strongly encouraging functional programming style in a bid to reduce some of the unintended complexity among plugins.

Please think about and let me know what additional support might be useful, T1Pal may be able to sponsor workshops for developers along with other budget for this work. Are you located in Europe?