atom / telemetry

sends usage metrics to GitHub's internal analytics pipeline
MIT License
11 stars 12 forks source link

Change the storage system to maintain metrics between sessions #26

Closed rafeca closed 5 years ago

rafeca commented 5 years ago

From https://github.com/atom/telemetry/pull/25, we need to move away from lokijs and change the storage for events not yet sent on the telemetry package:

After looking at this I just got curious about lokijs and how it implemented the storage of events and from what I saw by default lokijs is an in-memory database and does not persist the data by default (so each window currently has its own list of events).

This means that we're currently losing all the events that haven't been sent once the Atom window gets closed, and since we're only sending events once Atom has been opened for 4h, we're losing any event from sessions shorter than 4h.

Using a persistance adapter on LokiJS (as mentioned in the docs) would fix this specific problem, but then we would introduce the concurrency issues when having multiple windows open, since lokijs only syncs the data from the persistance adapter on load:

An important distinction between an in-memory database like lokijs and traditional database systems is that all documents/records are kept in memory and are not loaded as needed. Persistence is therefore only for saving and restoring the state of this in-memory database.

So I see two potential paths forward (there may be more options):

  • Use a global db instance for all the windows but move away from lokijs and use a system that ensures that multiple concurrent clients are handled correctly (even by accessing IndexedDB directly). This could cause some performance regressions if we have many events, since for every read/write we would need to access the storage system (and serialize/deserialize the data).
  • Keep using lokijs as an in-memory DB (so each session has its own instance) but persist the data that has not been sent when Atom is closed. Then re-read the unsent data (and delete it from the persistant store immediately) the next time an Atom window gets opened.

The are two potential solutions:

  1. Use dexie as an IndexedDB adapter to store the events between sessions.
  2. Use the IndexedDB APIs directly (this would be more performant but slightly harder to implement).
rafeca commented 5 years ago

I've thought a bit about this and since we don't need any complex querying mechanism (we mostly need to append events when logging stuff and retrieve all the events once a day when sending them to the backend) maybe we can use a different approach for storing the metric events: the raw filesystem.

Basically, we would have different separate files: timing_events.txt, custom_events.txt and counters/<counterName>.txt (fictitious names).

With this solution, this would be the work that the different telemetry methods would have to perform:

Additionally, since we would access the filesystem asynchronously, we would need some kind of mutex to prevent multiple windows to affect each other. This mutex will be need for the incrementCounter() method and for the transition between the getters and the clearData() method (the mutex is needed in any asynchronous solution that we may want to implement).

Benefits of this approach:

Disadvantages of this approach:

Thoughts? @jasonrudolph , @nathansobo

jasonrudolph commented 5 years ago

Thoughts? @jasonrudolph , @nathansobo

@rafeca: Thanks for outlining this potential solution. My gut reaction is that this is yet more code for us to maintain, and I'm eager to avoid further growth in the surface area we're maintaining.

Do you have a feel for how much code would be involved in this file-based approach compared to a solution that uses dexie?

nathansobo commented 5 years ago

I also worry about the complexity of interacting directly with the file system. If IndexedDB offers transactions that work across windows, that seems like a more promising path to me.

rafeca commented 5 years ago

Thanks for the feedback! It's fun that both your feedback was posted before my suggestion #TimeZonesAreHard 🤣

Do you have a feel for how much code would be involved in this file-based approach compared to a solution that uses dexie?

I assume that in terms of amount of code it will be quite similar, only slightly less conventional (and probably a bit more convoluted) than just using a DB...

I'll then explore using the IndexedDB APIs directly... if they turn out to be so horrible to use as I remember they were ~7y ago, then I'll explore using lokijs (I'd like to try to avoid adding another dependency that we need to keep updated, etc).