I've thought a bit about this, and since we don't need any complex querying mechanism (we mostly need to append events when logging and retrieve all of them once a day when sending them to the backend), maybe we can use a different approach for storing the metric events: the raw filesystem.
Basically, we would have a few separate files: `timing_events.txt`, `custom_events.txt` and `counters/<counterName>.txt` (fictitious names).

The `timing_events.txt`/`custom_events.txt` files would have a separate line for each event with all its metadata, which would get serialized either using JSON or `v8.serialize()`.

Each `counters/<counterName>.txt` file would just contain the current value of its counter, which would get updated on every call to `.incrementCounter()`.
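For illustration, here is a minimal sketch of the two serialization options for the line-per-event format (the event shape and helper names are made up, not part of the actual proposal): JSON lines are directly readable, while `v8.serialize()` returns a binary `Buffer`, so each line would need a newline-safe encoding such as base64.

```ts
import { appendFile } from "fs/promises";
import { serialize } from "v8";

// Hypothetical event shape; the real telemetry events aren't defined in this thread.
interface MetricEvent {
  name: string;
  timestamp: number;
  metadata?: Record<string, unknown>;
}

// Option A: JSON, one human-readable line per event.
async function appendJsonLine(file: string, event: MetricEvent): Promise<void> {
  await appendFile(file, JSON.stringify(event) + "\n", "utf8");
}

// Option B: v8.serialize() produces a binary Buffer, so each line needs an
// encoding that can never contain a newline byte (base64 here).
async function appendV8Line(file: string, event: MetricEvent): Promise<void> {
  await appendFile(file, serialize(event).toString("base64") + "\n", "utf8");
}
```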
With this solution, this would be the work that the different `telemetry` methods would have to perform:

- `addCustomEvent()`: Serialize the event and append it to the `custom_events.txt` file.
- `addTiming()`: Serialize the event and append it to the `timing_events.txt` file.
- `incrementCounter()`: Try to read the `counters/<counterName>.txt` file. If it exists, increment the corresponding counter and store the value back to the file. If it doesn't, create the file with value `1`.
- `getCustomEvents()`: Read the `custom_events.txt` file, split it by lines and deserialize each line.
- `getTimings()`: Read the `timing_events.txt` file, split it by lines and deserialize each line.
- `getCounters()`: Find all the counter files (by listing the `counters/` folder), then read and deserialize each of them.
- `clearData()`: Delete `custom_events.txt`, `timing_events.txt`, and the contents of the `counters/` folder.

Additionally, since we would access the filesystem asynchronously, we would need some kind of mutex to prevent multiple windows from interfering with each other. This mutex would be needed for the `incrementCounter()` method and for the transition between the getters and the `clearData()` method (a mutex is needed in any asynchronous solution that we may want to implement). See the sketch below.
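To make the amount of code concrete, here is a minimal sketch of what such a file-backed store could look like. The class name, event shape and the `mkdir`-based lock are illustrative assumptions, not the actual `telemetry` implementation; a production version would at least need stale-lock recovery (or a library such as `proper-lockfile`).

```ts
import { appendFile, mkdir, readdir, readFile, rm, writeFile } from "fs/promises";
import * as path from "path";

// Any JSON-serializable event; the real event shape isn't specified in this thread.
type TelemetryEvent = Record<string, unknown>;

class FileTelemetryStorage {
  // baseDir is assumed to exist already (e.g. created at startup).
  constructor(private baseDir: string) {}

  private file(name: string): string {
    return path.join(this.baseDir, name);
  }

  // Naive cross-window mutex: mkdir() is atomic and fails if the lock dir
  // already exists, so only one process holds it at a time. A real
  // implementation would need timeouts and stale-lock handling.
  private async withLock<T>(fn: () => Promise<T>): Promise<T> {
    const lockDir = this.file(".lock");
    for (;;) {
      try {
        await mkdir(lockDir);
        break;
      } catch {
        await new Promise((resolve) => setTimeout(resolve, 25));
      }
    }
    try {
      return await fn();
    } finally {
      await rm(lockDir, { recursive: true, force: true });
    }
  }

  // Append-only writes: one JSON line per event, minimal work per call.
  async addCustomEvent(event: TelemetryEvent): Promise<void> {
    await appendFile(this.file("custom_events.txt"), JSON.stringify(event) + "\n");
  }

  async addTiming(event: TelemetryEvent): Promise<void> {
    await appendFile(this.file("timing_events.txt"), JSON.stringify(event) + "\n");
  }

  // Read-modify-write, so it runs under the lock.
  async incrementCounter(name: string): Promise<void> {
    await this.withLock(async () => {
      const counterFile = this.file(path.join("counters", `${name}.txt`));
      await mkdir(path.dirname(counterFile), { recursive: true });
      let value = 0;
      try {
        value = parseInt(await readFile(counterFile, "utf8"), 10) || 0;
      } catch {
        // Counter file doesn't exist yet: start at 0 and write 1 below.
      }
      await writeFile(counterFile, String(value + 1));
    });
  }

  async getCustomEvents(): Promise<TelemetryEvent[]> {
    return this.readLines("custom_events.txt");
  }

  async getTimings(): Promise<TelemetryEvent[]> {
    return this.readLines("timing_events.txt");
  }

  async getCounters(): Promise<Record<string, number>> {
    const counters: Record<string, number> = {};
    let files: string[] = [];
    try {
      files = await readdir(this.file("counters"));
    } catch {
      // No counters folder yet.
    }
    for (const name of files) {
      const raw = await readFile(this.file(path.join("counters", name)), "utf8");
      counters[path.basename(name, ".txt")] = parseInt(raw, 10);
    }
    return counters;
  }

  async clearData(): Promise<void> {
    await this.withLock(async () => {
      await rm(this.file("custom_events.txt"), { force: true });
      await rm(this.file("timing_events.txt"), { force: true });
      await rm(this.file("counters"), { recursive: true, force: true });
    });
  }

  private async readLines(name: string): Promise<TelemetryEvent[]> {
    try {
      const content = await readFile(this.file(name), "utf8");
      return content.split("\n").filter(Boolean).map((line) => JSON.parse(line));
    } catch {
      return []; // File not created yet.
    }
  }
}
```

The append paths stay cheap (a single `appendFile`), while all read-modify-write work is funneled through the lock, which matches the constraint described above.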
Benefits of this approach:

- `addCustomEvent()` / `addTiming()` / `incrementCounter()` do the minimum work possible.

Disadvantages of this approach:
Thoughts? @jasonrudolph, @nathansobo
@rafeca: Thanks for outlining this potential solution. My gut reaction is that this is yet more code for us to maintain, and I'm eager to avoid further growth in the surface area we're maintaining.
Do you have a feel for how much code would be involved in this file-based approach compared to a solution that uses dexie?
I also worry about the complexity of interacting directly with the file system. If IndexedDB offers transactions that work across windows, that seems like a more promising path to me.
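For comparison, here is a rough sketch (table names, schema and event fields are assumptions, not anything decided in this thread) of how the `dexie` route could handle the same two hot paths. IndexedDB serializes overlapping read-write transactions on the same stores even when they come from different windows of the same origin, which is what would replace a hand-rolled mutex.

```ts
import Dexie, { Table } from "dexie";

// Hypothetical schema; the real table layout for the telemetry package isn't
// defined here.
class TelemetryDB extends Dexie {
  events!: Table<{ id?: number; type: string; payload: unknown }, number>;
  counters!: Table<{ name: string; value: number }, string>;

  constructor() {
    super("telemetry");
    this.version(1).stores({
      events: "++id, type", // auto-incremented primary key, indexed by type
      counters: "name",     // primary key is the counter name
    });
  }
}

const db = new TelemetryDB();

// Appending an event is a single add().
export function addCustomEvent(payload: unknown): Promise<number> {
  return db.events.add({ type: "custom", payload });
}

// The read-modify-write for a counter runs inside a readwrite transaction,
// which IndexedDB serializes against other windows touching the same store.
export function incrementCounter(name: string): Promise<void> {
  return db.transaction("rw", db.counters, async () => {
    const current = await db.counters.get(name);
    await db.counters.put({ name, value: (current?.value ?? 0) + 1 });
  });
}
```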
Thanks for the feedback! It's funny that both of your replies were posted before my suggestion #TimeZonesAreHard 🤣
> Do you have a feel for how much code would be involved in this file-based approach compared to a solution that uses dexie?
I assume that in terms of the amount of code it would be quite similar, only slightly less conventional (and probably a bit more convoluted) than just using a DB...
I'll then explore using the IndexedDB APIs directly... if they turn out to be as horrible to use as I remember they were ~7 years ago, then I'll explore using lokijs (I'd like to try to avoid adding another dependency that we need to keep updated, etc.).
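For reference, this is roughly what "using the IndexedDB APIs directly" involves for a single append (the database and store names are illustrative); most of the extra code compared to a wrapper like dexie is the request/event-to-Promise plumbing.

```ts
// Open (and lazily create) the database; names are made up for illustration.
function openTelemetryDb(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const request = indexedDB.open("telemetry", 1);
    request.onupgradeneeded = () => {
      request.result.createObjectStore("events", { keyPath: "id", autoIncrement: true });
    };
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Append one event inside a readwrite transaction.
async function addCustomEvent(payload: unknown): Promise<void> {
  const db = await openTelemetryDb();
  return new Promise((resolve, reject) => {
    const tx = db.transaction("events", "readwrite");
    tx.objectStore("events").add({ type: "custom", payload });
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}
```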
From https://github.com/atom/telemetry/pull/25, we need to move away from `lokijs` and change the storage for events not yet sent on the `telemetry` package.

There are two potential solutions:

- `dexie` as an IndexedDB adapter to store the events between sessions.
- `IndexedDB` APIs directly (this would be more performant but slightly harder to implement).