While I enjoy Pulse as a window into my Laravel apps, I am also evaluating the Pulse data storage as a generic, versatile store for time series data.
Time series data could be, e.g., measurements from IoT sensors (temperature, humidity, ...), but also anything else you want to measure and analyse over time.
To validate this, I ran a scalability and performance test with 97M recorded events spread across a virtual timeframe of 4 weeks.
Nice! Versatile key value approach
Because all queries go through key_hash, the actual value of key does not influence indexing or performance.
So the key can be used as a path to a sensor, e.g. location72.sectionB.sensor42, for recording, retrieval and analytics.
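A minimal sketch of why the length of the key should not matter, assuming key_hash is an MD5 digest of key (that is my understanding of the MySQL schema, so treat it as an assumption). keyHash() is a hypothetical helper for illustration, not a Pulse function:

```php
<?php

// Hypothetical helper: derives a fixed-width hash from an arbitrary key.
// Assumption: this mirrors the idea behind key_hash, not Pulse's actual code.
function keyHash(string $key): string
{
    // With the second argument set to true, md5() returns the raw 16-byte digest.
    return md5($key, true);
}

$short = keyHash('key:1');
$long  = keyHash('location72.sectionB.sensor42');

// Both digests have the same length, regardless of how long the key is.
echo strlen($short), ' ', strlen($long), PHP_EOL; // prints "16 16"
```

An index on a fixed-width column like this compares the same number of bytes for every key, which fits the observation that a path-like key costs nothing extra.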
Test Scenario
For the test scenario I have seeded Pulse for 1000 keys and 5 types using the command attached at the bottom.
The entries are randomly spread across 4 weeks to generate a large number of aggregates.
All tests were run locally on a MacBook Pro M2 with MySQL 8. After running the command a couple of times, the database looks like this:
table              rows
pulse_aggregates   2.3M
pulse_entries      97M
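For a sense of scale, the parameters in the seeding command at the bottom imply the following number of record() calls per run. This is plain arithmetic on those values; the number of rows actually stored may differ:

```php
<?php

// Values copied from the seeding command.
$numberOfKeys   = 1000;
$numberOfTypes  = 5;
$numberOfEvents = 10000;

// One record() call per key, per event iteration, per type.
$recordCallsPerRun = $numberOfKeys * $numberOfEvents * $numberOfTypes;

echo number_format($recordCallsPerRun), PHP_EOL; // prints "50,000,000"
```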
Pulse::Graph
The following tests were performed in Tinkerwell, without outputting the result.
Test results
The query performance is always excellent (a few milliseconds) due to the index structure.
The overall performance of the default implementation depends on the number of types and the timeframe.
With more than 3 types, however, it runs out of memory (128MB) because all results for all keys are loaded into the result collection.
Improvements of #344
The method from #344 shows a significant improvement in time and memory consumption, as only the relevant data for the given keys is extracted into the result collection.
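The memory effect can be illustrated with a small standalone sketch. This is not the code from #344; collectByKey() and fakeRows() are hypothetical helpers that only show the principle of skipping rows for unwanted keys before they reach the result collection:

```php
<?php

// Hypothetical illustration: group row values by key, optionally restricted
// to a list of keys. Rows for other keys are never added to the result.
function collectByKey(iterable $rows, ?array $keys = null): array
{
    // array_flip() turns the key list into a lookup table for isset().
    $wanted = $keys === null ? null : array_flip($keys);
    $result = [];

    foreach ($rows as $row) {
        if ($wanted !== null && !isset($wanted[$row['key']])) {
            continue;
        }
        $result[$row['key']][] = $row['value'];
    }

    return $result;
}

// Hypothetical row source: yields one row per key and bucket.
function fakeRows(int $numberOfKeys, int $bucketsPerKey): Generator
{
    for ($k = 1; $k <= $numberOfKeys; $k++) {
        for ($b = 0; $b < $bucketsPerKey; $b++) {
            yield ['key' => 'key:' . $k, 'value' => $b];
        }
    }
}

$all  = collectByKey(fakeRows(1000, 60));
$some = collectByKey(fakeRows(1000, 60), ['key:142', 'key:1', 'key:789']);

echo count($all), ' vs ', count($some), PHP_EOL; // prints "1000 vs 3"
```

With 1000 keys the unfiltered result holds every key's series, while the filtered one holds only the three requested, which is the kind of difference that matters under a fixed memory limit.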
Conclusion
With some minor changes, the data storage of Pulse could be used as a powerful store for time series data, adding to the great Laravel stack!
Some ideas
adding more analytics features
configurable periods for aggregates
configurable trimming periods
data retention to archive historic data
Your thoughts?
Command used for seeding Data
<?php

namespace App\Console\Commands;

use Illuminate\Console\Command;
use Laravel\Pulse\Facades\Pulse;

class SeedData extends Command
{
    /**
     * The name and signature of the console command.
     *
     * @var string
     */
    protected $signature = 'app:seed-data';

    /**
     * The console command description.
     *
     * @var string
     */
    protected $description = 'Pulse Data Seeder';

    // Declared explicitly, since dynamic properties are deprecated as of PHP 8.2.
    private int $numberOfKeys = 1000;
    private int $numberOfTypes = 5;
    private int $numberOfEvents = 10000; // events per key and type
    private int $dateRange = 4 * 7 * 24 * 60 * 60; // spread across 4 weeks (in seconds)

    /**
     * Execute the console command.
     */
    public function handle()
    {
        // Build the key and type names: key:1 ... key:1000, type:1 ... type:5.
        $keys = collect(range(1, $this->numberOfKeys))->map(function ($number) {
            return 'key:' . $number;
        });
        $types = collect(range(1, $this->numberOfTypes))->map(function ($number) {
            return 'type:' . $number;
        });

        $keys->each(function ($key) use ($types) {
            for ($i = 1; $i <= $this->numberOfEvents; $i++) {
                $types->each(function ($type) use ($key) {
                    // Random value at a random point within the last 4 weeks,
                    // aggregated as avg, min, max and count.
                    Pulse::record(
                        $type,
                        key: $key,
                        value: rand(-10000, 10000),
                        timestamp: time() - rand(0, $this->dateRange)
                    )->avg()->min()->max()->count();
                });
            }
        });
    }
}
Pulse::graph calls used in the tests
Two variants were compared:
graph() - default method
graph(1 key)* - #344, supporting keys
The default method is called as follows:
Pulse::graph(["type:1", "type:3"], "max", CarbonInterval::hours(168));
The modified method is called as follows:
Pulse::graph(["type:1", "type:3"], "max", CarbonInterval::hours($hrs), keys: ["key:142", "key:1", "key:789"]);