Description

Since the active_nodes table was added, the crawler dumps all the registered nodes into a new row every 12h. However, the first snapshot is only taken 12h after the crawler starts up. The idea behind this was to avoid recording 0 active nodes in the table while the crawler was still getting started. The side effect of this decision is that whenever the crawler is stopped, there won't be a snapshot for (crawler_shutdown_time - time_between_last_snapshot) + 12h, which might not be ideal for tracking the distribution with enough resolution.

Possible solution

The simplest solution is to take a new snapshot right when the crawler is started, so that there is at least a record of the distribution at the moment the crawler was restarted.

migalabs / armiarma

Make snapshot of active nodes on restart #59
