Table ownership differences can leave folsom inconsistent

boundary / folsom

Table ownership differences can leave folsom inconsistent #30

Open russelldb opened 12 years ago

russelldb commented 12 years ago

The history metric creates a new ets table when a new metric is created. The owner of that table is the process that called folsom_metrics:new_history(Name). However, the folsom table is owned by the folsom supervisor. If the process that owns the history table exits, the history table is deleted along with it, but the entry in the folsom metrics table remains.
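The ownership behaviour described above can be reproduced with plain ets, independent of folsom. This is a minimal sketch: the module name ets_owner_sketch and the table name history_sketch are hypothetical, and the spawned process merely stands in for whatever process called folsom_metrics:new_history(Name).

```erlang
-module(ets_owner_sketch).
-export([demo/0]).

%% Hypothetical illustration, not folsom code: a short-lived process
%% creates an ets table, and the caller inspects the table's owner
%% before and after that process exits.
demo() ->
    Parent = self(),
    {Pid, Ref} = spawn_monitor(fun() ->
        Tid = ets:new(history_sketch, [ordered_set, public]),
        Parent ! {table, Tid},
        receive stop -> ok end
    end),
    Tid = receive {table, T} -> T end,
    %% While the creator is alive, it is the owner of the table.
    OwnerBefore = ets:info(Tid, owner),
    Pid ! stop,
    receive {'DOWN', Ref, process, Pid, _Reason} -> ok end,
    %% Once the creator has exited, the table no longer exists and
    %% ets:info/2 reports undefined.
    OwnerAfter = ets:info(Tid, owner),
    {OwnerBefore =:= Pid, OwnerAfter}.
```

Any index that still holds the Tid after the creator exits now points at a table that is gone, which is the inconsistency reported here.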

Folsom is then in an inconsistent state. Using folsom_metrics_histogram_ets to create (and therefore own) the table would probably help. Ideally folsom should have a single process that owns all ets tables so that there is consistency (a crash takes them all away, and they're insulated from crashes in calling processes). Better still would be to implement something like the strategy in this article: http://steve.vinoski.net/blog/2011/03/23/dont-lose-your-ets-tables/
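For readers who have not seen the linked article, its strategy rests on the ets heir option together with ets:give_away/3. The following is only a rough sketch of that mechanism, not a proposed folsom patch: the module name ets_heir_sketch, the table name, and the atoms handed_over, returned and simulated_crash are all made up for illustration. The calling process plays the long-lived table manager and a spawned process plays the worker that crashes.

```erlang
-module(ets_heir_sketch).
-export([demo/0]).

%% Hypothetical sketch of the heir/give_away mechanism, not folsom code.
demo() ->
    Manager = self(),
    %% The manager creates the table and names itself as heir, so the
    %% table comes back to it if a later owner exits.
    Tid = ets:new(history_sketch, [ordered_set, public,
                                   {heir, Manager, returned}]),
    Worker = spawn(fun() ->
        receive
            %% Sent by ets:give_away/3 to the new owner.
            {'ETS-TRANSFER', T, Manager, _GiftData} ->
                ets:insert(T, {1, some_event}),
                exit(simulated_crash)
        end
    end),
    %% Hand ownership of the table to the worker.
    true = ets:give_away(Tid, Worker, handed_over),
    %% When the worker exits, the heir receives the table back with
    %% its contents intact.
    receive
        {'ETS-TRANSFER', Tid, Worker, returned} ->
            ets:lookup(Tid, 1)
    after 1000 ->
        timeout
    end.
```

The data written by the worker survives its exit because the table is transferred to the heir instead of being deleted; the manager could then hand the table to a restarted worker.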

I'm raising this as a request for comments before I factor such a strategy into folsom. Opinions?

joewilliams commented 12 years ago

Steve's post seems like a good setup and makes sense to me. Yet another item for the folsom to-do list. I'll have some time soon if you would like to collaborate on this and the race condition issues.

russelldb commented 12 years ago

Yes. That would be good. When we get 1.2 out I've got some time to work on folsom.

mmzeeman commented 11 years ago

At zotonic we recently ran into this issue too. Indeed all tables of folsom should be owned by one process.

Automatic bookkeeping of metrics can be implemented by monitoring the process which creates the new metric. When that process dies folsom can do the bookkeeping without any problem.
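A minimal sketch of that monitoring idea, assuming a single gen_server that owns every table. Nothing here is folsom's actual API: the module metric_owner_sketch, its functions new_metric/1 and metrics/0, and the table name are hypothetical, and duplicate metric names are not handled.

```erlang
-module(metric_owner_sketch).
-behaviour(gen_server).

-export([start_link/0, new_metric/1, metrics/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Hypothetical sketch, not folsom code: one process owns all tables
%% and monitors the process that asked for each metric.

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Create a metric on behalf of the calling process.
new_metric(Name) ->
    gen_server:call(?MODULE, {new_metric, Name, self()}).

%% List the names of the metrics currently registered.
metrics() ->
    gen_server:call(?MODULE, metrics).

init([]) ->
    %% State maps a monitor reference to {Name, Tid}.
    {ok, #{}}.

handle_call({new_metric, Name, Creator}, _From, State) ->
    %% The table is created here, so this server is its owner.
    Tid = ets:new(metric_sketch, [ordered_set, public]),
    Ref = erlang:monitor(process, Creator),
    {reply, {ok, Tid}, State#{Ref => {Name, Tid}}};
handle_call(metrics, _From, State) ->
    {reply, [Name || {Name, _Tid} <- maps:values(State)], State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% The creator died: drop the index entry and the table together.
handle_info({'DOWN', Ref, process, _Pid, _Reason}, State) ->
    case maps:take(Ref, State) of
        {{_Name, Tid}, NewState} ->
            ets:delete(Tid),
            {noreply, NewState};
        error ->
            {noreply, State}
    end;
handle_info(_Msg, State) ->
    {noreply, State}.
```

Because the index entry and the table are removed in the same callback, the two cannot drift apart when a creating process exits. Whether a metric should really be removed when its creator dies, or kept, is a design choice this sketch does not settle.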

This is the crash we get from time to time:

2013-03-20 17:02:52.468 [error] emulator Error in process <0.32714.199> on node 'zotonic001@Lamma' with exit value:
{badarg,[{ets,delete,[26940620904],[]},{folsom_sample_exdec,delete_and_rescale,4,
[{file,"src/folsom_sample_exdec.erl"},{line,122}]},{folsom_sample_exdec,rescale,5,[{file,"src/folsom_sample_exdec.erl"},
{line,106}]},{folsom_sample_exdec..

joewilliams commented 10 years ago

Anyone have an interest in tackling this one?

sebmaynard commented 10 years ago

Just ran into this one myself! I think I'll hold off using histories for now. I'm going to just stick the few things I wanted it for in a queue in a gen_server, which isn't ideal...

Looking at the table viewer, it seems the Tids are changing, but the histories index isn't getting updated at the same time.

joewilliams commented 8 years ago

Test message, ignore me.

joewilliams commented 8 years ago

Folsom has moved, please resubmit your issue at https://github.com/folsom-project. Thanks!