Open kempniu opened 6 years ago
Hey, @kempniu, I'm not much involved in fstrm development these days, but I wrote most of the original code. Can you point me to those "inelegant workarounds" for dnstap file rolling that you refer to in BIND? Just curious how libfstrm is currently being used in BIND.
The following two functions come to mind specifically:
Whenever it is determined that the dnstap output file is to be rolled, named
basically stops doing everything else, rolls the output file itself, creates a new fstrm writer thread and increments a static variable which is used by each producer thread (i.e. BIND worker threads) to determine whether the writer thread has been closed since the last time this specific producer thread queued an item to a queue; if so, a new queue is retrieved from the writer thread and subsequently used by the producer thread.
Meanwhile, IIUC, if fstrm supported size-based file rolling "natively", BIND worker threads could just shove data into a queue and let fstrm handle output file rolling. A lot of the complexity could then be removed from BIND, while (probably) not nearly as much code would need to be added to fstrm. But given the doubts surrounding size-based rolling itself, the question is whether more work should be put into supporting it properly (dropping support for size-based rolling altogether is also being considered). Hence me soliciting your opinion :)
i think fstrm should support the ways people want to use it.
@kempniu Yikes, BIND stops doing everything, like you have a giant lock on the whole server? It doesn't respond to queries while the files are being rolled?
Since the fstrm iothr
interface only deals with a single writer and there's only a single dedicated I/O thread and we already keep track of output bytes and the current time it would probably be pretty easy to implement timed and sized output rolling.
What if you had something like this in the libfstrm API?
typedef fstrm_res (*fstrm_iothr_writer_replace_callback)(struct fstrm_writer *old_writer,
struct fstrm_writer **new_writer,
void **user);
fstrm_res
fstrm_iothr_options_set_timed_writer_replace_callback(struct fstrm_iothr_options *opt,
unsigned replace_interval,
fstrm_iothr_writer_replace_callback,
void *user);
fstrm_res
fstrm_iothr_options_set_sized_writer_replace_callback(struct fstrm_iothr_options *opt,
uint64_t replace_nbytes,
fstrm_iothr_writer_replace_callback,
void *user);
With the semantics being that you could provide two callbacks (each callback being a fstrm_iothr_writer_replace_callback
function pointer plus user
pointer) to be called every replace_interval
seconds (timed rolls), or every replace_nbytes
of output data written (sized rolls), and the fstrm I/O thread would stop itself and call your provided callback, and let you close the old writer and provide a new writer?
What if you had something like this in the libfstrm API?
+1.
Yikes, BIND stops doing everything, like you have a giant lock on the whole server? It doesn't respond to queries while the files are being rolled?
Correct, that is what employing isc_task_beginexclusive()
results in. There are only a few uses of it outside of server reconfiguration code. That is the price named
pays for making it possible to replace an immutable writer residing inside an opaque data structure (struct fstrm_iothr
) while multiple producer threads are queueing stuff concurrently. Note that this is not a criticism of fstrm's design, just an explanation of why named
is doing what it is doing: task-exclusive mode is what is used for guaranteeing that some worker thread queueing dnstap data (and thus potentially causing the output file to be flushed by fstrm's I/O thread) will not race with another worker thread concurrently reopening the writer (which leads to Bad Things™).
What if you had something like this in the libfstrm API?
The semantics you proposed seem very reasonable to me at this point. Anything allowing safe writer replacement inside an existing fstrm I/O thread should work.
There is one more thing. BIND also allows the dnstap output file to be rolled "at will", using rndc dnstap -reopen
. Could the family of functions you proposed perhaps be extended with one more similar function, e.g. fstrm_iothr_options_set_adhoc_writer_replace_callback()
? While timed and sized rolls would be triggered by fstrm itself, ad hoc rolls would need to be triggered externally. Yet another function for something like "hey, fstrm, hold my b^Wframes while I replace the writer!" would be needed. Would that be feasible?
@kempniu I think BIND should figure out how to replace a pointer being used by multiple threads without needing to block the entire server, e.g. liburcu makes this pretty trivial. If you fixed BIND to not have a giant lock it doesn't appear that there would be a need to complicate the libfstrm API to do all this.
BIND allows dnstap output files to be rolled when their size exceeds a configured limit. In order to make that possible, though, a set of inelegant workarounds are required in fstrm-related code. The only way to support such a feature in a clean manner seems to be to get it implemented in fstrm itself. Such an addition could potentially also be welcomed by other fstrm users.
This issue is not really a feature request, but rather a question: would support for size-based output file rolling be considered a welcome addition to fstrm's source code? I am aware that operational usefulness of size-based rolling can be questioned, but apparently there are people interested in using it. What is your opinion?