If this catastrophe occurs, the subsequent tasks should be killed, because we must not ack the deferred flush requests. Otherwise the application will believe that the missing data is persistent.
Retrying the flush job is too difficult. Since the probability of this case is close to zero, I don't want to do much work for it.
What if we have only two rambuffers and always wait for the other rambuffer to be flushed before queuing? This way we keep at most a single flush job in the wq, with practically no performance regression.
And if the currently executing job is terminated, requeue it?
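Roughly the scheme I mean, as a minimal sketch (the struct layout and names here are hypothetical, not the actual dm-writeboost code): before queuing the flush job for the current rambuffer, wait until the other rambuffer's flush has completed, so at most one job ever sits in flush_wq.

```c
#include <linux/completion.h>
#include <linux/workqueue.h>

/* Hypothetical two-rambuffer scheme. */
struct rambuffer {
	void *data;
	struct completion flushed; /* completed by flush_proc when written out */
};

static void submit_rambuf(struct workqueue_struct *flush_wq,
			  struct rambuffer *cur, struct rambuffer *other,
			  struct work_struct *flush_work)
{
	/*
	 * Wait until the other buffer's flush has finished, so flush_wq
	 * holds at most one job at a time. If the running job was
	 * terminated, this would also be the point to detect it and
	 * requeue.
	 */
	wait_for_completion(&other->flushed);
	reinit_completion(&cur->flushed);
	queue_work(flush_wq, flush_work);
}
```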
I will add a BUG_ON assertion into flush_proc in 2.2.2.
If this happens, it causes #111. I want to test whether the case happens in @onlyjob's environment.
I have a fix for this issue in mind, but I don't know how it happens (is the memory corrupted, and accessing it kills the thread?).
I am not sure whether this truly needs to be considered.
When a rambuf becomes full, a flush_job is created and queued into flush_wq. This is a singlethread workqueue (actually an ordered workqueue) which, as documented, executes the tasks one by one in queued order.
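For context, a minimal sketch of that queuing path. The names flush_wq, flush_job, and flush_proc follow this discussion; the struct layout and helper functions are illustrative, not the actual dm-writeboost code.

```c
#include <linux/types.h>
#include <linux/workqueue.h>

/* Illustrative flush job: one per filled rambuf. */
struct flush_job {
	struct work_struct work;
	u64 id; /* segment id, assigned in strictly increasing order */
};

static struct workqueue_struct *flush_wq;

static void flush_proc(struct work_struct *work); /* defined elsewhere */

static int init_flush_wq(void)
{
	/*
	 * An ordered workqueue executes at most one work item at a time,
	 * in queueing order; create_singlethread_workqueue() is the older
	 * spelling that gives the same guarantee.
	 */
	flush_wq = alloc_ordered_workqueue("flush_wq", 0);
	return flush_wq ? 0 : -ENOMEM;
}

/* Called when a rambuf becomes full. */
static void queue_flush_job(struct flush_job *job)
{
	INIT_WORK(&job->work, flush_proc);
	queue_work(flush_wq, &job->work);
}
```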
But what if a task is terminated for some really unknown reason? The behavior would be either that the subsequent tasks keep executing as usual, or that the workqueue stops processing altogether.
If it's the first one, the caching device may become corrupted because of the missing side-effects of the terminated job.
I think we need at least an assertion at the beginning of flush_proc so that

flush_job->id == last_flushed_segment_id + 1

is satisfied. Otherwise we should kill the worker.
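Something like the following, a sketch under the assumption that flush_proc receives jobs via a work_struct embedded in the flush_job, and that last_flushed_segment_id is updated only after a segment is fully written. The names are taken from this discussion, not verified against the source.

```c
#include <linux/bug.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/workqueue.h>

struct flush_job {
	struct work_struct work;
	u64 id;
};

/* In dm-writeboost this would live in the cache struct. */
static u64 last_flushed_segment_id;

static void flush_proc(struct work_struct *work)
{
	struct flush_job *job = container_of(work, struct flush_job, work);

	/*
	 * The invariant: jobs are queued with consecutive ids, and the
	 * ordered workqueue runs them in order. If a previous job was
	 * silently terminated, the gap shows up here, and we kill the
	 * worker loudly rather than ack flushes for data never written.
	 */
	BUG_ON(job->id != last_flushed_segment_id + 1);

	/* ... write the rambuffer out to the caching device ... */

	last_flushed_segment_id = job->id;
}
```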