Open Lin-1997 opened 1 year ago
As the Raft protocol defines it, a snapshot operation is finished once it has been dumped to storage. I'm not sure whether jraft blocks log apply while a snapshot is in progress. If not, the state machine should control it.
For data consistency, the write lock is released only after the snapshot operation has finished.
What's more, if jraft does block log apply while a snapshot is in progress, the inner write lock has no effect, because jraft already stops write operations by blocking onApply.
So in my opinion, based on the current information, the write lock does not need to change.
I'm not sure whether jraft blocks log apply while a snapshot is in progress. If not, the state machine should control it.
I think jraft does stop applying logs while doing a snapshot. Here is what jraft 1.3.12 does:
It periodically calls handleSnapshotTimeout(), which submits a doSnapshot(done: null, sync: false) task.
doSnapshot(done: null, sync: false) eventually calls fsmCaller.onSnapshotSave(saveSnapshotDone).
fsmCaller.onSnapshotSave() enqueues a task to the fsm's taskQueue.
All tasks in the taskQueue are handled by ApplyTaskFactory via runApplyTask().
COMMITTED tasks and SNAPSHOT_SAVE tasks are executed serially.
So I think jraft does stop applying logs while doing a snapshot with sync: false. I also found that only test code calls snapshot with sync: true.
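The serial-execution argument above can be illustrated with a minimal sketch. It does not use jraft's classes: `SerialFsmQueueSketch`, its `TaskType` enum, and the single-thread executor are hypothetical stand-ins for the fsm task queue and its single consumer. It only shows that, with one consumer, a task cannot start before the previous one has returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the fsm task queue; these are not jraft's actual classes.
class SerialFsmQueueSketch {
    enum TaskType { COMMITTED, SNAPSHOT_SAVE }

    // Runs the tasks on a single consumer thread and returns the observed event order.
    static List<String> run(List<TaskType> tasks) {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        ExecutorService fsmThread = Executors.newSingleThreadExecutor();
        for (TaskType type : tasks) {
            fsmThread.execute(() -> {
                events.add(type + ":start");
                // A synchronous handler does all of its work here, so the next
                // queued task cannot start until this one returns.
                events.add(type + ":end");
            });
        }
        fsmThread.shutdown();
        try {
            if (!fsmThread.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("fsm thread did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(
                TaskType.COMMITTED, TaskType.SNAPSHOT_SAVE, TaskType.COMMITTED)));
    }
}
```

The guarantee only covers work done inside the handler. Anything the handler hands off to another thread is outside it.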
What's more, if jraft does block log apply while a snapshot is in progress, the inner write lock has no effect, because jraft already stops write operations by blocking onApply.
I think the lock does take effect, since you submit a new task to do the snapshot and return. The fsm's runApplyTask() can then continue to run other tasks.
By the way, the lock is acquired in the new thread, which may be slower than the fsm's runApplyTask(). That eventually calls, e.g., InstanceMetadataProcessor#onApply(), which also acquires the lock to commit the new log.
So I think the snapshot may save more data than it is supposed to.
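The race described above can be reproduced in a small sketch. Everything here is hypothetical and simplified: `LateLockSnapshotSketch` is not Nacos code, a plain `ReentrantLock` stands in for the shared lock, and a `CountDownLatch` forces the unlucky schedule in which the snapshot thread reaches the lock only after the fsm thread has applied one more entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: the snapshot lock is taken on the snapshot thread, not the fsm thread.
class LateLockSnapshotSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<String> state = new ArrayList<>();

    // Apply path, called on the (simulated) fsm thread.
    void onApply(String entry) {
        lock.lock();
        try {
            state.add(entry);
        } finally {
            lock.unlock();
        }
    }

    // Returns what the snapshot thread ends up saving.
    static List<String> run() {
        LateLockSnapshotSketch fsm = new LateLockSnapshotSketch();
        ExecutorService snapshotPool = Executors.newSingleThreadExecutor();
        CountDownLatch applied = new CountDownLatch(1);
        try {
            fsm.onApply("log-1");
            // SNAPSHOT_SAVE is handled here: the state holds only log-1 at this point.
            Future<List<String>> saved = snapshotPool.submit(() -> {
                applied.await(); // simulate the snapshot thread starting late
                fsm.lock.lock();
                try {
                    List<String> copy = new ArrayList<>(fsm.state);
                    return copy;
                } finally {
                    fsm.lock.unlock();
                }
            });
            fsm.onApply("log-2"); // the fsm thread has moved on to the next task
            applied.countDown();
            return saved.get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            snapshotPool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // contains log-2 although the snapshot was requested before it
    }
}
```

In this schedule the lock keeps the snapshot internally consistent, but it does not pin the snapshot to the log position at which it was requested.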
If jraft always stops applying logs during doSnapshot, can we remove this lock directly?
Log onApply is synchronous and doSnapshot stops onApply, so the fsm will not be modified asynchronously during doSnapshot.
If jraft always stops applying logs during doSnapshot, can we remove this lock directly?
I think the answer is yes; jraft's example doesn't lock either.
see:
also in jraft-rhea:
In jraft's example, they copy the data for the snapshot in the main thread, which is synchronized by the fsm, and then submit the snapshot task to a new thread.
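A minimal sketch of that copy-then-submit pattern, assuming a counter-style state machine. `CopyThenAsyncSaveSketch` and its methods are hypothetical names and do not use the jraft API; the `CompletableFuture` stands in for the background thread that would write the snapshot file.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter state machine; illustrates the pattern only.
class CopyThenAsyncSaveSketch {
    private final AtomicLong value = new AtomicLong();

    // Called on the fsm thread for every committed entry.
    void onApply(long delta) {
        value.addAndGet(delta);
    }

    // Called on the fsm thread: capture the state now, persist it later elsewhere.
    CompletableFuture<Long> onSnapshotSave() {
        final long captured = value.get(); // copied before any later apply can run
        // The real file write would happen inside this async task.
        return CompletableFuture.supplyAsync(() -> captured);
    }

    static long demo() {
        CopyThenAsyncSaveSketch fsm = new CopyThenAsyncSaveSketch();
        fsm.onApply(5);
        CompletableFuture<Long> saved = fsm.onSnapshotSave();
        fsm.onApply(7); // applied after the snapshot point, must not leak into it
        return saved.join();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the value is captured on the fsm thread, no lock is needed between the apply path and the background write.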
I looked at the lock usage. It seems to be used to synchronize loading a snapshot and saving a snapshot.
Does jraft make sure that load snapshot and save snapshot run on a single synchronized thread?
If so, I think we can remove the lock. But then the lock also seems to have no effect, because jraft only uses a synchronized thread to do snapshots.
I looked at the lock usage. It seems to be used to synchronize loading a snapshot and saving a snapshot.
Did you mean the lock in SnapshotExecutorImpl? I agree with your opinion. It synchronizes not only load snapshot and save snapshot but also multiple load snapshot tasks, which are called periodically. In the end, fsmCaller submits the task (save or load snapshot) to a disruptor, and onCommitted tasks are submitted to the same disruptor. So I think all load snapshot tasks, save snapshot tasks, apply log tasks and readIndex tasks are executed serially.
But then the lock also seems to have no effect, because jraft only uses a synchronized thread to do snapshots.
I still think the lock takes effect in the current implementation, and that it can be removed if we deep copy the data needed for the snapshot in the main fsm thread before submitting a new task in RaftExecutor.doSnapshot().
I think the lock does take effect, since you submit a new task to do the snapshot and return. The fsm's runApplyTask() can then continue to run other tasks.
I modified my nacos implementation to be lock free, including the processor and the snapshot. The main idea is to deep copy the data needed for the snapshot, and to put the serialization in the new thread to reduce the execution time in the fsm's main thread.
Using ServiceMetadataProcessor as an example, I implemented clone() for all classes needed for the snapshot, e.g. Service, ServiceMetadata, ClusterMetadata. Part of the code is as follows, omitting unimportant code:
// Simply remove the lock in ServiceMetadataProcessor.
interface SnapshotData extends Serializable {
    void release(); // Release the data after the snapshot is complete.
}
class ServiceSnapshotData implements SnapshotData {
    private final Map<Service, ServiceMetadata> serviceMetadataMap;
    ServiceSnapshotData(Map<Service, ServiceMetadata> serviceMetadataMap) {
        this.serviceMetadataMap = serviceMetadataMap;
    }
    @Override
    public void release() { serviceMetadataMap.clear(); }
}
class ServiceMetadataSnapshotOperation extends AbstractMetadataSnapshotOperation {
    @Override
    protected SnapshotData prepareSnapshotData() {
        return new ServiceSnapshotData(metadataManager.getServiceMetadataSnapshot());
    }
}
class NamingMetadataManager {
    Map<Service, ServiceMetadata> getServiceMetadataSnapshot() {
        // Deep copy, so later applies cannot change what the snapshot thread serializes.
        Map<Service, ServiceMetadata> snapshot = new ConcurrentHashMap<>();
        for (Map.Entry<Service, ServiceMetadata> entry : serviceMetadataMap.entrySet()) {
            snapshot.put(entry.getKey().clone(), entry.getValue().clone());
        }
        return snapshot;
    }
}
// Main change
abstract class AbstractSnapshotOperation {
    public void onSnapshotSave(Writer writer, BiConsumer<Boolean, Throwable> callFinally) {
        // This runs in the fsm's main thread, lock free.
        final SnapshotData snapshot = prepareSnapshotData();
        RaftExecutor.doSnapshot(() -> {
            try {
                // Serialization happens inside writeSnapshot, on the new thread.
                callFinally.accept(writeSnapshot(writer, snapshot), null);
            } catch (Throwable t) {
                callFinally.accept(false, t);
            } finally {
                Optional.ofNullable(snapshot).ifPresent(SnapshotData::release);
            }
        });
    }
    protected abstract SnapshotData prepareSnapshotData();
}
// Main change
abstract class AbstractMetadataSnapshotOperation extends AbstractSnapshotOperation {
    protected boolean writeSnapshot(Writer writer, SnapshotData snapshot) throws IOException {
        // ...
        try (InputStream inputStream = new ByteArrayInputStream(serializer.serialize(snapshot))) {
            DiskUtils.compressIntoZipFile(METADATA_CHILD_NAME, inputStream, outputFile, checksum);
        }
        // ...
    }
}
On our machine with average performance, it takes about 15 seconds to save a snapshot for 1 million services (serialization, compression to zip, and writing to file), so the processor cannot apply logs during this period, because the same lock is used.
In my lock-free implementation, it takes about 2 seconds to deep copy the data in the fsm's main thread, and then the processor is alive again :) The remaining operations still take 15 seconds, but they don't block the fsm's main thread.
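As a side note on why the copy taken on the fsm thread has to be deep, here is a minimal sketch. `DeepCopySnapshotSketch` and its `Meta` type are hypothetical stand-ins for the metadata classes; the point is only that a shallow map copy still shares its mutable values with the live state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch comparing a shallow and a deep copy of snapshot data.
class DeepCopySnapshotSketch {
    // Hypothetical mutable value type standing in for a metadata object.
    static final class Meta {
        String label;

        Meta(String label) {
            this.label = label;
        }

        Meta copy() {
            return new Meta(label);
        }
    }

    // Copies the map structure only; the values are still the live objects.
    static Map<String, Meta> shallowCopy(Map<String, Meta> live) {
        return new HashMap<>(live);
    }

    // Copies the map and every value, so later mutations cannot reach the copy.
    static Map<String, Meta> deepCopy(Map<String, Meta> live) {
        Map<String, Meta> copy = new HashMap<>();
        for (Map.Entry<String, Meta> entry : live.entrySet()) {
            copy.put(entry.getKey(), entry.getValue().copy());
        }
        return copy;
    }

    // Returns { label seen through the shallow copy, label seen through the deep copy }.
    static String[] demo() {
        Map<String, Meta> live = new HashMap<>();
        live.put("svc", new Meta("v1"));
        Map<String, Meta> shallow = shallowCopy(live);
        Map<String, Meta> deep = deepCopy(live);
        live.get("svc").label = "v2"; // a later apply mutates the live object
        return new String[] { shallow.get("svc").label, deep.get("svc").label };
    }

    public static void main(String[] args) {
        String[] labels = demo();
        System.out.println("shallow=" + labels[0] + ", deep=" + labels[1]);
    }
}
```

The deep copy is what costs the roughly 2 seconds on the fsm thread mentioned above; a shallow copy would be cheaper but would let the background serialization observe later applies.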
It sounds great. You can try to submit a PR.
The changes to snapshots may not be compatible with the old version, i.e., my implementation dumps/loads SnapshotData. It may need an intermediate version (write the new snapshot version, read the old snapshot version). I'm not sure how to open the PR.
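One way such an intermediate version could look is sketched below. This is an assumption-heavy illustration, not the Nacos snapshot format: it uses Java's built-in serialization for self-containment (Nacos uses its own serializer), `WrappedSnapshot` is a hypothetical stand-in for a SnapshotData implementation, and the old format is assumed to be the bare map.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: always write the new format, accept both formats on load.
class SnapshotCompatSketch {
    // Hypothetical new-format wrapper; the old format is assumed to be the bare map.
    static final class WrappedSnapshot implements Serializable {
        private static final long serialVersionUID = 1L;
        final HashMap<String, String> data;

        WrappedSnapshot(Map<String, String> data) {
            this.data = new HashMap<>(data);
        }
    }

    static byte[] serialize(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads either format and returns the plain map.
    static Map<String, String> load(byte[] raw) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            Object obj = in.readObject();
            if (obj instanceof WrappedSnapshot) {
                return ((WrappedSnapshot) obj).data; // new format
            }
            if (obj instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, String> legacy = (Map<String, String>) obj; // old format
                return legacy;
            }
            throw new IllegalStateException("unknown snapshot format: " + obj);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("svc", "meta");
        System.out.println(load(serialize(new HashMap<>(data))));
        System.out.println(load(serialize(new WrappedSnapshot(data))));
    }
}
```

A loader of this shape would let nodes running the intermediate version read snapshots written by older nodes, while nodes that still run the old version could not read the new format.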
Why? I think it just removes the read-write lock and does a deep copy when dumping the snapshot.
No data structure and no main logic is changed. Or does your implementation include other logic changes not described in the issue?
The data should be deep copied in the fsm thread, i.e., here, before line 47: https://github.com/alibaba/nacos/blob/62a106bc1cfa8e3ec18aaaa1b5a3d5d638eea6de/naming/src/main/java/com/alibaba/nacos/naming/consistency/persistent/impl/AbstractSnapshotOperation.java#L46-L62
But this is an abstract class. I have two ideas. One is to add a new interface, say SnapshotData, and let the final SnapshotOperations implement it:
public interface SnapshotData extends Serializable {
    /**
     * Release the data after the snapshot is complete.
     */
    void release();
}
// public abstract class AbstractSnapshotOperation
protected abstract SnapshotData prepareSnapshotData();
public void onSnapshotSave(Writer writer, BiConsumer<Boolean, Throwable> callFinally) {
    final SnapshotData snapshot = prepareSnapshotData(); // the main difference
    RaftExecutor.doSnapshot(() -> {
        TimerContext.start(getSnapshotSaveTag());
        try {
            callFinally.accept(writeSnapshot(writer, snapshot), null); // the main difference
        } catch (Throwable t) {
            Loggers.RAFT.error("[AbstractSnapshotOperation] fail to compress snapshot, path={}, file list={}.",
                    writer.getPath(), writer.listFiles(), t);
            callFinally.accept(false, t);
        } finally {
            TimerContext.end(getSnapshotSaveTag(), Loggers.RAFT);
        }
    });
}
Another option is to make AbstractSnapshotOperation a generic class, like this:
public abstract class AbstractSnapshotOperation<T> implements SnapshotOperation
and the final SnapshotOperations would look like this:
public abstract class AbstractMetadataSnapshotOperation<T> extends AbstractSnapshotOperation<T>
and
public class InstanceMetadataSnapshotOperation extends AbstractMetadataSnapshotOperation<ConcurrentMap<Service, ConcurrentMap<String, InstanceMetadata>>>
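To make the generic option more concrete, here is a simplified, self-contained sketch. The class names carry a `Sketch` suffix because they are hypothetical and much smaller than the real Nacos classes: there is no Writer, no callback, and `writeSnapshot` returns a string instead of writing a file.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical, simplified shape of the generic option; not the actual Nacos classes.
abstract class GenericSnapshotOperationSketch<T> {
    // Runs on the fsm thread: must return a copy that later applies cannot touch.
    protected abstract T prepareSnapshotData();

    // Runs on a background thread with the prepared copy.
    protected abstract String writeSnapshot(T snapshot);

    public final CompletableFuture<String> onSnapshotSave() {
        final T snapshot = prepareSnapshotData();
        return CompletableFuture.supplyAsync(() -> writeSnapshot(snapshot));
    }
}

// Hypothetical concrete operation that fixes T to the shape of its own data.
class CounterSnapshotOperationSketch extends GenericSnapshotOperationSketch<Map<String, Integer>> {
    final ConcurrentMap<String, Integer> live = new ConcurrentHashMap<>();

    @Override
    protected Map<String, Integer> prepareSnapshotData() {
        // Values are immutable Integers, so copying the map is enough here.
        return new TreeMap<>(live);
    }

    @Override
    protected String writeSnapshot(Map<String, Integer> snapshot) {
        return snapshot.toString(); // stand-in for serialization and file output
    }

    public static void main(String[] args) {
        CounterSnapshotOperationSketch op = new CounterSnapshotOperationSketch();
        op.live.put("a", 1);
        CompletableFuture<String> saved = op.onSnapshotSave();
        op.live.put("b", 2); // applied after the snapshot point
        System.out.println(saved.join());
    }
}
```

Compared with the SnapshotData interface, the type parameter lets each operation keep its existing data type for the copy, at the cost of a longer class declaration.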
v2.2.0
Currently the lock is acquired at the beginning and released only when everything is done: https://github.com/alibaba/nacos/blob/62a106bc1cfa8e3ec18aaaa1b5a3d5d638eea6de/naming/src/main/java/com/alibaba/nacos/naming/consistency/persistent/impl/AbstractSnapshotOperation.java#L46-L62
In writeSnapshot, e.g. in AbstractMetadataSnapshotOperation, it serializes the data and compresses it into a zip file:
https://github.com/alibaba/nacos/blob/62a106bc1cfa8e3ec18aaaa1b5a3d5d638eea6de/naming/src/main/java/com/alibaba/nacos/naming/core/v2/metadata/AbstractMetadataSnapshotOperation.java#L47-L57
I think the lock can be released after serialization, not after writing the zip file.
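A minimal sketch of that narrower lock scope, under stated assumptions: `EarlyUnlockSnapshotSketch` is hypothetical, `toString()` stands in for the real serializer, the zip is written to memory with `java.util.zip` instead of through DiskUtils, and the `"metadata"` entry name is made up. The `lockHeldWhileZipping` field exists only so the demonstration can check where the lock was released.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch: hold the write lock for serialization only, not for compression.
class EarlyUnlockSnapshotSketch {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final Map<String, String> state = new TreeMap<>();
    boolean lockHeldWhileZipping; // recorded for demonstration only

    byte[] saveSnapshot() {
        byte[] serialized;
        lock.writeLock().lock();
        try {
            // Only the step that reads the live state needs the lock.
            serialized = state.toString().getBytes(StandardCharsets.UTF_8);
        } finally {
            lock.writeLock().unlock();
        }
        // From here on only the private byte array is used, so applies may proceed.
        lockHeldWhileZipping = lock.isWriteLockedByCurrentThread();
        return zip(serialized);
    }

    static byte[] zip(byte[] payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("metadata")); // hypothetical entry name
            zip.write(payload);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        EarlyUnlockSnapshotSketch op = new EarlyUnlockSnapshotSketch();
        op.state.put("svc", "meta");
        byte[] zipped = op.saveSnapshot();
        System.out.println("zip bytes=" + zipped.length + ", lockHeldWhileZipping=" + op.lockHeldWhileZipping);
    }
}
```

This shortens the time applies are blocked to the serialization step. The serialized bytes still have to fit in memory, which is also the case for the `serializer.serialize(...)` call in the existing writeSnapshot.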