onSnapshotSave cloud release lock earlier?

Lin-1997 commented 1 year ago

v2.2.0

currently locked in the beginning and unlocked when everything is done. https://github.com/alibaba/nacos/blob/62a106bc1cfa8e3ec18aaaa1b5a3d5d638eea6de/naming/src/main/java/com/alibaba/nacos/naming/consistency/persistent/impl/AbstractSnapshotOperation.java#L46-L62

in writeSnapshot, e.g. in AbstractMetadataSnapshotOperation, it serializes data and compresses it into zip file.

https://github.com/alibaba/nacos/blob/62a106bc1cfa8e3ec18aaaa1b5a3d5d638eea6de/naming/src/main/java/com/alibaba/nacos/naming/core/v2/metadata/AbstractMetadataSnapshotOperation.java#L47-L57

i think the lock can be released after serialization, not after writing the zip file.

KomachiSion commented 1 year ago

From the raft protocol defined, the snapshot operation finished when dumped to storages, I'm not sure whether jraft has controlled log apply while snapshot doing. If not, the status machine should control it.

For data consistence, the write lock is release after the snapshot operation finished.

KomachiSion commented 1 year ago

What's more, if jraft has controlled log apply while snapshot doing, the inner write lock will no effect. Because the jraft also stop the write operation by block onApply operation.

KomachiSion commented 1 year ago

So in my opinion, the write lock no need to changed for current information.

Lin-1997 commented 1 year ago

I'm not sure whether jraft has controlled log apply while snapshot doing. If not, the status machine should control it.

I think jraft do stop applying log while doing snapshot. here is jraft 1.3.12:

https://github.com/sofastack/sofa-jraft/blob/2bc70de21d0237b495e732b714c6a069e9be94f6/jraft-core/src/main/java/com/alipay/sofa/jraft/core/NodeImpl.java#L960-L985

it periodically calls handleSnapshotTimeout(), and submit a doSnapshot(done: null, sync: false) task.

https://github.com/sofastack/sofa-jraft/blob/2bc70de21d0237b495e732b714c6a069e9be94f6/jraft-core/src/main/java/com/alipay/sofa/jraft/core/NodeImpl.java#L607-L618

doSnapshot(done: null, sync: false) finally calls fsmCaller.onSnapshotSave(saveSnapshotDone).

https://github.com/sofastack/sofa-jraft/blob/2bc70de21d0237b495e732b714c6a069e9be94f6/jraft-core/src/main/java/com/alipay/sofa/jraft/storage/snapshot/SnapshotExecutorImpl.java#L381-L389

fsmCaller.onSnapshotSave() enqueue task to fsm's taskQueue.

https://github.com/sofastack/sofa-jraft/blob/2bc70de21d0237b495e732b714c6a069e9be94f6/jraft-core/src/main/java/com/alipay/sofa/jraft/core/FSMCallerImpl.java#L284-L289

all tasks in the taskQueue are handled by ApplyTaskFactory by runApplyTask().

https://github.com/sofastack/sofa-jraft/blob/2bc70de21d0237b495e732b714c6a069e9be94f6/jraft-core/src/main/java/com/alipay/sofa/jraft/core/FSMCallerImpl.java#L200-L206

https://github.com/sofastack/sofa-jraft/blob/2bc70de21d0237b495e732b714c6a069e9be94f6/jraft-core/src/main/java/com/alipay/sofa/jraft/core/FSMCallerImpl.java#L391

COMMITTED tasks and SNAPSHOT_SAVE tasks will be executed serially.

so I think jraft do stop applying log while doing snapshot with sync: false. And I found that only test code will call snapshot with sync: true.

Lin-1997 commented 1 year ago

What's more, if jraft has controlled log apply while snapshot doing, the inner write lock will no effect. Because the jraft also stop the write operation by block onApply operation.

I think the lock will take effect, since you submit a new task to do snapshot, and return. then the fms's runApplyTask() can continue to run other tasks.

https://github.com/alibaba/nacos/blob/70b96495bdc6eb6b40abade795c658605efe01bf/naming/src/main/java/com/alibaba/nacos/naming/consistency/persistent/impl/AbstractSnapshotOperation.java#L46-L62

by the way, the lock is acquired in the new thread, it may be slower than fsm's runApplyTask(), which finally calls, e.g., InstanceMetadataProcessor#onApply(), which also acquire lock to commit the new log.

so I think the snapshot may save data more than what it supposed to be

KomachiSion commented 1 year ago

If the jraft must stop apply log while doSnapshot, whether we can remove this lock directly?

the log onApply is synced and doSnapshot will stop onApply, so the fsm will not be async modified during doSnapshot.

Lin-1997 commented 1 year ago

If the jraft must stop apply log while doSnapshot, whether we can remove this lock directly?

i think the answer is yes, jraft's exmaple also doesn't lock

see:

https://github.com/sofastack/sofa-jraft/blob/26977b062d1a514e0f1590224b0405c5e93599a6/jraft-example/src/main/java/com/alipay/sofa/jraft/example/counter/CounterStateMachine.java#L85-L131

https://github.com/sofastack/sofa-jraft/blob/26977b062d1a514e0f1590224b0405c5e93599a6/jraft-example/src/main/java/com/alipay/sofa/jraft/example/counter/CounterStateMachine.java#L134-L148

https://github.com/sofastack/sofa-jraft/blob/26977b062d1a514e0f1590224b0405c5e93599a6/jraft-example/src/main/java/com/alipay/sofa/jraft/example/counter/CounterStateMachine.java#L156-L174

also in jraft-rhea:

https://github.com/sofastack/sofa-jraft/blob/26977b062d1a514e0f1590224b0405c5e93599a6/jraft-rheakv/rheakv-core/src/main/java/com/alipay/sofa/jraft/rhea/storage/KVStoreStateMachine.java#L83-L134

https://github.com/sofastack/sofa-jraft/blob/26977b062d1a514e0f1590224b0405c5e93599a6/jraft-rheakv/rheakv-core/src/main/java/com/alipay/sofa/jraft/rhea/storage/KVStoreStateMachine.java#L257-L269

Lin-1997 commented 1 year ago

on jraft's example, they copy the data for snapshot in the main thread, which is synced by the fsm, and then submit the snapshot task to a new thread

KomachiSion commented 1 year ago

I see the lock usage, It seems use to make the load snapshot and save snapshot synced.

Is Jraft make sure the load snapshot and save snapshot with a synced thread?

If so, I think we can remove the lock. But also, the lock seems no effect, because jraft use only synced thread to do snapshot.

Lin-1997 commented 1 year ago

I see the lock usage, It seems use to make the load snapshot and save snapshot synced.

did you mean the lock in SnapshotExecutorImpl? I aggre with your opinions. Not only syncs between load snapshot and save snapshot but also syncs between multi load snapshot tasks, which is periodically called. finally the fsmCaller submit task (save or load snapshot) to a disruptor, where onCommitted task is also submitted to this disruptor. so i think all load snapshot tasks, save snapshot tasks, apply log tasks and readIndex tasks are executed serially.

But also, the lock seems no effect, because jraft use only synced thread to do snapshot.

I still think the lock will take effect in current implementation, and can be removed if we deep copy the data needed for snapshot in the main fsm thread before submit a new task in RaftExecutor.doSnapshot()

I think the lock will take effect, since you submit a new task to do snapshot, and return. then the fms's runApplyTask() can continue to run other tasks.

Lin-1997 commented 1 year ago

I modified my nacos implementation to lock free, including processor and snapshot. The main idea is to deep copy the data needed for snapshot. And put the serialization in the new thread to reduce the execution time in the fsm's main thread.

Use ServiceMetadataProcessor as an example, I had implement clone() for all class needed to do snapshot, e.g. Service, ServiceMetadata, ClusterMetadata, Part of the code is as follows, omitting unimportant code:

// simply remove the lock in ServiceMetadataProcessor.

interface SnapshotData extends Serializable {
    void release(); // Release the data after the snapshot is complete.
}

class ServiceSnapshotData implements SnapshotData {
    private final Map<Service, ServiceMetadata> serviceMetadataMap;
    public void release() { serviceMetadataMap.clear(); }
}

class ServiceMetadataSnapshotOperation extends AbstractMetadataSnapshotOperation {
    protected SnapshotData prepareSnapshotData() {
        return new ServiceSnapshotData(metadataManager.getServiceMetadataSnapshot());
    }
}

class NamingMetadataManager {
    Map<Service, ServiceMetadata> getServiceMetadataSnapshot() {
        ConcurrentHashMap<> snapshot = new ConcurrentHashMap<>();
        for (key, value in serviceMetadataMap) {
            snapshot.put(key.clone(), value.clone());
        }
        return snapshot;
    }
}

// main changed
class AbstractSnapshotOperation {
    public void onSnapshotSave() {
        final SnapshotData snapshot = prepareSnapshotData(); // this is in the fsm's main thread. lock free
        RaftExecutor.doSnapshot(() -> {
            try {
                callFinally.accept(writeSnapshot(writer, snapshot), null); // serialization insider here
            } catch (Throwable t) {
                callFinally.accept(false, t);
            } finally {
                Optional.ofNullable(snapshot).ifPresent(SnapshotData::release);
            }
        });
    }

    protected abstract SnapshotData prepareSnapshotData();
}

// main changed
abstract class AbstractMetadataSnapshotOperation extends AbstractSnapshotOperation {
    protected boolean writeSnapshot(Writer writer, SnapshotData snapshot) throws IOException {
        // ...
        try (InputStream inputStream = new ByteArrayInputStream(serializer.serialize(snapshot))) {
            DiskUtils.compressIntoZipFile(METADATA_CHILD_NAME, inputStream, outputFile, checksum);
        }
        // ...
    }
}

Lin-1997 commented 1 year ago

On our machine with average performance, it takes about 15 seconds to save snapshot for 1 million services (serialization, compress to zip, and write to file), so the processor cannot apply log during this period, because the same lock is used.

In my lock free implementation, it takes about 2 seconds to deep copy data in the fsm's main thread, and then the processor is alive :) The remaining operations still take 15 seconds, but it doesn't block the fsm's main thread

KomachiSion commented 1 year ago

It sounds great, You can try to submit an PR.

Lin-1997 commented 1 year ago

the changes to snapshots may be not compatible with old version, i.e., my implement dumps/loads SnapshotData. May need an intermediate version (write new version snapshot, read old version snapshop). I'm not sure how to PR

KomachiSion commented 1 year ago

Why? I think it just remove the read write lock and do deep copy for dump snapshot.

No data structure changed and main login changed. Or your implementation has other more logic changed not described in issue?

Lin-1997 commented 1 year ago

the data should be deep copied in the fsm thread, i.e., in here, before Line 47: https://github.com/alibaba/nacos/blob/62a106bc1cfa8e3ec18aaaa1b5a3d5d638eea6de/naming/src/main/java/com/alibaba/nacos/naming/consistency/persistent/impl/AbstractSnapshotOperation.java#L46-L62

But this is an abstract class. I have two ideas, one is making a new interface, said SnapshotData. And let the final SnapshotOperations implement it:

public interface SnapshotData extends Serializable {
    /**
     * Release the data after the snapshot is complete.
     */
    void release();
}

//  public abstract class AbstractSnapshotOperation

protected abstract SnapshotData prepareSnapshotData();

public void onSnapshotSave(Writer writer, BiConsumer<Boolean, Throwable> callFinally) {
    final SnapshotData snapshot = prepareSnapshotData(); // the main different 
    RaftExecutor.doSnapshot(() -> {
            TimerContext.start(getSnapshotSaveTag());
            try {
                callFinally.accept(writeSnapshot(writer, snapshot), null); // the main different 
            } catch (Throwable t) {
                Loggers.RAFT.error("[AbstractSnapshotOperation] fail to compress snapshot, path={}, file list={}.",
                    writer.getPath(), writer.listFiles(), t);
                callFinally.accept(false, t);
            } finally {
                TimerContext.end(getSnapshotSaveTag(), Loggers.RAFT);
            }
        });
    }

another choice is making the AbstractSnapshotOperation class a generic class, like this: public abstract class AbstractSnapshotOperation<T> implements SnapshotOperation and the final SnapshotOperations should be like this: public abstract class AbstractMetadataSnapshotOperation<T> extends AbstractSnapshotOperation<T> and public class InstanceMetadataSnapshotOperation extends AbstractMetadataSnapshotOperation<ConcurrentMap<Service, ConcurrentMap<String, InstanceMetadata>>>

alibaba / nacos

onSnapshotSave cloud release lock earlier? #10496