lurenpluto commented 1 year ago

Consider adding a GC mechanism, which has been planned for a long time and is a relatively complicated mechanism, so first complete the first version based on a full scan

GC mainly includes two types of data:

Object Involves named-object-cache related modules
Chunk Involves named-data-cache/tracker/chunk-manager related modules

The relevant points are as follows:

With the global-state (root-state and local-cache) as the key point, if the object and chunk are not associated with the latest version of the state, then consider entering the gc
Scan with DEC as the granularity, complete scan at the same time, can also complete the data integrity detection of DEC
The first version is mainly based on full GC, that is, a full scan is performed every time GC, and incremental scan is not considered for the time being
During the global-state scanning process of GC, the related configuration of rmeta and object meta needs to be considered, mainly

If a path is marked as virtual, it will not enter the next level of scanning, so during the scanning process, it is necessary to apply rmeta to each level of path to determine whether the path is virtual
The default non-objectmap leaf node object scans at most one-level reference objects, but this default configuration can be modified through object meta, so during the scanning process, object meta also needs to be applied to non-objectmap objects to determine the references that need to be scanned object depth

The GC mechanism will have a generational design. Each scan will tag the associated object/chunk and design related strategies. Data beyond the specified generation will be recycled, which may involve the modification and upgrade of the database format
Other global roots other than global-state need to be considered, such as
1. Globally cached device/people (does gc be required?)
2. storageobject etc.

streetycat commented 1 year ago

Is the GroupState included, or the GroupState is a GlobalState?
I have learn the solutions for GC just a moment ago, did you mean it will use ZGC or similar.

My simple flow:

Increase the generation for every object when the GC start.
The GC thread will visit the objects mount in the state tree, and re-initialize the generation when it's visited. It's also occured when the client thread visit a object with old generation.
Remove the objects with great generation beyond the specified value.

And the clients will access the CyfsStack without stop.

lurenpluto commented 1 year ago

Is the GroupState included, or the GroupState is a GlobalState?

I have learn the solutions for GC just a moment ago, did you mean it will use ZGC or similar.

My simple flow:

Increase the generation for every object when the GC start.

The GC thread will visit the objects mount in the state tree, and re-initialize the generation when it's visited. It's also occured when the client thread visit a object with old generation.

Remove the objects with great generation beyond the specified value.

And the clients will access the CyfsStack without stop.

GroupState is a type of global-state, which is equivalent to the root-state in the Garbage Collection (GC) context. It needs to be independently scanned and marked.
Similar to ZGC, the overall idea of GC is, "In summary, GC watches which objects can be reached from our application through a chain of references and frees up the ones we can't reach." The references here mainly include global-state and some key detached objects, but the GC strategies used may differ.

Your flow is almost right. The current approach under consideration is to scan, mark, and periodically collect. The marking occurs either when the object or chunk is scanned by the GC operation or when it is used. This process needs further refinement. For example, the current mechanism for updating the "last-access-time" when a chunk is used is missing, and this should be taken into account as well.

The GC (Garbage Collection) work should ideally be executed quietly during the idle time of the cyfs-stack, making it "unnoticeable" to external users.

streetycat commented 1 year ago

I see, it's great, I have used it to clear the timeout cache(SN, and Group), but it's simple.

I learned it theoretically today.

waterflier commented 1 year ago

I think some pseudocode is needed to show and help application developers understand this process.

And I believe, such as this process is related to the implementation of RootState. It should be because before there is no concept of Root State, we have no way to accurately distinguish between Object saving and caching.

Since it is strongly related to the operation of Root State, there are two types of GC

TYPE I : post-scan based

The post-scan is also similar to the current Root State backup, starting from the root path, traversing all paths and marking the corresponding objects as "valid". Objects not marked as valid can be deleted after the scan is complete.

During the scan, the system is read-only.

TYPE II: based on real-time operation

We have APIs for adding objects and deleting objects to the Root State, and in these two APIs, the reference counts of objects are operated (can be asynchronous). When the system has no unscanned operations to add objects, it can execute the GC function to delete objects whose reference count is less than 1.

Perhaps we can further illustrate the difference between these two directions through pseudocode.

The system itself has strong and consistent synchronization and backup requirements, and it already has the implementation of TYPE I.

The implementation of TYPE II is related to the transactional nature of the Root State API. Optimization can achieve "only scan the changed part". It seems that it can reduce the time of system frozen, but there may be some insurmountable difficulties?

streetycat commented 1 year ago

I learn a word free root in this issue.

does it mean as the global roots here? How they will behave on GC?

lurenpluto commented 1 year ago

I learn a word free root in this issue.

does it mean as the global roots here? How they will behave on GC?

roots in the GC mechanism is a very key role, you can refer to the following introduction document

java-gc-roots

GC Root Definition Let's first define what GC roots are. GC root is a term used in the context of garbage collection in Java. As the name suggests, GC roots are starting points for the garbage collector processes. In general, all objects directly or indirectly referred from a GC root are not garbage collected.

In our GC system, roots include all global-state roots and a list of free objects (which may be outside of the global-state)

lurenpluto commented 1 year ago

I think some pseudocode is needed to show and help application developers understand this process.

And I believe, such as this process is related to the implementation of RootState. It should be because before there is no concept of Root State, we have no way to accurately distinguish between Object saving and caching.

Since it is strongly related to the operation of Root State, there are two types of GC

TYPE I : post-scan based

The post-scan is also similar to the current Root State backup, starting from the root path, traversing all paths and marking the corresponding objects as "valid". Objects not marked as valid can be deleted after the scan is complete.

During the scan, the system is read-only.

TYPE II: based on real-time operation

We have APIs for adding objects and deleting objects to the Root State, and in these two APIs, the reference counts of objects are operated (can be asynchronous). When the system has no unscanned operations to add objects, it can execute the GC function to delete objects whose reference count is less than 1.

Perhaps we can further illustrate the difference between these two directions through pseudocode.

The system itself has strong and consistent synchronization and backup requirements, and it already has the implementation of TYPE I.

The implementation of TYPE II is related to the transactional nature of the Root State API. Optimization can achieve "only scan the changed part". It seems that it can reduce the time of system frozen, but there may be some insurmountable difficulties?

The impact of rmeta and object meta

Another key concept in GC is rmeta and object meta, which will directly affect the result of GC.

Can configure whether the path on the global-state is virtual or not.
Can configure the reference depth of the path on the global-state
Can configure the reference depth of certain objects

So whether it is a full scan or an incremental scan based on global-state change, it can be done based on the above meta information, which is where the complexity of GC lies

GC mechanism based on generation

During GC full scan, it is not necessary to set the whole global-state to read-only mode: on the one hand, the full scan may take a long time, and setting it to read-only will have too much impact on the whole system; on the other hand, it is not necessary, because the full scan is based on the snapshot of global-state, and if changes occur during the scan, it should not affect the result of the GC. However, there is a point to note is that based on the result of one GC scan, all unrelated objects can not be cleared, but should be based on the garbage tagging and recycling of generations, there are several principles need to be observed as follows:

Each full scan, set the generation marker for all the objects/chunks referenced in that scan
Perform full scans periodically, so the generation markers of the same object/chunk may change and be recursive
Cyfs-stack upper layer needs to update the corresponding last_access_time each time the object/chunk is read
The real GC operation is to periodically clear all objects/chunks that meet the following conditions
- last_access_time is less than the specified date, such as a day or a week
- The generation marker is less than the specified generation, either by generation version (incremental) or by GC scan date

GC scan based on global-state real-time changes

From the reference-based GC principle mentioned above, the object level should not be based on reference count, but on whether the generation marker and last access time are scanned; incremental GC scan based on global-state real-time changes can be a supplement to the above full-scan GC, but only the new sub tree needs to be scanned, and there is no need to scan the removed sub tree and decrement the reference count.

buckyos / CYFS

GC mechanism based on full scan #155

The impact of rmeta and object meta

GC mechanism based on generation

GC scan based on global-state real-time changes