paypal / junodb

JunoDB is PayPal's home-grown secure, consistent and highly available key-value store providing low, single digit millisecond, latency at any scale.
Apache License 2.0

Refactors `EtcdReader.readNodesShards` method for More Efficient Zone Initialization and Node Shards Assignment #153

Open KhanSufiyanMirza opened 11 months ago

KhanSufiyanMirza commented 11 months ago

Issue Overview:

In `etcdreader.go` (package `etcd`), the `readNodesShards` method reads the node-to-shard assignment through the etcd reader. When `c.Zones[zoneid] == nil`, it calls `cluster.NewZoneFromConfig` to initialize the zone. However, `cluster.NewZoneFromConfig` does more than initialize the zone: it also runs `zone.initShardsAsssignment(numZones, numShards)` to populate the nodes. That work is redundant here, because `readNodesShards` then overwrites the node data itself through `c.Zones[zoneid].Nodes[nodeid].StringToNode`.
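The redundancy described above can be shown with a small, self-contained sketch. Everything in it is a simplified stand-in rather than the actual Juno code: `Node`, `Zone`, `newZoneFromConfigSketch`, `newZoneSketch`, `loadFromStore`, and the `populated` counter are hypothetical names, and the sketch allocates a full-length `Nodes` slice for simplicity. It only illustrates that an eagerly populating constructor does work that the reader immediately overwrites.

```go
package main

import "fmt"

// Node and Zone are simplified stand-ins for the cluster package types.
type Node struct {
	Zoneid, Nodeid uint32
	Shards         []uint32
}

type Zone struct {
	Zoneid   uint32
	NumNodes uint32
	Nodes    []Node
}

// populated counts how many Node values a constructor has filled in.
var populated int

// newZoneFromConfigSketch mimics a constructor that eagerly fills every node.
func newZoneFromConfigSketch(zoneid, numNodes uint32) *Zone {
	z := &Zone{Zoneid: zoneid, NumNodes: numNodes, Nodes: make([]Node, numNodes)}
	for i := range z.Nodes {
		z.Nodes[i] = Node{Zoneid: zoneid, Nodeid: uint32(i)}
		populated++
	}
	return z
}

// newZoneSketch mimics the proposed constructor: allocate, but do not populate.
func newZoneSketch(zoneid, numNodes uint32) *Zone {
	return &Zone{Zoneid: zoneid, NumNodes: numNodes, Nodes: make([]Node, numNodes)}
}

// loadFromStore stands in for the reader overwriting each node with stored data.
func loadFromStore(z *Zone, shards map[int][]uint32) {
	for nodeid, s := range shards {
		z.Nodes[nodeid] = Node{Zoneid: z.Zoneid, Nodeid: uint32(nodeid), Shards: s}
	}
}

func main() {
	stored := map[int][]uint32{0: {0, 2}, 1: {1, 3}}

	populated = 0
	eager := newZoneFromConfigSketch(0, 2)
	loadFromStore(eager, stored)
	fmt.Println("eager constructor populated", populated, "nodes that were then overwritten")

	populated = 0
	lazy := newZoneSketch(0, 2)
	loadFromStore(lazy, stored)
	fmt.Println("lazy constructor populated", populated, "nodes before loading")
}
```

Both zones end up with the same node data after `loadFromStore`; the only difference is the discarded work done by the eager constructor.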

Proposed Solution:

To eliminate this redundancy, we propose refactoring `readNodesShards` to call a new `cluster.NewZone` function instead of `cluster.NewZoneFromConfig`. `cluster.NewZone` initializes the zone without populating the `Nodes` field, leaving that to `readNodesShards`, which already fills in the node data it reads. Node data is then written once instead of being generated during zone initialization and immediately overwritten.

Impact:

This change removes the redundant population of the `Nodes` field during zone initialization in `readNodesShards`. It is expected to reduce unnecessary work when reading shard assignments, which should improve performance and resource utilization on that path.

Related Code:

Here's the proposed change to the readNodesShards function:

```go
// Replace this line
c.Zones[zoneid] = cluster.NewZoneFromConfig(uint32(zoneid), uint32(nodeid+1), c.NumZones, c.NumShards)

// With this line
c.Zones[zoneid] = cluster.NewZone(uint32(zoneid), uint32(nodeid+1))
```

And in the `cluster` package, add:

```go
// NewZone creates a new zone with the specified attributes.
// Unlike NewZoneFromConfig, it does not populate shard assignments.
func NewZone(zoneid uint32, numNodes uint32) *Zone {
    zone := newZone(zoneid, numNodes)
    return &zone
}

// newZone initializes and returns a new zone with the specified attributes.
func newZone(zoneid uint32, numNodes uint32) Zone {
    return Zone{
        Zoneid:   zoneid,
        NumNodes: numNodes,
        // Length 1, capacity numNodes: only Nodes[0] is addressable until the slice is grown.
        Nodes: make([]Node, 1, numNodes),
    }
}
```
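One detail of the allocation above is worth illustrating: `make([]Node, 1, numNodes)` yields a slice of length 1 and capacity `numNodes`, so indexing `Nodes[nodeid]` for `nodeid >= 1` panics unless the slice is grown first. The sketch below shows that behavior with a stand-in `Node` type. `newNodes` and `ensureIndex` are hypothetical helpers, not part of the Juno codebase, and whether `readNodesShards` already grows the slice elsewhere is not something this sketch assumes either way.

```go
package main

import "fmt"

// Node is a simplified stand-in for cluster.Node.
type Node struct{ Nodeid uint32 }

// newNodes mirrors the allocation used in the proposed newZone.
func newNodes(numNodes uint32) []Node {
	return make([]Node, 1, numNodes)
}

// ensureIndex is a hypothetical helper that returns a slice in which
// nodes[i] is addressable, reusing existing capacity when possible.
func ensureIndex(nodes []Node, i int) []Node {
	if i < len(nodes) {
		return nodes
	}
	if i < cap(nodes) {
		// Reslice within capacity: no allocation, new elements are zero values.
		return nodes[:i+1]
	}
	grown := make([]Node, i+1)
	copy(grown, nodes)
	return grown
}

func main() {
	nodes := newNodes(3)
	fmt.Println("len:", len(nodes), "cap:", cap(nodes))

	// nodes[2] would panic here; grow the slice first.
	nodes = ensureIndex(nodes, 2)
	nodes[2].Nodeid = 2
	fmt.Println("len:", len(nodes), "cap:", cap(nodes))
}
```

If the calling code always indexes by `nodeid` directly, allocating with `make([]Node, numNodes)` would be a simpler alternative to growing the slice on demand.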

Additional Context:

This change is suggested as part of ongoing efforts to optimize and improve the codebase of the Juno project. It aims to make zone initialization and Nodes Shards assignment more efficient while maintaining code correctness.