Closed viveksahu26 closed 1 month ago
@viveksahu26 same here is this ready for review
@viveksahu26 same here is this ready for review
Yeah, this one too.
@viveksahu26 this appears to be too complex. Lets discuss this
Yeah sure @riteshnoronha !! So, basically originally we were fetching records from database in 3 ways:
SBOM_SPEC
, SBOM_SPEC_VERSION
, SBOM_TIMESTAMP
, etc)SPDX Elements
, SBOM Format
SBOM Build Information
, etc)For example, to get all records
for particular Key
, we had to loop over all records
and check each record whether it contain that Key
or not, If contains then append it to final list of records containing that Key
and return that list as the loop ends. Similarly we perform operation to get records for Key
as well similalrly for Key
and ID
. So, if we conclude it, we can see that at the end we return all the values(i.e. records
) for a particular Key
.
In the new changes, we have simplified it to Map data structure. As it has the functionality to store all values for a particular key. So, earlier when we were adding any record we were simply appending to the list of records.
func (d *db) addRecord(r *record) {
d.records = append(d.records, r)
}
But, when we are adding any record, we are adding it in 3 ways:
Key
and Map value as a Record
.Id
and Map value as a Record
.Key
& Id
and Map value as a Record
.// addRecord adds a single record to the database
func (d *db) addRecord(r *record) {
d.mu.Lock()
defer d.mu.Unlock()
// store record using a key
d.keyRecords[r.check_key] = append(d.keyRecords[r.check_key], r)
// store record using a id
d.idRecords[r.id] = append(d.idRecords[r.id], r)
if d.idRecords[r.id] == nil {
d.keyIdRecords[r.id] = make(map[int][]*record)
}
// store record using a key and id
d.keyIdRecords[r.id][r.check_key] = append(d.keyIdRecords[r.id][r.check_key], r)
d.allIds[r.id] = struct{}{}
}
We are adding in this way, so that at the time of fetching these records from any any type of key, whether it be Key
, or ID
or Key and ID
, we don't have to loop over records and check each records whether it contains that key or not.
Now coming to the concept of Mutex
that we are using every time as below:
Mutex
: ensures that only one thread or one process or one goroutine can access the critical section of code or shared resource at any given time.
NOTE
: right now it struck me at the time of writing, we can remove mutex because we are not using goroutines.
Here we are using ReadWrite
Mutex type.
mu sync.RWMutex
Each time when write operation is being done to db, then we want to make sure that no other process or threads or goroutine is trying to write. That's why we are locking the operation. And once written it is unlocked. And then next threads or goroutine waiting in a queue will perform the writing operation.
d.mu.Lock()
defer d.mu.Unlock()
and
Below Mutex is used in case when multiple process or threads or goroutine is trying to read the database, in that
d.mu.RLock()
defer d.mu.RUnlock()
Let me know if you have any question ?
We dont use go-routines currently. Lets keep it even simpler for this release.
Removed Mutex stuffs or any related to goroutines.
Closes issue: #265 This pull request introduces an optimized version of the db package to improve the performance of our database operations. As well as, benchmark tests been added to measure the performance and efficiency of the new implementation.
Changes modifies are:
check_key
.id
.check_key
andid
.Storing records in map is highly optimized version at the retrieval time, which result into O(1) time complexity. Whereas, original db uses loops to retrieve records and that results into O(n) time complexity.
Use of Mutex concept:
Apart from that added a benchmark test, to measure the performance of key database operations::
check_key
.id
.check_key
andid
.~Implementation of goroutines for parallel tasking:~ ~- Goroutines are functions or methods that run concurrently with other functions or methods. They are created using the go keyword. Whereas channels are used to send and receive values, allowing goroutines to synchronize their operations.~ ~
sync.RWMutex
will make ensure that concurrent access to the database is handled correctly.~