vmware-archive / go-pmem-transaction

Golang library for using persistent memory
Other
29 stars 5 forks source link

API changes #11

Open jerrinsg opened 5 years ago

jerrinsg commented 5 years ago

This commit introduces API changes based on feedback received during RADIO.

1) Init

Init() earlier returned a bool (say firstInit) to indicate whether this was a first time initialization or not. I have changed this to return an error instead.

The firstInit bool was not helpful as even if this is not a first time initialization, application have to ensure that the data inside the root object is valid (through a magic number, member-wise data comparison, etc.). Any error during persistent memory initialization earlier led to an application crash. Changing the return value to an error helps to handle errors gracefully by returning a useful error message to the application.

2) Make and New

Earlier we had two APIs to get and set objects in persistent memory - New and Get for basic objects, Make and GetSlice for slices. This change combines the get and set APIs into a single API - if the named object exists, a pointer (or slice) to that object is returned. Otherwise a new object (new slice) is created.

Following shows an example application written both before and after this API change.

Preamble:

package main

import (
    "github.com/vmware/go-pmem-transaction/pmem"
    "github.com/vmware/go-pmem-transaction/transaction"
)

const (
    DATA = 0xABCD
)

// A function that populates the contents of the root object transactionally
func populateRoot(ptr *int) {
    tx := transaction.NewUndoTx()
    tx.Begin()
    tx.Log(ptr)
    *ptr = DATA
    tx.End()
    transaction.Release(tx)
}

Before:

func main() {
    firstInit := pmem.Init("/mnt/ext4-pmem0/database")
    var ptr *int
    if firstInit {
        // Create a new named object called dbRoot and point it to ptr
        ptr = (*int)(pmem.New("root", ptr))
        populateRoot(ptr)
    } else {
        // Retrieve the named object dbRoot
        ptr = (*int)(pmem.Get("root", ptr))
        if *ptr != DATA {
            // An object named dbRoot exists, but its initialization did not
            // complete previously.
            populateRoot(ptr)
        }
    }
    println("Data is ", *ptr)
}

After:

func main() {
    err := pmem.Init("/mnt/ext4-pmem0/database")
    if err != nil {
        log.Fatal(err)
    }
    var ptr *int
    // Create/retrieve the named object called dbRoot and point it to ptr
    ptr = (*int)(pmem.New("root", ptr))
    if *ptr != DATA {
        // This is a first time initialization, or previous initialization did
        // not complete succesfully.
        populateRoot(ptr)
    }
    println("Data is ", *ptr)
}

Suggestions welcome.

TODO - currently Make() and New() crashes if pmem is not initialized. This should be fixed.

vmwclabot commented 5 years ago

@jerrinsg, you must sign every commit in this pull request acknowledging our Developer Certificate of Origin before your changes are merged. This can be done by adding Signed-off-by: John Doe <john.doe@email.org> to the last line of each Git commit message. The e-mail address used to sign must match the e-mail address of the Git author. Click here to view the Developer Certificate of Origin agreement.

pratapsu commented 5 years ago

Jerrin: this change set seems to have two different changes in it. First, there is a changes to Init(). Then, there is also changes related to small and large logs. Why are these two mixed up into one pull request?

pratapsu commented 5 years ago

Jerrin - just in case you arent seeing what I'm seeing, here are the commands I executed ...

% git branch --list

% git diff master --name-only examples/database.go pmem/README.md pmem/pmem.go pmemtest/namedFuncs_test.go transaction/bitmap.go transaction/redoTx.go transaction/transaction.go transaction/undoTx.go txtest/txLog_test.go

% git diff master transaction/redoTx.go

will show you diffs that appear unrelated to any API changes. That's my concern.

pratapsu commented 5 years ago

Jerrin - just in case you arent seeing what I'm seeing, here are the commands I executed ...

% git branch --list

  • apiChanges codeCleanup master textFix travis

% git diff master --name-only examples/database.go pmem/README.md pmem/pmem.go pmemtest/namedFuncs_test.go transaction/bitmap.go transaction/redoTx.go transaction/transaction.go transaction/undoTx.go txtest/txLog_test.go

% git diff master transaction/redoTx.go

will show you diffs that appear unrelated to any API changes. That's my concern.

The command

% git diff master

will compare the top of the apiChanges branch with the top of the master branch. This is not what we need. We need to compare the top of the apiChanges branch to the base of the apiChanges branch (or, where the master used to be when the apiChanges branch was created from it). To do so, one needs to use:

% git diff master...

This produces the same list of modified files as what this web interface produces. Mystery solved. Use git diff master... from now for this purpose.

pratapsu commented 5 years ago

Now that I'm seeing the correct set of files, I don't understand why you changed the calls to PersistRange in undoTx.go to FlushRange. First, it appears unrelated to the API changes, Second, I don't see any justification for this change in your review request.

jerrinsg commented 5 years ago

Hi Pratap, The flush API call changes were part of #9. The reason PersistRange() was changed to FlushRange() in lines 243, 244, 278, 279, 363, 368, 419 is that these calls are followed by a call to updateLogTail which issues a store fence before updating the value of tail. flushRange() API only does a cache flush whereas PersistRange does a cache flush followed by a store fence. Therefore by making these changes, we can avoid a lot of unnecessary fence() calls.

jerrinsg commented 5 years ago

I have rebased the changes on top master just now.

pratapsu commented 5 years ago

Hi Pratap, The flush API call changes were part of #9. The reason PersistRange() was changed to FlushRange() in lines 243, 244, 278, 279, 363, 368, 419 is that these calls are followed by a call to updateLogTail which issues a store fence before updating the value of tail. flushRange() API only does a cache flush whereas PersistRange does a cache flush followed by a store fence. Therefore by making these changes, we can avoid a lot of unnecessary fence() calls.

Ok, I have no further comments on this change set.