graygnuorg / go-gdbm

Go interface to GNU DBM library
GNU General Public License v3.0
1 stars 0 forks source link

go-gdbm

This package provides Go API for GNU DBM, a library that implements a hashed database on a disk file. It aims to give a Go programmer access to all features of GDBM. Naturally, these features evolved over time, so special attention has been paid to supporting a wide variety of GDBM versions. At the time of this writing, the go-gdbm package is known to work with GDBM versions ranging from the recent 1.23 down to 1.12.

The package is written by a long-time maintainer of GDBM, Sergey Poznyakoff. It is an official GNU software.

Notice, that there exist several eponymous Go packages, with names differing ony in module prefix. This package is not based on any of them nor does it have any code in common with them.

Opening and Closing a Database

The simplest way to open a database is using the Open function:

   import "github.com/graygnuorg/go-gdbm"

   db, err := gdbm.Open("input.gdbm", gdbm.ModeReader)
   if err != nil {
       panic(err)
   }

The function takes two parameters: name of the database file and opening mode. The mode decides how to open the database and can have the following values:

If you need more control over the opening process, use the OpenConfig function instead. It takes a single argument: a DatabaseConfig structure that controls how the database should be opened or created. The structure has the following fields:

The two fields below are used only when creating a new database (i.e. when Mode is set to ModeNewdb or ModeWrcreat, and the database does not exist).

An example of using the OpenConfig function:

   import "github.com/graygnuorg/go-gdbm"

   db, err := gdbm.OpenConfig(gdbm.DatabaseConfig{FileName: "file.gdbm",
                          Mode: gdbm.ModeNewdb,
                          Flags: gdbm.OF_NOMMAP,
                          FileMode: 0600})
   if err != nil {
       panic(err)
   }

To close a database, use the Close method:

   db.Close()

Error Handling

Most GDBM function return a pair of values: an actual result and error status. Some functions return only error status. The error status is an object of type GdbmError - a modification of error which, in addition to the methods provided by the standard interface, provides the following:

The gdbm library exports errors corresponding to all GDBM error codes. Error names are derived from the corresponding GDBM error constant using the following algorithm:

  1. The initial GDBM_ prefix is removed.
  2. The remaining string is split on the underscore characters into a list of strings.
  3. In each obtained string, all characters except the first are converted to lower case.
  4. The strings are concatenated and prefixed with Err.

Thus, for example, original error code GDBM_CANNOT_REPLACE yields ErrCannotReplace.

The library defines an additional error: ErrNotImplemented. This error is returned if the called function is not supported by the version of libgdbm the library is linked with.

Error matching

Errors returned by GDBM functions can be matched (using the errors.Is method) against GDBM error constants and against system errors, if a system error is associated with the GDBM one. For example, the following code tests if opening the database failed because the database file didn't exist:

    db, err := gdbm.Open("in.db", gdbm.ModeReader)
    if err != nil {
    if errors.Is(err, gdbm.ErrFileOpenError) && errors.Is(err, os.ErrNotExist) {
        // Handle the error here
    }
    }

Looking up a Key

Both keys and values stored in the database are represented by the Go type []byte. To look up a key, use the Fetch method:

   func (db *gdbm.Database) Fetch(key []byte) (value []byte, err error)

On success, the method returns the value obtained from the database and nil as error. If the value is not found, the ErrItemNotFound error is returned. If an error occurred, another error code can be returned. In case of error return, the value returned is not specified.

An example of using this function:

    value, err := db.Fetch(key)
    if err == nil {
       // Use the `value`
    } else if errors.Is(err, ErrItemNotFound) {
       fmt.Println("key not found")
    } else {
       panic(err)
    }

Note: when using string value as a key, many applications account for the terminating \0 character when computing its length. When accessing such database files from Go, it is important to include the 0 byte into the key, e.g.

    value, err := db.Fetch(append([]byte(keyString), 0))

Storing a Key/Value Pair

The Store method stores a key/value pair into a database:

    func (db *gdbm.Database) Store(key []byte, value []byte, replace bool) error

On success, nil is returned. When attempting to store a key that already exists in the database, the behavior depends on the value of replace parameter. If replace is false the ErrCannotReplace error is returned. If replace is true, the new value will be stored, replacing the old one.

Example:

    err = db.Store(key, value, false)
    if err != nil {
    panic(err)
    }

Removing a Key/Value Pair

    func (db *gdbm.Database) Delete(key []byte) error

The Delete method returns nil on success. If the requested key was not found in the database, ErrItemNotFound is returned. Otherwise, an error describing the failure is returned:

    err := db.Delete(key)
    if err != nil && !errors.Is(err, ErrItemNotFound) {
    panic(err)
    }

Iterating Over All Keys

To iterate over all keys in the database, use the following approach:

    next := db.Iterator()
    var key []byte
    var err error
    for key, err = next(); err == nil; key, err = next() {
    // Do something with `key`
    }
    if !errors.Is(err, ErrItemNotFound) {
    panic(err)
    }

A usual caution regarding the body of the above loop is in order. You can use key to fetch the value and do any other operation that does not modify the database. Don't ever try to update the database while iterating over it (e.g. by deleting the key or modifying its value). Doing so will lead to some keys being visited twice or not visited at all.

Inspecting the Database

    func (db *gdbm.Database) FileName() (string, error)

The FileName function returns the name of the database. Using this function is handy, in particular, if the database was created from an ASCII dump file using OpenConfig with ModeLoad.

    func (db *gdbm.Database) Count() (uint, error)

The Count method returns the number of key/value pairs in the database.

    func (db *gdbm.Database) NeedsRecovery() bool

The NeedsRecovery method returns true if the database is structurally inconsistent and needs recovery. In this case, any other method invoked on that database will return an error. To recover the database, use the Recover method.

    func (db *gdbm.Database) LastError() error

Returns the last error that was detected when operating on the database.

Dumping a Database

GDBM databases can be converted to non-searchable flat files, also known as dumps, and re-created from such files. This can be used, for example, to create back-up copies or for sending the database over the wire. Two dump formats are supported: AsciiDump and BinaryDump. The latter is a simpler format, which is retained for backward compatibility with older versions of GDBM. The recommended format is AsciiDump.

To create a dump from an open database file, use the Dump method:

   func (db *db.Database) Dump(cfg gdbm.DumpConfig) error

The DumpConfig structure provides the necessary parameters:

A simplified method is provided, that uses default settings:

    func (db *gdbm.Database) DumpToFile(filename string) error

This method dumps the database to the specified file name in AsciiDump format. If the output file exists, it will be silently overwritten. If the file is created, its mode is set to 0666 modified by the system umask.

Loading a Database

There are two ways to re-create a database from an existing dump file. First, if the dump is in AsciiDump format, you can use OpenConfig function with the Mode set to ModeLoad. The FileName field must be set to the name of the dump file. For example, to re-create the database from the dump file staff.dump:

   db, err := gdbm.OpenConfig(gdbm.DatabaseConfig{FileName: "staff.dump", Mode: gdbm.ModeLoad})
   if err != nil {
       panic(err)
   }

Another approach is to use the Load method on an existing database:

    func (db *gdbm.Database) Load(cfg gdbm.DumpConfig) error

The method takes as an argument a DumpConfig structure, discussed above. Its fields have the following meaning:

The FileMode and Format fields are ignored.

A simplified interface is provided:

    func (db *gdbm.Database) LoadFromFile(filename string) error

It is equivalent to

    db.Load(DumpConfig{FileName: filename, Rewrite: true})

Recovering Structural Consistency

Certain errors (such as write error when saving stored key) can leave database file in structurally inconsistent state. When such a critical error occurs, the database file is marked as needing recovery. Subsequent calls to any GDBM methods on that database (except Recover and NeedRecovery), will return immediately with the ErrNeedRecovery code.

The Recover method attempts to recover the database into usable state:

    func (db *gdbm.Database) Recover(cfg gdbm.RecoveryConfig) (stat *gdbm.RecoveryStat, err error)

The RecoveryConfig structure controls the recovery process:

The following three fields are used when set to a non-0 value:

On success, the function returns a reference to a RecoveryStat value:

A simplified interface is provided by the Reorganize method:

    func (db *gdbm.Database) Reorganize() error

This method forces database recovery.

Examining and Changing Database Format

In GDBM version 1.21 or later, databases can be stored on disk in two distinct formats: standard format, and extended (or numsync), format. The extended format is recommended to use with crash tolerance mode.

To examine the current database format, use the IsNumsync method:

    numsync, err := db.IsNumsync()
    if err == nil {
    if numsync {
        // Database is in extended format
    } else {
        // Database is in standard format
    }
    } else {
    // An error occurred
    }

The format is normally determined when the database is created. By default, databases are created in standard format. To create a database in extended format, add the OF_NUMSYNC flag to the DatabaseConfig.Flags field:

    db, err := gdbm.OpenConfig(gdbm.DatabaseConfig{FileName: "file.db",
                           Mode: gdbm.ModeNewdb,
                           Flags: gdbm.OF_NUMSYNC,
                           FileMode: 0600})

To change the format of an existing database, use the Convert method. It takes a single boolean parameter. If the parameter is true, the database will be converted to extended format. If it is false, the database will be converted to standard format. If the database is already in the requested format, nothing will be done. E.g.:

   err := db.Convert(true)
   if err {
       panic(err)
   }

Synchronization

The Sync method synchronizes the changes in db with its disk file:

    func (db *gdbm.Database) Sync() error

Crash Tolerance

Crash tolerance is a new mechanism that appeared in GDBM version 1.21. This mechanism, when used correctly, guarantees that a logically consistent recent state of application data can be recovered following a crash. Specifically, it guarantees that the state of the database file corresponding to the most recent successful call to db.Sync() can be recovered.

The feature must be enabled at compile time, and requires appropriate support from the OS and the filesystem. The design rationale of the crash tolerance mechanism is described in detail in the article, Crashproofing the Original NoSQL Key-Value Store, by Terence Kelly,

For a database to be opened with crash tolerance support, it must reside on a file system that supports reflink copying, such as XFS, BtrFS or the like. Crash tolerance is requested by setting the CrashTolerance field of the DatabaseConfig structure to true. It is also strongly advised to use databases in extended database format. To do so, when creating the database, set the gdbm.OF_NUMSYNC bit in the DatabaseConfig.Flags field, e.g:

    db, err := gdbm.OpenConfig(gdbm.DatabaseConfig{FileName: "file.db",
                           Mode: gdbm.ModeWrcreat,
                           CrashTolerance: true,
                           Flags: gdbm.OF_NUMSYNC,
                           FileMode: 0600})

Once a database is opened in crash tolerance mode, two additional snapshot files are created in the directory where it resides. The file names are constructed by removing the suffix from the database file name and appending suffixes .s1 and .s2 instead. These files are removed when the database is closed. If, however, the database is not closed properly, e.g. because of the program crash or power failure, the files will remain on disk. In this case, one of them will keep the state of the database at the moment of the most recent call to db.Sync().

To ensure that the snapshots keep a logically consistent state of the database, care must be taken to call db.Sync() at right places. Follow the discussion in GDBM manual to ensure the database is synchronized correctly.

When called with DatabaseConfig.CrashTolerance set to true, the gdbm.OpenConfig function first checks if the snapshot files exist. If so, it fails with the gdbm.ErrSnapshotExist error. When this happens, the caller is supposed to run the database recovery, by invoking the SnapshotRestore function:

    func SnapshotRestore(filename string) error

This function takes the file name of the database file as its argument. It analyzes both snapshots to select the right one, and recovers the database from it. On success, SnapshotRestore returns nil. On error, it returns one of the following errors:

If any of these are returned, you are advised to attempt manual crash recovery.

The following code snippet illustrates the usual sequence used when opening a database in crash tolerance mode:

    db, err := gdbm.OpenConfig(gdbm.DatabaseConfig{FileName: filename,
                           CrashTolerance: true,
                           Mode: gdbm.ModeWrcreat,
                           FileMode: 0666,
                           Flags: gdbm.OF_NUMSYNC})

    if err != nil {
        if errors.Is(err, gdbm.ErrSnapshotExist) {
            // Database was not closed properly.  Attempt a crash recovery:
            err = gdbm.SnapshotRestore(filename)
            if err != nil {
                fmt.Printf("Can't restore %s: %s\n", filename, err.Error())
                fmt.Printf("Manual crash recovery is advised.\n")
                os.Exit(1)
            }
            // Database was successfully recovered, now try to open it:
            db, err = gdbm.OpenConfig(gdbm.DatabaseConfig{FileName: filename,
                                  CrashTolerance: true,
                                  Mode: gdbm.ModeWrcreat,
                                  FileMode: 0666,
                                  Flags: gdbm.OF_NUMSYNC})
            if err != nil {
                panic(err)
            }
        } else {
            // Another error.
            panic(err)
        }
    }

Informative Functions

    func Version() []int

This function returns the version of the libgdbm library the package is linked with. The returned array has 3 elements: major and minor version numbers, and patch level number.

    func VersionString() string

Returns the version of the libgdbm library as a formatted string. The returned string starts with words GDBM version, followed by a space and the version numbers delimited by dots. If patchlevel is 0, it is omitted. The version number is followed by the date of the library build,