Ch. 8, Hexagonal Architecture, Restore() func not properly loading existing data.

adamwoolhether commented 2 years ago

Just wanted to inquire if anyone else Is having trouble with getting the FileLogger to load an existing log file.

If there is no existing file, I can run the program fine, it creates the log file, and writes to the file.

The problem lies in trying to run the program again, in which it hangs.

I've traced the problem to getting stuck the the Restore() function's for loop:

    for ok && err == nil {
        select {
        case err, ok = <-errors:
        case e, ok = <-events:
            switch e.EventType {
            case EventDelete:
                err = store.Delete(e.Key)
                count++
            case EventPut:
                err = store.Put(e.Key, e.Value)
                count++
            }
        }
    }

It's driving me crazy, wanted to see if anyone else had similar issues, or any idea on how to properly debug the issue.

davidrhp commented 2 years ago

@adamwoolhether

I am having the same issue at the moment, but I think I have found the reason:

During the restore phase, events from the file are written to the store. However, since the logging to the file has been moved to the store functions themselves, a write to the store triggers a write to the log:

func (store *KeyValueStore) Put(key string, value string) error {
    store.Lock()
    store.m[key] = value
    store.Unlock()

    store.transact.WritePut(key, value) // --- here

    return nil
}

Now, the logger's Run method has to be called to properly initialize it, i.e., the event and error channels have to be created. But this is done after all the events from the file have been processed and were added to the store. So these channels are nil (I am not talking about the channels received from a call to the FileLogger's ReadEvents method.

The FileLoggers Put looks like this:

func (l *FileTransactionLogger) WritePut(key, value string) {
    l.wg.Add(1)
    l.events <- core.Event{EventType: core.EventPut, Key: key, Value: url.QueryEscape(value)} // Go will block forever, as it can't send to a nil channel
    l.wg.Done()
}

Since the program can't write to a nil channel, it will block forever, thus causing it to hang indefinitely.

To confirm this, I paused the debugger manually while the program didn't seem to continue, and looked at last stack. Looking at its variables, you'll see a runtime.waitReason with the value of waitReasonChanSendNilChan.

I am very very new to Go, so I don't have an elegant solution that fits cleanly into the architecture yet, but I think these channels just have to be created before the FileLogger's Put/Delete methods are called. However, this may lead to a duplication of the log, so it is probably better if the Put/Delete methods just become no-ops before the Logger's Run method is called.

One way would be to keep the ZeroTransactionLogger in the KeyValueStore until the Restore method completed. Maybe the Store's WithTransactionLogger method could be called a bit later in the main method, but then the channels from ReadEvents would be nil instead, since that's the implementation of the ZeroTransactionLogger.

EDIT:

I fixed the issue on my end by removing the call to WithTransactionLogger and instead passing the logger to the Restore method:

main.go:

    store := core.NewKeyValueStore()
    if err := store.Restore(tl); err != nil {
        log.Fatalln(err)
    }

And then adjusting the Restore method by reading the events from the passed in transaction logger, but not adding the logger to the store until it has finished reading the events from the file and storing them. It is then called before the logger's Run method. This way, previous calls to Put/Delete of the logger are no-ops, since they are called on the ZeroTransactionLogger. The actual FileTransactionLogger is then set to the store before the Run method, so the logger works as expected:

core.go - call to ReadEvents:

func (store *KeyValueStore) Restore(tl TransactionLogger) error {
    var err error

    events, errs := tl.ReadEvents(). // note, tl is called directly, not on the receiver
    count, ok, e := 0, true, Event{}

        // code omitted

core.go - setting the store before calling Run:

    log.Printf("%d events replayed\n", count)

    store.transact = tl
    store.transact.Run()

However, I do not like this approach as it obscures the behavior of the Restore method. One has to know that the actual logger argument is set later, to take advantage of the fact that the default ZeroTransactionLogger is a no-op. At the same time, the argument is used for reading the events, but it is a solution.

Hope this helps somewhat if others are stuck here as well.

mehdieidi commented 2 years ago

@adamwoolhether @davidrhp

I had the same issue and I tried a simple approach to solve it but since I'm not an expert, this approach might be wrong. I appreciate if someone enlightens me.

I changed the signature of the core Put and Delete methods and added a boolean parameter called restore. If the process of Putting or Deleting is associated with the Restoring of data, using this boolean, it avoids writing to the log. In this way, no other changes are needed and no duplicate logs are written.

func (store *KeyValueStore) Delete(key string, restore bool) error {
    store.Lock()
    delete(store.m, key)
    store.Unlock()

    if !restore {
        store.transact.WriteDelete(key)
    }

    return nil
}

func (store *KeyValueStore) Put(key string, value string, restore bool) error {
    store.Lock()
    store.m[key] = value
    store.Unlock()

    if !restore {
        store.transact.WritePut(key, value)
    }

    return nil
}

menxqk commented 2 years ago

Same issue here. The TransactionLogger channels are nil when Restore() is called and that causes Restore() to block forever. I changed the code in Restore()'s loop to execute Put and Delete on the in-memory map directly. Maybe that was the real purpose of Restore(), rebuilding state on the in-memory map.

    for ok && err == nil {
        select {
        case err, ok = <-errors:

        case e, ok = <-events:
            switch e.EventType {
            case EventDelete: // Got a DELETE event!
                // err = store.Delete(e.Key)
                store.Lock()
                delete(store.m, e.Key)
                store.Unlock()
                count++
            case EventPut: // Got a PUT event!
                // err = store.Put(e.Key, e.Value)
                store.Lock()
                store.m[e.Key] = e.Value
                store.Unlock()
                count++
            }
        }
    }

cloud-native-go / examples

Ch. 8, Hexagonal Architecture, Restore() func not properly loading existing data. #10