aaOpenSource / aaLog

A library and example programs for reading the standard SMC log file format
MIT License
13 stars 15 forks source link

aaLogReader multi-threaded issues and aaLogWebAPI errors #27

Closed ryanvs closed 8 years ago

ryanvs commented 8 years ago

I probably should have done this on multiple branches and then used multiple pull requests, but I didn't think about it at the time. :(

I tried to fix some of the multi-threaded issues in aaLogReader before I realized the multi-user issues make this effort obsolete. We will need to redesign aaLogReader as I mentioned in #26.

I changed the exception handling in aaLogWebAPI to return an ODataError from the method. I did some testing and the error messages are better this way, in my opinion, since you get a really long stack trace if you don't generate your own error.

Finally, I was working on some of the aaLogReader.Tests to address the time zone issue. I modified the tests to Cleanse the JSON objects of values that may differ across systems, as mentioned in #22. I just realized I didn't test all of the ToKVP() style methods, and the change to DateTimeOffset may affect the output.

I updated to NUnit 3.0.1 as it was recently released. I didn't have NUnit installed on my system, so I downloaded the latest and then had to update the project, and also install NUnit3 Test Adapter to get my test results in Visual Studio. I hope that doesn't cause problems!

Also, I updated some projects to .NET v4.5.2 since 4.5 and 4.5.1 reach the end of life support as of Jan 1, 2016.

arobinsongit commented 8 years ago

I'll try to take a look at all of this this weekend. Thanks for being the poster child for why you should have small PR's! All joking aside at least you are doing this via a PR instead of committing the master branch like I do :-).

arobinsongit commented 8 years ago

Ryan, I'm working through the PR and I have a question about your work with times. I've been thinking about how we are managing times. Right now we have a File Time and we convert that to a Date Time. That date time is always local. My question is how can we arbitrarily convert this to UTC when we don't know the time zone offset? Or, are the functions you are using in the framework intelligent enough to do this math?

ryanvs commented 8 years ago

According to MSDN, FileTime is stored in UTC so using the DateTime.FromFileTime values it automatically converted from UTC to Local. Originally I thought about staying with DateTime but then thought it might be good to use DateTimeOffset in case we aggregate logs from multiple servers in different time zones. I didn't look through the Header, but if it doesn't contain the time zone offset then there is no easy way of knowing the time zone of the source other than querying it. You know the UTC time but not the source time zone (and it may not matter, but I like to have it anyway if possible).

That being said, I updated the Unit Tests and everything seemed to work, but I didn't compare any results to other log files from my computer or others to make sure it really worked.

ryanvs commented 8 years ago

Next time I will use smaller commits and PRs!

arobinsongit commented 8 years ago

So we've got a pretty serious merge conflict around the unit tests. @logic-danderson made some changes that seemed reasonable to help avoid potential issues with comparing unnecessary fields. However, taking at look at what @ryanvs has done I think actually may make more sense. It keeps the code a lot more compact and clean. I probably should have studied this PR more before merging @logic-danderson PR.. that's ok.

So @logic-danderson and @ryanvs what do you think? My inclination is to go the direction @ryanvs has taken things with testing but I would like to get some consensus first.

logic-danderson commented 8 years ago

The code in this PR gets the tests to pass by removing the properties that differ by machine. So the tests pass but they are not as thorough. And the specs are kept in different files, so the specifics of what's being tested are hidden.

The benefit of unit testing the way I did is you can work around the differences between our machines and still test the object properties. You can also see exactly which properties are being tested and what the expected values are. The tradeoff is more lines of code per test.

Test code doesn't necessarily follow the same standards as production code. For production code, yeah, be clean and compact. But test code is different. Clarity of intent is more important.

Anyway, that's what I think based on my experience with unit testing on .NET development teams in the past.

I'm all for the change to DateTimeOffset, though. Good move. Since @ryanvs pointed out that FileTime is stored as UTC, I think the time will get translated to the appropriate time zone local to the machine that is running the aaLog library (since that machine knows which time zone it is in).

arobinsongit commented 8 years ago

And @logic-danderson you also make some pretty valid arguments as to why we should go with your lead. I hadn't thought about the concept that unit tests don't necessarily need to be neat and compact, and maybe more verbosity is valuable in the interest of pointing out exactly what failed.

Steel cage death match to decide?

ryanvs commented 8 years ago

I haven't had time to look at @logic-danderson 's changes yet so I can't really comment. In my unit tests, I also created a function to remove the extraneous properties for comparison purposes. Also I chose to compare the JSON objects because I think (not sure, but I think) it does more of a logic comparison rather than a strict string or binary comparison. I'll try to take a look at the changes today, but that may be optimistic with my vacation and family activities.

ryanvs commented 8 years ago

TL;DR - I'm going to modify my PR to use a method to compare the properties, and also override .Equals and .GetHashCode.

The issue I have with the verbose-style unit tests is the amount of maintenance. My preference would be to either:

  1. manually build an expected LogRecord object and build an Equals or Compare method
  2. manually build an expected JSON object and compare the actual and expected objects

In this scenario, I think the LogRecord class is not going to change significantly but in other scenarios you add, remove, and change properties and methods. Using a verbose style leads to a lot of maintenance when the underlying class changes. Instead create a method that compares each individual property and just call the compare method to test the actual and expected objects. You will still know when a specific property causes a failure. Here is an example:

Verbose Style

With this style, you need to repeat the same block of code for the comparison except you need to change all of the actual values as well. This violates the "Don't Repeat Yourself" (DRY) principle in my opinion.

    [Test]
    public void GetFirstRecord()
    {
        _logReader.OpenLogFile(LOG_FILE_INSTANCE);

        LogRecord lr = _logReader.GetFirstRecord();
        Assert.That(lr.MessageNumber, Is.EqualTo(60380), "MessageNumber is wrong");
        Assert.That(lr.ProcessID, Is.EqualTo(9672), "ProcessID is wrong");
        Assert.That(lr.ThreadID, Is.EqualTo(12224), "ThreadID is wrong");
        Assert.That(lr.EventFileTime, Is.EqualTo(130943708747330261), "EventFileTime is wrong");
        Assert.That(lr.LogFlag, Is.EqualTo("Info"), "LogFlag is wrong");
        Assert.That(lr.Component, Is.EqualTo("aaLogger"), "Component is wrong");
        Assert.That(lr.Message, Is.EqualTo("Logger Started."), "Message is wrong");
        Assert.That(lr.ProcessName, Is.EqualTo(""), "ProcessName is wrong");
        Assert.That(lr.SessionID, Is.EqualTo("0.0.0.0"), "SessionID is wrong");
        Assert.That(lr.HostFQDN, Is.EqualTo(_expectedFqdn), "HostFQDN is wrong");
    }

Compare Method

This example uses LogRecord for demonstration, but you could do the same with a JSON object.

    public void AreEqual(LogRecord expected, LogRecord actual)
    {
        Assert.AreEqual(expected.MessageNumber, actual.MessageNumber, "MessageNumber is wrong");
        Assert.AreEqual(expected.ProcessID, expected.ProcessID, "ProcessID is wrong");
        Assert.AreEqual(expected.ThreadID, expected.ThreadID, "ThreadID is wrong");
        Assert.AreEqual(expected.EventFileTime, actual.EventFileTime, "EventFileTime is wrong");
        Assert.AreEqual(expected.LogFlag, actual.LogFlag, "LogFlag is wrong");
        Assert.AreEqual(expected.Component, actual.Component, "Component is wrong");
        Assert.AreEqual(expected.Message, actual.Message, "Message is wrong");
        Assert.AreEqual(expected.ProcessName, actual.ProcessName, "ProcessName is wrong");
        Assert.AreEqual(expected.SessionID, actual.SessionID, "SessionID is wrong");
        // This is not really a property of the source LogRecord, but of the reader
        //Assert.AreEqual(expected.HostFQDN, Is.EqualTo(expectedFqdn), "HostFQDN is wrong");
    }

    [Test]
    public void GetFirstRecord()
    {
        _logReader.OpenLogFile(LOG_FILE_INSTANCE);

        LogRecord actual = _logReader.GetFirstRecord();
        // You can read the LogRecord from file or manually construct it as shown here
        LogRecord expected = new LogRecord()
        {
            MessageNumber = 60380,
            ProcessID = 9672,
            ThreadID = 12224,
            EventFileTime = 130943708747330261,
            LogFlag = "Info",
            Component = "aaLogger",
            Message = "Logger Started.",
            ProcessName = "",
            SessionID = "0.0.0.0",
        };
        AreEqual(actual, expected);
    }

As it stands, the code in my PR assumes the Assert.AreEqual method will compare the JSON objects and also show where the failure occurred. While I was working on it, the assert failure did say where the failure occurred but it seemed to serialize the JSON to a sting and compare the strings. So it showed the position of the failure in the string for EventDateTime for example.

Looking at StackOverflow, the two most popular answers are (1) override .Equals (in LogHeader, LogRecord, etc.) and (2) create your own comparison method (similar to my example) - see Compare equality between two objects in NUnit.

The difference between the Compare Method style and my PR is that I stopped at the JSON object. It is fairly simple to deserialize the expected JSON and compare objects instead of JSON.

This is a lot longer than I intended, so my recommendation is that I will modify my PR to compare objects and create an AreEqual method for all compared classes (LogHeader, LogReader). I also like the idea of overriding .Equals and .GetHashCode in the classes in case we ever want to use these classes as a dictionary key or use List.Contains or similar methods.

ryanvs commented 8 years ago

There is a lot I like from @logic-danderson 's changes, but I've also look more at the StackOverflow like some of the suggestions in the link I provided: Compare equality between two objects in NUnit. There were a couple of comments that you shouldn't write .Equals and .GetHashCode methods just for testing, and I'm not sure that we will use these methods outside of testing. Another more interesting suggestion is to use FluentAssertions as made in this answer. It looks like FluentAssertions can also make nice error messages in the tests.

The other thought is instead of .Equals, we could possibly create a LogRecordComparer based on IEqualityComparer as shown in this answer.

At this point I'm tempted to withdraw my PR and resubmit as three separate PR's and rebase from the current head.

arobinsongit commented 8 years ago

I like your direction @ryanvs . I agree that @logic-danderson approach to a more thoughtful comparison is on the right track.

I also like your idea of breaking up your PR into separate ones. There are definitely some good chunks you've contributed outside of testing that I'd like to get into the code. And besides I think I would suck at handling the merge conflict so this way I don't have to :-)

Thanks!

arobinsongit commented 8 years ago

Oh, and Fluent Assertions look really really nice. I like the concept of writing tests in a language that flows logically.