dgm9704 / Xoxo

reading, writing and comparing XBRL reports
GNU Lesser General Public License v3.0
26 stars 8 forks source link

Xml/Fact Equals does not consider contextRef #83

Open bigsmity opened 4 months ago

bigsmity commented 4 months ago

Xml/Fact Equals does not consider contextRef. I would not consider these facts duplicates if they do not share context.

`public bool Equals(Fact other) { var result = other != null && Value.Equals(other.Value, StringComparison.Ordinal) && Metric.Equals(other.Metric) && Decimals.Equals(other.Decimals, StringComparison.Ordinal);

        if (result)
            result = Unit == null
                ? other.Unit == null
                : Unit.Equals(other.Unit);

        return result;
    }`
dgm9704 commented 4 months ago

Thank you for getting in touch! I will take a look at this as soon as I can, probably tomorrow.

dgm9704 commented 4 months ago

Okay, you're not wrong. When comparing facts, it seems logical that the contexts should be taken into account. Unfortunately the comparison can't be done with just contextRef alone, since there can be two contexts in the same report with different id but same scenario, or two contexts in different reports with same id but different scenarios. So the comparison would need to also include the "deep" comparison of the contexts, and this cause performance issues. This could be shortcircuited eg. same report + same contextRef = same context, etc. So the comparison needs a bit more logic than is obvious at the surface level. (Which is probably why I left it out in the first place) I tried something quick and dirty just to get some feel for the worst case performance hit, BUT that resulted in a test failing, which means I have a possible bug to sort out first. So, I recognize the issue and am working on it, but it might take some time.

dgm9704 commented 4 months ago

Ok, I think I found the bug and a fix for it. But the bad news is that the simplest addition that includes contexts when comparing facts, slows the comparisons to a crawl, increasing time taken to something like 50 times more, which is too much for any decent size report. I will work on the short circuiting, and maybe some memoization, and probably some other enhancements that hopefully would make this feasible. And I will need to think about the relationship of facts in different reports also.