domainaware / parsedmarc

A Python package and CLI for parsing aggregate and forensic DMARC reports
https://domainaware.github.io/parsedmarc/
Apache License 2.0

Wrong total email count on Grafana & Kibana dashboards #162

Closed cpuga closed 4 years ago

cpuga commented 4 years ago

Using date_range as the timestamp field, instead of separate begin_date and end_date fields, leads Elasticsearch to include each document twice in the buckets of date aggregation queries: once in the bucket corresponding to the begin date, and once more in the bucket for the end date.

This makes all date histogram visualizations wrong, since they show about twice as many messages as there actually are.

You can see this in the sample Grafana screenshot provided by @bhozar (3557 emails sent on "Email Count" vs 1308+471+1 on "Message Volume by Header From"): https://grafana.com/api/dashboards/11227/images/7191/image

Keeping date_range if you want, but adding a begin_date field, would solve the problem.
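For illustration, here is a rough elasticsearch-dsl sketch (not parsedmarc code; it assumes Elasticsearch 7.x and the default dmarc_aggregate index) comparing a daily histogram over the two-valued date_range field with one over a single-valued begin date:

# Rough sketch, not parsedmarc code: shows why a two-valued date field double
# counts. In a date_histogram, a document is added to the bucket of every
# value in the field, so [begin, end] lands in two daily buckets.
# Assumes Elasticsearch 7.x and the default dmarc_aggregate index.
from elasticsearch_dsl import Search, connections

connections.create_connection(hosts=["localhost"])  # assumed local cluster

s = Search(index="dmarc_aggregate")
# Bucketing on date_range counts each report once for its begin date and once
# more for its end date.
s.aggs.bucket("per_day", "date_histogram", field="date_range",
              calendar_interval="1d")
# Bucketing on a single-valued begin date would count each report exactly once:
# s.aggs.bucket("per_day", "date_histogram", field="date_begin",
#               calendar_interval="1d")

response = s.execute()
for bucket in response.aggregations.per_day.buckets:
    print(bucket.key_as_string, bucket.doc_count)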

seanthegeek commented 4 years ago

Thanks for pointing this out! I'm surprised no one else (including me) noticed. I'll work on fixing this.

RedJohn14 commented 4 years ago

Hello everyone, I have the same problem with my Grafana dashboards. Any idea how I can fix this issue? I am seeing double the email count.

bhozar commented 4 years ago

@RedJohn14 @seanthegeek Change the "min interval" to 1d to resolve it. I'll be publishing a new dashboard soon; I'm trying to include some additional info and some panels to make parts of it clearer. I have also changed the min interval where required to fix this issue.

bhozar commented 4 years ago

You can get the fixed Grafana dashboard here: https://github.com/bhozar/grafana-dashboards/tree/master/parsedmarc

It requires Grafana 7.1, and has quite a few other changes.

ericwbentley commented 4 years ago

I experience the same issue in Kibana when creating custom visualizations or using the Discover view. Would it be possible to split the begin and end dates out into their own fields, while also leaving the date range field for those who need it? Then we could use the begin date as the timestamp for the index pattern and avoid double counting regardless of the interval.

Maybe an option that could be made available in the config file?

ericwbentley commented 4 years ago

I found that even with a daily interval, double counting is still possible, since some reports are sent with a range greater than one day. This can be worked around in visualizations by using a sum of the message count field instead of a count of all records, which shows a more accurate count of emails passing/failing anyway.
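For example (a rough elasticsearch-dsl sketch, assuming Elasticsearch 7.x and that the per-record count field in the index is named message_count), totalling messages for the last week instead of counting documents:

# Rough sketch, assuming Elasticsearch 7.x and that the per-record count field
# is named message_count: sum the messages each record represents instead of
# counting documents, since one document can stand for many emails.
from elasticsearch_dsl import Search, connections

connections.create_connection(hosts=["localhost"])  # assumed local cluster

s = Search(index="dmarc_aggregate")
s = s.filter("range", date_range={"gte": "now-7d/d", "lte": "now/d"})
s = s.extra(size=0)  # only the aggregation result is needed
s.aggs.metric("total_messages", "sum", field="message_count")

response = s.execute()
print("documents:", response.hits.total.value)  # what a plain count shows
print("messages:", int(response.aggregations.total_messages.value))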

But the problem can be avoided entirely by using the start or end time of the range as the index pattern timestamp instead of a range with two values. I made the changes below to add the fields, then re-created my index pattern and used date_end as the timestamp field. Old reports would have to be re-indexed to make use of it.

/usr/local/lib/python3.6/site-packages/parsedmarc/elastic.py

class _AggregateReportDoc(Document):
    class Index:
        name = "dmarc_aggregate"

    xml_schema = Text()
    org_name = Text()
    org_email = Text()
    org_extra_contact_info = Text()
    report_id = Text()
    date_range = Date()
    date_begin = Date()  # added: single-valued begin date
    date_end = Date()  # added: single-valued end date

# ...and further down in elastic.py, where each aggregate report record is
# turned into a document, pass the new values in as well:

    for record in aggregate_report["records"]:
        agg_doc = _AggregateReportDoc(
            xml_schema=aggregate_report["xml_schema"],
            org_name=metadata["org_name"],
            org_email=metadata["org_email"],
            org_extra_contact_info=metadata["org_extra_contact_info"],
            report_id=metadata["report_id"],
            date_range=date_range,
            date_begin=aggregate_report["begin_date"],  # added
            date_end=aggregate_report["end_date"],  # added
            # ...remaining keyword arguments unchanged
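If re-parsing the original reports is not practical, something like the following one-off update_by_query (a rough sketch; it assumes date_range is stored as a [begin, end] array and that the aggregate indexes match dmarc_aggregate*) could backfill the new fields on existing documents instead:

# Rough backfill sketch, assuming date_range is stored as a [begin, end] array
# and the aggregate indexes match dmarc_aggregate*: copy the two values into
# the new date_begin / date_end fields on existing documents.
from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])  # assumed local cluster

es.update_by_query(
    index="dmarc_aggregate*",
    body={
        "query": {"bool": {"must_not": {"exists": {"field": "date_begin"}}}},
        "script": {
            "lang": "painless",
            "source": (
                "ctx._source.date_begin = ctx._source.date_range[0];"
                " ctx._source.date_end = ctx._source.date_range[1];"
            ),
        },
    },
    conflicts="proceed",
)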