MultiQC / MegaQC

Web application to collect and visualise data across multiple MultiQC runs.
http://megaqc.info/
GNU General Public License v3.0

Automated alerts #28

Status: Open · ewels opened this issue 6 years ago

ewels commented 6 years ago

If we're tracking data over time, we should be able to set manual limits and send automated alerts when a report is submitted with samples that exceed these.

For example: % aligned must be > 80%. If a sample is uploaded with 45% aligned, trigger an alert.
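A minimal sketch of the rule described above; the function name and the 80% default are illustrative, not existing MegaQC code:

```python
# Hypothetical sketch of the alert trigger described above: a sample
# fails QC when its "% aligned" falls below a configured minimum.
def aligned_alert(percent_aligned, minimum=80.0):
    """Return True when an uploaded sample should trigger an alert."""
    return percent_aligned < minimum

assert aligned_alert(45.0)       # 45% aligned: alert fires
assert not aligned_alert(95.0)   # comfortably above the threshold
```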

Would be good to have alerts in browser (eg. notifications) and also via email.

multimeric commented 5 years ago

I think this is the key feature MegaQC needs to be useful for a lot of labs. I'm happy to discuss how this might be implemented.

We would need:

Incidentally, I've seen tools that have both a warning and an error threshold, with different levels of importance. This might be a useful feature too.

multimeric commented 5 years ago

I've had a think about the data model for alerts, and it's a bit annoying because they could be arbitrarily complex. At their most basic level, we can have:

import enum

from sqlalchemy import Column, Enum, ForeignKey, Integer, Unicode

class ThresholdOperation(enum.Enum):
    """Comparison operator used by a threshold."""
    lt = 1   # <
    lte = 2  # <=
    eq = 3   # =
    gt = 4   # >
    gte = 5  # >=

# db and CRUDMixin come from MegaQC's existing application setup
class DataThreshold(db.Model, CRUDMixin):
    """A single comparison against one sample data type."""
    __tablename__ = "data_threshold"
    data_threshold_id = Column(Integer, primary_key=True)
    sample_data_type_id = Column(Integer, ForeignKey('sample_data_type.sample_data_type_id'))
    operator = Column(Enum(ThresholdOperation))
    value = Column(Unicode)
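A hedged sketch of how the model above might be evaluated against an incoming sample value: map each enum member onto a comparison from the stdlib `operator` module. The convention that the stored operator expresses the *alert* condition (rather than the passing condition) is an assumption, not something the thread settles.

```python
import operator

# Assumed mapping from ThresholdOperation member names to comparisons;
# not existing MegaQC code.
OPERATOR_FUNCS = {
    "lt": operator.lt,
    "lte": operator.le,
    "eq": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
}

def threshold_violated(value, op_name, bound):
    """True when the incoming value satisfies the alert condition."""
    return OPERATOR_FUNCS[op_name](value, bound)

assert threshold_violated(45.0, "lt", 80.0)      # 45 < 80: alert
assert not threshold_violated(90.0, "lt", 80.0)  # 90 is fine
```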

But then the question is, is one operator enough to represent a threshold? What if a metric only passes if it's in between two values? We could make it a many-to-one relationship between thresholds and data types, which would let you say "column less than 100", and "column greater than 50", which would solve that issue. But then what if we need boolean arithmetic with "or" statements and recursive grammars...
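A sketch of the many-to-one idea (assumed semantics): attach several thresholds to one data type and AND them together, which is enough to express "between two values" without a full boolean grammar.

```python
# Two thresholds on the same data type, all of which must hold.
# Plain lambdas stand in for DataThreshold rows for illustration.
thresholds = [lambda v: v < 100,  # "column less than 100"
              lambda v: v > 50]   # "column greater than 50"

def within_all(value):
    """True when the value satisfies every attached threshold."""
    return all(check(value) for check in thresholds)

assert within_all(75)
assert not within_all(120)
assert not within_all(10)
```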

Any idea how much power is needed here?

ewels commented 5 years ago

Ooh and I can think of other cases too - eg. a value within a range a certain number of times, or for a certain period of time. Or the average value over a period of time being within a range... Even a value-agnostic mode which just looks for changes in patterns, eg. a sudden drop in an otherwise stable value.

Any idea how much power is needed here?

The million dollar question! My opinion would be that something is better than nothing. So if in doubt, start simple with something that works. But it's good to keep one eye on the future when deciding on the architecture...

ewels commented 5 years ago

Also, I think we need a table to store the alerts themselves. Then we can have a little notifications icon to view when people log in and dismiss, as well as just pushing alerts via email.

multimeric commented 5 years ago

Hmm that's not very reassuring at all. Especially because you're introducing the idea of aggregate expressions by considering averages. This kind of alert ("when the average drops below X") is a lot more complex anyway, because we can't just check it on every incoming sample.

multimeric commented 5 years ago

The ultimate flexibility would be to let you write an expression (in Python or SQL?) for your alerts, but that's not user friendly or easy to implement. On the other hand a basic data model won't capture those more advanced alerts... still I guess it's a start.

multimeric commented 5 years ago

My current vision of a fully flexible alert builder would work like this:

  1. We have a basic SQL SELECT statement, that starts like this:
    SELECT * FROM sample JOIN sample_data JOIN sample_data_type
  2. Then, we allow the user to customize the WHERE clause that is added to this select statement, using some kind of query builder or language (not raw SQL)
  3. Firstly, they must choose a data type to focus on (e.g. FastQC GC content), which adds something like WHERE sample_data_type.data_key = 'fastqc__%GC' to the query
  4. Next, the user can add other expressions to the WHERE clause. If they just wanted a simple alert they could add AND sample_data.value > 70
  5. If the query is not in aggregate mode, we create an alert for each row returned by the query
  6. If the query is in aggregate mode, we group all the rows we currently have together
  7. If the user did decide to do an aggregate query, the user has to add some expressions to the HAVING clause, so they can add something like HAVING avg(sample_data) > 50 or HAVING count(sample_data) > 10
  8. If this aggregate query returns True, then we create an alert

So basically, the user has to input:

The reason I particularly like this solution is that I can implement the first half (WHERE clauses without aggregates), which will solve like 80% of the use cases, and then the aggregate mode could be added later.

multimeric commented 5 years ago

In particular, we could use a query builder like this one: https://querybuilder.js.org, which we instantiate once for the WHERE clause, and once for the HAVING clause.
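For context, jQuery QueryBuilder emits a nested JSON rule group (roughly `{"condition": "AND", "rules": [...]}`), so something like the following could translate it into a WHERE fragment. The field names here are illustrative, not MegaQC's, and real code would use bound parameters rather than string interpolation:

```python
# Assumed subset of jQuery QueryBuilder's default operator names.
OPS = {"less": "<", "less_or_equal": "<=", "equal": "=",
       "greater": ">", "greater_or_equal": ">="}

def to_sql(group):
    """Render a QueryBuilder-style rule group as a WHERE fragment.
    NOTE: interpolates values directly for illustration only; real
    code must use bound parameters to avoid SQL injection."""
    parts = []
    for rule in group["rules"]:
        if "condition" in rule:  # nested group: recurse
            parts.append("(" + to_sql(rule) + ")")
        else:
            parts.append(f"{rule['field']} {OPS[rule['operator']]} {rule['value']}")
    return f" {group['condition']} ".join(parts)

rules = {"condition": "AND", "rules": [
    {"field": "sample_data.value", "operator": "greater", "value": 70},
]}
assert to_sql(rules) == "sample_data.value > 70"
```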

ewels commented 5 years ago

Sounds good!

aggregate expressions by considering averages [..] is a lot more complex anyway, because we can't just check it on every incoming sample.

I don't really understand why we can't check it on every incoming sample? Surely it's the same, just that we also consider the previous n samples in that calculation?
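The idea of re-checking an aggregate on each incoming sample can be sketched with a rolling window (all names here are illustrative, not proposed MegaQC code):

```python
from collections import deque

class RollingAverageAlert:
    """Keep the previous n values and re-check the aggregate each
    time a new sample arrives."""

    def __init__(self, n, minimum):
        self.window = deque(maxlen=n)
        self.minimum = minimum

    def add(self, value):
        """Return True if the rolling mean has dropped below minimum."""
        self.window.append(value)
        mean = sum(self.window) / len(self.window)
        return mean < self.minimum

alert = RollingAverageAlert(n=3, minimum=80.0)
assert not alert.add(90)  # mean 90.0
assert not alert.add(85)  # mean 87.5
assert alert.add(40)      # mean ~71.7: alert fires
```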

The query builder thing looks quite a lot like the MegaQC interface that I built to filter samples. Maybe we could reuse that? In fact it should basically be exactly the same code that returns the data values for the box plot. Could maybe just add on to that page? But maybe this is overcomplicating things. If you're happy to work with your proposed plan above then I say go for it, sounds good to me 👍

@tsnowlan - any thoughts?

multimeric commented 5 years ago

I don't really understand why we can't check it on every incoming sample? Surely it's the same, just that we also consider the previous n samples in that calculation?

To me, the thing with alerts is that they actually have to be checked in the database, because if you add a new alert, update an alert, or add more data, we want to be able to re-check for threshold violations efficiently. For that reason, I think it's best to model them as SELECT queries over the sample_data table. However, doing this means we have to treat aggregate checks (mean, count etc.) as SQL aggregate queries (with HAVING), which makes it somewhat more complex.

multimeric commented 5 years ago

The query builder thing looks quite a lot like the MegaQC interface that I built to filter samples. Maybe we could reuse that? In fact it should basically be exactly the same code that returns the data values for the box plot. Could maybe just add on to that page?

Yes, I'm always happy to try to re-use code. I can refactor the filter modal into a re-usable component that can be used for both filters and alerts.

Incidentally, had you ever considered rewriting the frontend into something like a React SPA, rather than Jinja + jQuery?

multimeric commented 5 years ago

The filter modal could be a nice React component that I re-use for filters and alert thresholds, if you're open to that possibility. Then I could clean up the painful HTML generation in megaqc/static/js/filter_samples.js

ewels commented 5 years ago

However, by doing this, it means we have to consider aggregate queries

I had assumed that we would just have a Python class that runs a bunch of different queries. I'm a little wary of very complex SQL here as the intention was to provide support for several SQL engines - SQLite doesn't support everything for example. But really, whatever works..!

Incidentally, had you ever considered rewriting the frontend into something like a React SPA, rather than Jinja + jQuery?

Yes - would be nice, but I'm a bit scared that it would be a huge amount of work? And not entirely sure if the benefits would outweigh that. Flask sites typically use Jinja so I took the path most travelled. I've not used React before though, only played with Angular a little. I'd be happy to learn :) I guess this is something that we (you) could start in parallel in a non-destructive manner though?

multimeric commented 5 years ago

I'm a little wary of very complex SQL here as the intention was to provide support for several SQL engines

I don't think it would be complex SQL, and I suspect SQLite would have no issue with it. However for now I'm happy to keep everything as-is, especially since you've already created a filtering component for the plots that I can re-use here. But I would like to point out for future direction that I think these filters should be implemented as SQL queries for the sake of optimization. You could even write an SQL interpretation layer for the existing JSON filters you've got, allowing MegaQC to transparently transition to this.

Yes - would be nice, but I'm a bit scared that it would be a huge amount of work?

I don't actually think this would be an outrageous amount of work, and you can progressively React-ify an app by rewriting it a component at a time. But until I can commit a large amount of time to work on MegaQC I'll avoid sweeping changes like this. I think the Flask + SQLAlchemy stack is totally fine, but using React instead of Jinja + jQuery would increase maintainability a whole lot.

multimeric commented 5 years ago

Related to #78: I think one of the goals of the thresholds/alerts would be to support the use of Westgard rules. (The original comment embedded a chart of the standard Westgard rules here, with a source link; image omitted.)

This would be doable if we supported multiple aggregate filters, as discussed above, but I don't think this is implementable with the current per-sample filters.

I don't think this is core functionality, but it would be a nice long term goal.
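As an illustration of why Westgard rules need aggregate context, here is a hedged sketch of one of them, the 1-3s rule (a single observation more than three standard deviations from the established mean). It compares a new value against historical statistics, which per-sample filters cannot express:

```python
import statistics

def violates_1_3s(history, new_value):
    """Westgard 1-3s check: the new observation lies more than
    three standard deviations from the mean of past values.
    (Sketch only; names are illustrative, not MegaQC code.)"""
    mean = statistics.mean(history)
    sd = statistics.stdev(history)
    return abs(new_value - mean) > 3 * sd

history = [50, 51, 49, 50, 52, 48, 50, 51]
assert not violates_1_3s(history, 52)  # within control limits
assert violates_1_3s(history, 70)      # far outside: alert
```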

multimeric commented 4 years ago

A nice and simple way I might implement this is via the filter language we have for MegaQC. I'll create a new filter type called "pattern_rule" or something, so you can add a filter for visualisation or alerts (once #104 is done) that triggers whenever your data hit one of the criteria. e.g.

{
    "type": "pattern_rule",
    "value": "41s" 
}
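One way such a filter might be dispatched is sketched below. Reading "41s" as the Westgard 4-1s rule (four consecutive observations beyond one standard deviation on the same side of the mean) is an assumption, as is everything else here:

```python
def check_4_1s(values, mean, sd):
    """Assumed reading of "41s": True if the last four values all sit
    more than 1 SD above the mean, or all more than 1 SD below it."""
    last = values[-4:]
    if len(last) < 4:
        return False
    return (all(v > mean + sd for v in last)
            or all(v < mean - sd for v in last))

# Hypothetical dispatch table: filter "value" -> rule function.
PATTERN_RULES = {"41s": check_4_1s}

rule = PATTERN_RULES["41s"]
assert rule([52, 52, 53, 52], mean=50, sd=1)      # four high outliers
assert not rule([52, 49, 53, 52], mean=50, sd=1)  # run is broken
```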