COVID-19-electronic-health-system / Corona-tracker

An easy-to-use PWA to monitor the user's wellness and learn about COVID-19.
https://coronatracker.me/
MIT License

[FEAT] Add a Graph to visualize trends #683

Open tesla809 opened 4 years ago

tesla809 commented 4 years ago


Prerequisites

Summary

Based on ROADMAP-TO-MVP

Why is a graph useful? Being able to see the trend of temperature and heart rate over time is SUPER useful for the doctor.

Task for graph: Add a graph or some easy visual that shows trends of temperature and heart rate over time.

The trend of these two measurements is SUPER important to assess the possibility of being affected by COVID, based on @DocMusher#9988's experience.

Make it look CLEAN and LOGICAL since this is an ESSENTIAL feature that the doctor and the user will interact with.

Use the best library you see fit.

The trend is your friend.

Motivation

Why are we doing this? Being able to see the trend of temperature and heart rate change over time is SUPER useful for the doctor.

What use cases does it support? Hitting specifications for MVP to be usable in the field by Belgian doctor DocMusher. See: ROADMAP-TO-MVP

What is the expected outcome? The user and the doctor can see the trends of pulse and temperature over time in a clear and intuitive way. Make it look CLEAN and LOGICAL since this is an ESSENTIAL feature that the doctor and the user will interact with.

Possible Alternatives

None, this is a core feature specified by the first field user to reach MVP state.

Additional Context

Based on ROADMAP-TO-MVP

AdhamAH commented 4 years ago

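If I understand the issue right, we already have this for both TEMPERATURE and Behavioral. We can add it with TEMPERATURE or make it separate.

[image: Screenshot 2020-05-01 at 13 13 04] https://user-images.githubusercontent.com/55054963/80801765-ab6ef480-8bad-11ea-8c21-86c03e3a872a.png

[image: Screenshot 2020-05-01 at 13 13 10] https://user-images.githubusercontent.com/55054963/80801774-af9b1200-8bad-11ea-80c0-58dea73fdbed.png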

SvenVanPoucke commented 4 years ago

Looking nice! Just remember that all subjective parameters are subjective. As a guidance tool to help people in the context of this pandemic, they are important from a psychological perspective. As I see it, we need to provide a guide that assures people there is no need to worry medically, based on temperature and heart rate trends that can be shown to the doctor, like a glucose diary for diabetic patients. Do not forget that how people feel are emotions, which are related to general health but not always, mostly with an unknown nonlinear correlation and huge interindividual variability. That is much less the case for temperature and heart rate observed over periods of days or weeks.

Sven


SomeMoosery commented 4 years ago

I feel like we should add some more detail about action items to take with this story, otherwise it's difficult to deduce what needs to be completed and when this issue is "done" / ready for a PR. Otherwise we should probably close this and make it a bit more pointed.

SvenVanPoucke commented 4 years ago

FYI

---------- Forwarded message ---------
From: Data Science Briefings briefings@dataminingapps.com
Date: Mon, 18 May 2020 at 10:49
Subject: Data Science Briefings #116: RFM Analysis Revisited
To: svanpoucke@gmail.com

Data Science Briefings

Updates on the latest news, trends, techniques, tools and our research in data mining and analytics

Editorial

Dear fellow data scientists, analytics lovers, and friends,

We hope you and yours are remaining safe and sound.

The RFM framework has been popular since its introduction by Cullinan in 1977. It’s a well-known and well-developed measurement framework used in marketing across different industries such as banking, insurance, telco, non-profit, travel, online retail, and even government. It consists of a set of metrics to monitor customers’ behaviour so as to develop suitable customer relationship management (CRM) strategies. In this issue’s feature article, we revisit this popular framework and take a look at how you can use it in machine learning.

We hope you enjoy this issue of Data Science Briefings. We always like to receive feedback as well as suggestions or contributions. Just reply to this e-mail if you wish to provide feedback, or hit us up on Twitter @DataMiningApps. Feel free to pass on this newsletter to friends or colleagues; they can subscribe for free as well through our subscribe page. We also keep an online record of our feature articles, QA’s, and web picks over at www.dataminingapps.com, too, in case you want to catch up on previous issues.

Kindest regards,
Prof. dr. Bart Baesens
Prof. dr. Seppe vanden Broucke

RFM Analysis Revisited

Contributed by: Bart Baesens, Seppe vanden Broucke

This article is based on our BlueCourses course Customer Lifetime Value Modeling.

Although RFM analysis is sometimes referred to as a poor man’s approach to CLV analysis, we think it’s a very good way to start doing customer lifetime value (CLV) modeling. As always in machine learning, being simple does not necessarily make an approach bad. The RFM framework has been popular since its introduction by Cullinan in 1977. It’s a well-known and well-developed measurement framework used in marketing across different industries such as banking, insurance, telco, non-profit, travel, online retail, and even government. It consists of a set of metrics to monitor customers’ behaviour so as to develop suitable customer relationship management (CRM) strategies. Do note that the RFM framework focusses on existing customers, instead of prospects.

Basically, the RFM framework summarizes the purchasing habits of consumers according to only 3 dimensions. It essentially builds upon the Pareto principle, which states “For many events, roughly 80% of the effects come from 20% of the causes”. In our case, the events correspond to purchases and the causes to customers. Or, translated to an RFM setting: 20% of your customers are likely to generate 80% of your profit. Using the RFM framework, we will try to find out which customers buy or bought recently, frequently and for high monetary values.

Recency measures the time since the most recent purchase transaction. Frequency measures the total number of purchase transactions in the period examined. And finally, monetary measures the value of purchases within the period examined. The combination of these three variables provides a very useful perspective on the value of your current customer portfolio.
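A minimal Python sketch of these three definitions follows. The purchase records, the customer names and the reference date are made up for illustration and do not come from the newsletter. Monetary is taken here as the total purchase value in the period, which is only one of several possible choices.

```python
from collections import defaultdict
from datetime import date

# Hypothetical purchase records: (customer_id, purchase_date, amount).
purchases = [
    ("alice", date(2020, 4, 1), 30.0),
    ("alice", date(2020, 5, 10), 20.0),
    ("bob", date(2020, 3, 15), 100.0),
]


def rfm_summary(records, today):
    """Return {customer: (recency_in_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in records:
        # Recency is based on the most recent purchase only.
        if customer not in last_purchase or day > last_purchase[customer]:
            last_purchase[customer] = day
        frequency[customer] += 1      # number of purchase transactions
        monetary[customer] += amount  # total value of purchases
    return {
        c: ((today - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }


summary = rfm_summary(purchases, today=date(2020, 5, 18))
print(summary["alice"])  # (8, 2, 50.0)
```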

Let’s now zoom in on each of the variables of the RFM framework into some more detail.

We start with recency. As with any RFM variable, it can be operationalized in various ways. One example is: how long ago did the customer make a purchase? This results in a continuous variable. As an alternative, we can measure it in a binary way: did the customer make a purchase during the previous day, week, month or year? Finally, we can also define it in an exponential way. More specifically, we can define recency as e^(-γt). Here t is the time interval between two consecutive purchases, and γ is a user-specified parameter which is typically rather small, for example 0.02. Note that by using this procedure, recency is always a number between 0 and 1. The figure below shows that recency indeed decreases when the time interval gets bigger.

[Figure: recency decreasing as the time interval grows]

The parameter γ determines how fast the recency decreases. For larger values of γ, recency will decrease quicker with time and vice versa. But how do you choose γ, you might ask? Well, you could choose γ such that recency has to be equal to 0.01 after 180 days, for example. Then γ is -ln(0.01)/180, or roughly 0.0256.
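As a small sketch of this calculation in Python: solving e^(-γt) = target for γ gives γ = -ln(target)/t, using the natural logarithm. The function names are chosen for illustration only.

```python
import math


def gamma_for_target(target_recency, days):
    """Solve exp(-gamma * days) == target_recency for gamma."""
    return -math.log(target_recency) / days


def recency(t, gamma):
    """Exponential recency score; t is the time interval in days."""
    return math.exp(-gamma * t)


# Recency should have dropped to 0.01 after 180 days.
gamma = gamma_for_target(0.01, 180)
print(round(gamma, 4))  # 0.0256
```

With this γ, an interval of 0 days gives a recency of 1, and the score shrinks towards 0 as the interval grows.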

Frequency is the second variable of the RFM framework. As said, it measures how frequently the customer buys. A first way of measuring it is by calculating the average number of purchases per unit of time, such as per month over the last year. Some frequency calculations will also take the tenure or lifetime of the customer into account and measure it as the total number of purchases divided by the number of months since the first purchase. According to most research, including the research conducted by myself, the frequency variable is usually the most important of the RFM framework.
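The two frequency measures described above could be sketched as follows. The helper names are hypothetical, and counting tenure as a calendar-month difference with a minimum of one month is an assumption made here to avoid dividing by zero for brand-new customers.

```python
from datetime import date


def months_since(first_purchase, today):
    """Calendar-month difference, counted as at least 1."""
    months = (today.year - first_purchase.year) * 12 + (
        today.month - first_purchase.month
    )
    return max(months, 1)


def frequency_last_year(n_purchases_last_year):
    """Average number of purchases per month over the last year."""
    return n_purchases_last_year / 12


def frequency_over_tenure(n_purchases_total, first_purchase, today):
    """Total purchases divided by the number of months since the first purchase."""
    return n_purchases_total / months_since(first_purchase, today)


print(frequency_last_year(24))  # 2.0
print(frequency_over_tenure(10, date(2019, 12, 18), date(2020, 5, 18)))  # 2.0
```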

The monetary variable can also be operationalized in various ways. It can be calculated as the average, maximum or minimum purchase value during the past year. It can also be measured as the most recent value of a purchase. Or, we can again take into account the tenure of the customer and consider the total lifetime spending. Trends can also be looked at. These features usually turn out to be very predictive in any analytical CLV setting. Trends summarize the historical evolution of a variable in various ways and can be computed in an absolute or a relative way.

When computing trends, it is important to consider what happens if the denominator becomes 0. Recent values can also be assigned a higher weight. Trends can also be featurized using time series techniques, such as ARIMA or GARCH models.
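The trend formulas themselves are not reproduced in this thread. One common formulation, given here as an assumption and not necessarily the authors' exact definition, compares the latest value x_t with the value x_(t-d) observed d periods earlier: the absolute trend is (x_t - x_(t-d)) / d and the relative trend is (x_t - x_(t-d)) / x_(t-d). The sketch below returns None when the relative denominator is 0.

```python
def absolute_trend(values, d):
    """(x_t - x_(t-d)) / d, where x_t is the last element of values."""
    if d <= 0 or d >= len(values):
        raise ValueError("d must be between 1 and len(values) - 1")
    return (values[-1] - values[-1 - d]) / d


def relative_trend(values, d):
    """(x_t - x_(t-d)) / x_(t-d), or None if x_(t-d) is 0."""
    if d <= 0 or d >= len(values):
        raise ValueError("d must be between 1 and len(values) - 1")
    base = values[-1 - d]
    if base == 0:
        return None  # relative change is undefined for a zero baseline
    return (values[-1] - base) / base


# Hypothetical monthly spending, oldest first.
monthly_spend = [100.0, 120.0, 90.0, 150.0]
print(relative_trend(monthly_spend, 3))  # 0.5
```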

Interactions between the RFM variables can also be taken into account. These can be 2-way interactions, 3-way interactions, etc. Obviously, the thing with interactions is that they usually make a predictive model more difficult to interpret. Hence, be very careful when considering them. My practical advice is to only include them when they really add to the predictive performance of your analytical model. Typically, you will also observe correlations between the R, F and M variables. A commonly observed correlation is the one between the frequency and monetary variables. This correlation is not necessarily problematic, but being aware of it is already very important.

We are now ready to start operationalizing the RFM variables such that we can work with them in a meaningful way. The idea here is to create an RFM score which can then be used for customer segmentation, churn prediction or any other CLV related analytical modeling task. Obviously, due to continuously changing customer behavior and the external environment, this RFM score should be updated regularly. Creating an RFM score requires a combination of both analytical skills and business experience. Let’s elaborate a bit further on this.

Here you can see a very simple example of creating an RFM score.

[Figure: a simple example of creating an RFM score]

We basically created 3 bins for each of the RFM variables. The bins are ordinally ordered. Let’s quickly look at some of them. For Frequency, bin 1 contains all customers who did at most 1 transaction during the previous month, bin 2 the customers who did 2 to 5 transactions and bin 3 the customers who did 6 or more transactions. The other variables are binned in a similar way. Let’s say we have a customer who belongs to bin 2 for recency, to bin 1 for frequency and to bin 2 for monetary. We can then summarize this into an RFM score of 2 + 1 + 2 or 5. As said, this procedure can then be used for customer segmentation or to create variables for analytical models.
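The binning and summing steps can be sketched in Python as below. Only the frequency thresholds (at most 1, 2 to 5, 6 or more) come from the text; the recency and monetary thresholds are invented for illustration.

```python
from bisect import bisect_left

# Inclusive upper bounds of bins 1 and 2; larger values fall in bin 3.
FREQUENCY_BOUNDS = [1, 5]      # from the example: <=1, 2-5, >=6 transactions
RECENCY_DAY_BOUNDS = [7, 30]   # hypothetical: days since the last purchase
MONETARY_BOUNDS = [50, 200]    # hypothetical: purchase value


def ordinal_bin(value, upper_bounds):
    """Return the 1-based bin index for ascending, inclusive upper bounds."""
    return 1 + bisect_left(upper_bounds, value)


def rfm_score(days_since_purchase, n_transactions, amount):
    """Return (R bin, F bin, M bin, summed RFM score)."""
    # Fewer days since the last purchase means a higher recency bin.
    r = 4 - ordinal_bin(days_since_purchase, RECENCY_DAY_BOUNDS)
    f = ordinal_bin(n_transactions, FREQUENCY_BOUNDS)
    m = ordinal_bin(amount, MONETARY_BOUNDS)
    return r, f, m, r + f + m


# The customer from the text: bin 2 for recency, 1 for frequency, 2 for monetary.
print(rfm_score(20, 1, 120.0))  # (2, 1, 2, 5)
```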


A common way of creating RFM bins is by creating quintiles. This can be done using either independent or dependent sorting. Let’s start with independent sorting. In this case, we sort the data by recency and create 5 quintiles, labelled R1, R2, up to R5. The quintile R1 then represents the 20% of customers who bought least recently. We then sort by frequency and also create 5 quintiles, where F1 represents the customers who buy least frequently. Finally, we do the same for the monetary variable: sort and create 5 quintiles, M1 up to M5, with M1 representing the lowest average spenders. The final RFM score can then be used as a cluster indicator or even as a predictor for an analytical model. Note that the best customers are typically assumed to be in quintile 5 for each RFM variable, or cluster 555. These represent the customers that have purchased most recently, most frequently and have spent the most money.

Dependent sorting works in a strictly sequential way. It starts with the Recency variable and creates the 5 quintiles. Each Recency quintile is then further binned into 5 Frequency quintiles. Each resulting RF bin is then further binned into 5 quintiles based on the Monetary variable. As with independent sorting, the final RFM score can be used as a cluster indicator or variable in a predictive analytical CLV model. Note that scientifically, to the best of my knowledge, it is not possible to state which one is best: independent or dependent sorting.
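To make the difference concrete, here is a minimal pure-Python sketch of both sorting schemes. It assumes each customer maps to a tuple (recency, frequency, monetary) in which higher values are better, for example the exponential recency score. The rank-based quintile rule and the synthetic data are illustrative choices, not taken from the newsletter.

```python
def quintiles(ids, key):
    """Label ids 1..5 by ascending key, about 20% of the ids per label."""
    ordered = sorted(ids, key=key)
    n = len(ordered)
    return {cid: 1 + (rank * 5) // n for rank, cid in enumerate(ordered)}


def independent_rfm(customers):
    """Sort on R, F and M separately, each over all customers."""
    ids = list(customers)
    r = quintiles(ids, key=lambda c: customers[c][0])
    f = quintiles(ids, key=lambda c: customers[c][1])
    m = quintiles(ids, key=lambda c: customers[c][2])
    return {c: (r[c], f[c], m[c]) for c in ids}


def dependent_rfm(customers):
    """Bin on R first, then on F within each R bin, then on M within each RF bin."""
    ids = list(customers)
    r = quintiles(ids, key=lambda c: customers[c][0])
    f, m = {}, {}
    for rq in range(1, 6):
        r_group = [c for c in ids if r[c] == rq]
        f.update(quintiles(r_group, key=lambda c: customers[c][1]))
        for fq in range(1, 6):
            rf_group = [c for c in r_group if f[c] == fq]
            m.update(quintiles(rf_group, key=lambda c: customers[c][2]))
    return {c: (r[c], f[c], m[c]) for c in ids}


# Synthetic data: 125 customers whose three variables all equal their id.
demo = {i: (i, i, i) for i in range(125)}
print(independent_rfm(demo)[7])  # (1, 1, 1)
print(dependent_rfm(demo)[7])    # (1, 2, 3)
```

In this synthetic example customer 7 sits in the lowest quintile on every variable under independent sorting, but lands in higher F and M bins under dependent sorting because those bins are relative to the customer's own R and RF groups.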


The RFM variables can be used as input variables for various analytical CLV models such as churn prediction, response modeling, customer segmentation and obviously also CLV analytical models. The table below illustrates a churn prediction data set which combines the RFM variables with other customer-specific information such as age, marital status, etc.

[Table: churn prediction data set combining the RFM variables with other customer-specific information]

Let’s now zoom out of the original marketing context and do some out-of-the-box thinking. Essentially, recency quantifies how recently an event occurred, frequency how often events occur, and monetary the impact, intensity or reach of an event. Defining the RFM variables in this more general way opens up perspectives for their use in other settings. The RFM variables are commonly used in fraud analytics. Think about credit card fraud as an example. Here, R can refer to the recency of a transaction, F to the frequency and M to the monetary value. In web analytics, recency can represent how recently a web site was visited, frequency how often it is visited, and the monetary variable the duration of the visit. In a social media setting, we can look at the recency of a post, the frequency of posts and the community size that is reached with the post, such as followers, retweets, shares, etc.

Let’s conclude the discussion of the RFM framework with some closing thoughts. A key advantage of the RFM framework is that it is simple and easy to understand and calculate. It provides a compact and powerful representation of customer behavior. We consider it to be an ideal approach to build your first CLV models. Remember, when doing analytics it is always wise to start off simple and then gradually make your models more sophisticated.


-- Sven Van Poucke, MD, PhD
Ziekenhuis Oost-Limburg
Schiepse Bos 6
3600 Genk