okfn-brasil / serenata-notebooks

Notebooks from Operação Serenata de Amor | ** This repository is not updated frequently **
MIT License

Twitter API suspension #26

Open g4brielvs opened 4 years ago

g4brielvs commented 4 years ago

What is the purpose of this Pull Request?

This is a first-pass analysis (take 1) to start a conversation about how the Twitter API suspension might have affected Rosie's level of engagement.

What was done to achieve this purpose?

I used time series analysis, particularly an autoregressive model.
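
For readers unfamiliar with the term, here is a minimal sketch of what fitting an autoregressive model involves. It is not the code from the notebook: the function names `fit_ar1` and `forecast` and the engagement numbers are hypothetical, and the sketch covers only the simplest AR(1) case, fitted by ordinary least squares.

```python
def fit_ar1(series):
    """Estimate c and phi in x[t] = c + phi * x[t-1] + noise by ordinary least squares."""
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def forecast(series, c, phi, steps):
    """Roll the fitted recurrence forward `steps` times from the last observation."""
    last = series[-1]
    predictions = []
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Hypothetical, noise-free engagement counts generated by x[t] = 10 + 0.5 * x[t-1].
engagement = [100.0]
for _ in range(9):
    engagement.append(10 + 0.5 * engagement[-1])

c, phi = fit_ar1(engagement)
next_value = forecast(engagement, c, phi, steps=1)[0]
```

Because the toy series has no noise, the fit recovers the generating coefficients. Real engagement data would be noisy, and a library implementation would usually be preferable to this hand-rolled version.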

How to test if it really works?

An overview of the methodology would be a good start.

Who can help reviewing it?

@cuducos @jtemporal

TODO

jtemporal commented 4 years ago

Question: Can we reliably use the Quadratic regression on the before 2018 data? I ask this because we have a huge gap in information (which I think is due to Rosie's sabbatical).

Other than the data itself, I wonder if this is leading to an inconclusive result since the linear regression shows one thing and the quadratic another. Am I missing some mathematical/statistical concept here?
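
To make the concern concrete, here is a small sketch of how a linear and a quadratic fit to the same points can disagree, especially when extrapolated across a gap. The data are made up for illustration and are not Rosie's numbers; `polyfit` and `predict` are hypothetical helpers written in plain Python, not functions from the notebooks.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest power first, via the normal equations."""
    m = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(m)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(m)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, m):
            factor = rows[r][col] / rows[col][col]
            for k in range(col, m + 1):
                rows[r][k] -= factor * rows[col][k]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(rows[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (rows[i][m] - tail) / rows[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Made-up observations before a gap in the data (periods 0 to 4).
xs = [0, 1, 2, 3, 4]
ys = [9, 4, 1, 0, 1]

linear = polyfit(xs, ys, 1)     # slope is negative: "engagement is falling"
quadratic = polyfit(xs, ys, 2)  # leading term is positive: "engagement rebounds"

# Extrapolating to period 8, on the far side of the gap, widens the disagreement.
linear_at_8 = predict(linear, 8)
quadratic_at_8 = predict(quadratic, 8)
```

Both fits are valid least-squares answers to the same points. The disagreement comes from the choice of model and grows with the distance extrapolated, which is why a gap in the data makes either regression hard to trust on its own.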

g4brielvs commented 4 years ago

@jtemporal thank you for the feedback. I am with you. I pointed out that a polynomial regression might not be the best approach here, especially because we have reason to think that the time series is not a stationary process. That is why I used an autoregressive integrated moving average (ARIMA) model instead.
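
The "integrated" part of ARIMA refers to differencing the series before fitting, which is a common way to handle a trend that makes a series non-stationary. A minimal sketch, assuming a made-up series with a steady downward trend (the `difference` helper and the numbers are illustrative, not taken from the notebook):

```python
def difference(series, order=1):
    """Apply first differencing `order` times: d[t] = x[t] - x[t-1]."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


# Hypothetical series with a constant downward trend.
trending = [120, 110, 100, 90, 80]

once = difference(trending)      # the trend becomes a constant step
twice = difference(trending, 2)  # a second pass removes the constant as well
```

Each pass shortens the series by one observation. In an ARIMA(p, d, q) model, the order `d` is the number of such passes applied before the autoregressive and moving-average terms are fitted.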

Have you had the chance to look at my notebook?

cuducos commented 4 years ago

Can we reliably use the Quadratic regression on the before 2018 data?

Probably not, but that was my naive approach just to get started. Since the mathematician who really adds value to the analysis is @g4brielvs, how about we git rm my notebooks (which were merely warm-ups for his analysis)? We can always check out my commits to see what I've tried.

jtemporal commented 4 years ago

Hi @g4brielvs, I just started looking at your notebook. Below I'll write down some changes I think would be good to have:

That is why I used an autoregressive integrated moving average (ARIMA) model instead.

I like that <3 I think it is a better approach to the matter at hand.

Note that the negative trend apparently started before the block and has been accentuated after it. ... Between Feb/2019 and Apr/2019, right after the block, the slope has a larger negative value and then gradually stabilizes, but the trend remains negative.

<3 null hypothesis validated: block = bad

g4brielvs commented 4 years ago

@jtemporal thank you! I made those changes

g4brielvs commented 3 years ago

@jtemporal Hey! I just wanted to check if this PR is still relevant. If more changes are needed, I'd be happy to work on those.

jtemporal commented 3 years ago

Hi @g4brielvs I think we need to check with @sergiomario on this 😉