jasp-stats / jasp-issues

This repository is solely meant for reporting of bugs, feature requests and other issues in JASP.

[Bug]: Time Series Analysis Stationarity tests #2803

Open wooyoung32 opened 3 months ago

wooyoung32 commented 3 months ago

JASP Version

0.18.3

Commit ID

No response

JASP Module

Time Series

What analysis are you seeing the problem on?

Stationarity tests

What OS are you seeing the problem on?

macOS Intel

Bug Description

The results of the stationarity test from JASP appear to differ from the results obtained from Python.

Screenshot 2024-06-21 at 1 59 37 PM

Expected Behaviour

Screenshot 2024-06-21 at 2 19 12 PM Screenshot 2024-06-21 at 2 19 19 PM Screenshot 2024-06-21 at 2 19 26 PM

These are the results from Python with the same data set.

Steps to Reproduce

  1. Go to the Time Series tab
  2. Run the Stationarity tests on Pennsylvania monthly offences data_2013.csv

...

Log (if any)

No response


tomtomme commented 3 months ago

@wooyoung32 thx for the report. I can confirm the results you got with JASP for version 0.18.3 and the current 0.19 beta. The Python results may differ because of a different "truncation lag parameter"? For JASP this is between 4 and 5, while the Python output shows "lags" of 6 to 13. Just an observation. I have no clue regarding this analysis, and maybe truncation lag and lag are something completely different.

But if I am correct, then please do not be offended by the following question: In JASP I can alter the truncation lag by filtering the data (shorter data = shorter truncation lag). Are you sure the data processed by python was not a longer time span then?

wooyoung32 commented 2 months ago

@tomtomme I've been using the same dataset for both JASP and Python, but I've noticed that the two programs display different test statistics and p-values for all three tests.

When using Python, "lags" refers to the number of lags needed to achieve stationarity through differencing. However, I'm not sure if "lags" means the same thing in JASP, as I couldn't find a detailed explanation in the program. If it has a different meaning in JASP, I would appreciate it if you could clarify it for me.

tomtomme commented 2 months ago

@sophieberkhout Can you clarify the meaning of "lags" in JASP and how the different results may have occurred?

sophieberkhout commented 2 months ago

It seems that R and Python use different default arguments for these functions.

For example, for the ADF test, JASP uses the R function adf.test from the tseries package. The truncation lag parameter represents the number of lags used in the regression and the default value is computed from a suggested upper bound based on the length of the series; you can find the documentation here: https://www.rdocumentation.org/packages/tseries/versions/0.10-56/topics/adf.test.
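To make that default concrete: the linked adf.test documentation gives it as k = trunc((length(x)-1)^(1/3)). Below is a minimal Python sketch of that formula; the function name `default_truncation_lag` is made up for illustration and is not part of JASP, tseries, or any Python package.

```python
import math


def default_truncation_lag(n_obs: int) -> int:
    """Default lag order as documented for tseries::adf.test: trunc((n - 1)^(1/3))."""
    if n_obs < 2:
        raise ValueError("need at least two observations")
    return math.trunc((n_obs - 1) ** (1 / 3))


# Illustrative series lengths only, not the length of the data set in this issue.
for n in (100, 150, 1000):
    print(n, default_truncation_lag(n))
```

Under this formula a series of 100 observations gives a lag of 4 and one of 150 gives a lag of 5, so the default only changes when the series gets noticeably longer or shorter.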

Python uses a different method to obtain the number of lags. Python should provide similar results once you set the maxlag argument to 5, the autolag argument to None, and the regression argument to "ct" (see also the answer in this discussion: https://stackoverflow.com/questions/78177755/why-augmented-dickey-fuller-results-are-different-in-r-and-python).

I could add an option to manually pick the truncation lag parameter, although I am not sure how desirable that is as both packages seem to use an algorithm for this.

EJWagenmakers commented 2 months ago

Would be good to add this information to the help file...

wooyoung32 commented 1 month ago

@sophieberkhout It's not only the number of lags: JASP and Python also gave me completely different test results (p-values, test statistics). Even though the graph of the data shows that it's not stationary, JASP's results indicate that it is stationary. Could you please take a look at this? Thanks!