davidcarslaw / openair

Tools for air quality data analysis
https://davidcarslaw.github.io/openair/
GNU General Public License v2.0

Theil-Sen Trend Analysis #378

Open morshedahmed17 opened 9 months ago

morshedahmed17 commented 9 months ago

Question

I removed some months from the dataset before running the Theil-Sen trend analysis because I do not want them shown in the plot. However, points for those months still appear in the plot, although not exactly where they were before. Is it possible to remove them completely? Also, is it possible to remove the line joining the points?

jack-davison commented 9 months ago

Hi,

We'd be able to best help you if you can provide a reproducible example.

TheilSen() should only plot the data you pass to it, e.g.:

library(openair)

alldata <- mydata

TheilSen(alldata, "nox")
#> Taking bootstrap samples. Please wait.


lessdata <- selectByDate(mydata, year = 2000:2002)

TheilSen(lessdata, "nox")
#> Taking bootstrap samples. Please wait.

Created on 2024-02-21 with reprex v2.0.2

More than happy to help if you can provide more info!

Jack

morshedahmed17 commented 9 months ago

Hello Jack,

Here is an example. I removed the entire 2012 ethane data. However, the monthly values are still showing in the plot.

[image: Theil-Sen trend plot for ethane, with monthly values still shown for 2012]

mooibroekd commented 9 months ago

When using the "deseason" option, missing data are imputed using a Kalman filter and a Kalman smoother before the loess-based deseasonalising is done.

In other words, that is probably why the removed data still appear in the plot. The quickest way to check is to set "deseason" to FALSE.
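
A minimal sketch of that check, using the `mydata` example dataset that ships with openair rather than the ethane data from this issue. The choice of 2002 as the year to drop and the name `lessdata` are only for illustration, and the comments describe what the explanation above would lead one to expect, not a verified result:

```r
library(openair)

# Drop one whole year from the bundled example data
# (2002 is an arbitrary choice for this illustration).
lessdata <- subset(mydata, format(date, "%Y") != "2002")

# With deseasonalising on, gaps are imputed first, so points may
# still be drawn for the months that were removed.
TheilSen(lessdata, pollutant = "nox", deseason = TRUE)

# With deseasonalising off, no imputation step runs, so the removed
# months should be expected to stay empty in the plot.
TheilSen(lessdata, pollutant = "nox", deseason = FALSE)
```

Comparing the two plots side by side should show whether the imputation is what brings the removed months back.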

morshedahmed17 commented 8 months ago

You are right. Just one quick question.

"If I set 'deseason = FALSE', will the resulting trendline simply be a regression line?"

mooibroekd commented 7 months ago

No, the output will still be a Theil-Sen analysis, but on data that has not been deseasonalized (meaning actual measurements).
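
To illustrate the difference from an ordinary regression line, here is a small base-R sketch. It is not openair's implementation (which also bootstraps uncertainty intervals, as the "Taking bootstrap samples" message suggests); the helper `theil_sen_slope()` and the toy data are hypothetical. It only shows the core idea that the Theil-Sen slope is the median of the slopes between all pairs of points, which makes it far less sensitive to outliers than a least-squares fit:

```r
# Hypothetical helper: Theil-Sen slope as the median of all pairwise slopes.
theil_sen_slope <- function(x, y) {
  pairs <- combn(length(x), 2)          # 2 x n_pairs matrix of index pairs
  dx <- x[pairs[2, ]] - x[pairs[1, ]]
  dy <- y[pairs[2, ]] - y[pairs[1, ]]
  keep <- dx != 0                       # skip pairs with identical x
  median(dy[keep] / dx[keep])
}

# Toy data: a straight line with slope 2, plus one large outlier.
x <- 1:10
y <- 2 * x + 1
y[10] <- 100

ts_slope  <- theil_sen_slope(x, y)            # median of pairwise slopes
ols_slope <- unname(coef(lm(y ~ x))[2])       # least-squares slope

print(c(theil_sen = ts_slope, least_squares = ols_slope))
```

On this toy data the single outlier pulls the least-squares slope well above 2, while the Theil-Sen slope stays at the slope of the underlying line.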