davidcarslaw / openair

Tools for air quality data analysis
https://davidcarslaw.github.io/openair/
GNU General Public License v2.0

Missing dates detected, removing lines #168

Closed Jaip2018 closed 1 year ago

Jaip2018 commented 6 years ago

My data is a time series of ozone concentrations recorded every minute during 2017. I would like to average it to hourly values, but roughly 22,345 records are dropped as missing. The data file is attached (O3_TS.xlsx). Please let me know how to solve this.

getwd()
setwd("C:/Users/IGP CARE/Documents")
JP_O3 <- read.csv("O3_TS.csv")
library(readr)
View(JP_O3)
library(openair)
class(JP_O3)
colnames(JP_O3)
colnames(JP_O3$Date)
colnames(JP_O3)
JP_O3$Date <- as.POSIXct(JP_O3$Date)
JP_O3$Date <- strptime(JP_O3$Date,format="%d/%m/%Y %H:%M")
library(openair)
timeAverage(JP_O3,avg.time = "hour")

masabhathini commented 6 years ago

Hi,

you can try this,

library("openair")
sat <- read.csv(file="O3_TS.csv",header=TRUE,sep=",")
sat$date <- as.POSIXct(strptime(sat$Date, format="%m/%d/%Y %H:%M",tz="GMT"))
sat <- subset(sat,O3 > 0 )
ss <- timeAverage(sat,avg.time="hour")
write.csv(ss,file="hourAveO3_TS.csv",row.names=FALSE)
quit()

regards, Sateesh
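A note on the difference between the two snippets above: the first parses the Date column as day/month/year, the second as month/day/year. When a format string does not match the text, strptime returns NA for that row, and rows with NA dates are what the "Missing dates detected, removing lines" message in the issue title refers to. The sketch below uses base R only and an invented three-row sample (the values are hypothetical, not taken from O3_TS.csv) to show how one might count unparsed dates before calling timeAverage.

```r
# Hypothetical sample imitating the "Date" column of O3_TS.csv.
# The third row is deliberately written day/month/year.
raw <- data.frame(
  Date = c("01/13/2017 00:00", "01/13/2017 00:01", "13/01/2017 00:02"),
  O3   = c(21.4, 22.0, 23.1),
  stringsAsFactors = FALSE
)

# Parse as month/day/year; rows that do not match the format become NA.
raw$date <- as.POSIXct(strptime(raw$Date, format = "%m/%d/%Y %H:%M", tz = "GMT"))

# Count the rows whose date could not be parsed.
n_bad <- sum(is.na(raw$date))
print(n_bad)

# Keep only the rows with a valid date.
clean <- raw[!is.na(raw$date), ]
```

If n_bad is large on the real file, the format string is probably the wrong way round for that file, and swapping %d and %m is worth trying.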

Jaip2018 commented 6 years ago

Dear Sateesh,

The rest of the code above works for me. When I run the line ss <- timeAverage(sat,avg.time="hour"), there is an error:

Error in checkPrep(mydata, vars, type = "default", remove.calm = FALSE, :

davidcarslaw commented 6 years ago

If you can send the data to me directly, I can take a look.

masabhathini commented 6 years ago

Dear Jai Prakash,

Here, I attached the result file also. hourAveO3_TS.csv.TXT O3_TS.csv.TXT

library("openair")
sat <- read.csv(file="O3_TS.csv.TXT",header=TRUE,sep=",")
sat$date <- as.POSIXct(strptime(sat$Date, format="%m/%d/%Y %H:%M",tz="GMT"))
sat <- subset(sat,O3 > 0 )
ss <- timeAverage(sat,avg.time="hour")
write.csv(ss,file="hourAveO3_TS.csv.TXT",row.names=FALSE)
quit()

Jaip2018 commented 6 years ago

Dear Sateesh

Thanks, the previous code was right. The mistake was on my side.

Please let me know whether O3_TS.csv has to be placed inside the openair library, or whether we can keep the data on the C: drive and use getwd() and setwd().

I would like to plot the minute data on the primary axis and the hourly data on the secondary axis. Is there an option for this in timePlot in the openair library?

masabhathini commented 6 years ago

Dear Jai Prakash, O3_TS.csv is your measurement data. You can put it anywhere on your local PC and give the full path in the code. Coming to your question, timePlot can plot the hourly and the minute time series individually. openair has a lot of functions for air quality analysis (a special thanks to David). You can list them with:

ls("package:openair")

cheers :)
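For readers who want the minute and hourly series side by side, here is a minimal sketch of one way to build a data frame that holds both. It uses base R as a stand-in so that it runs without openair installed (in the thread, the hourly values come from timeAverage), and the two-hour sample is invented for illustration. The column names hour and O3_hour are assumptions introduced here.

```r
# Hypothetical minute data: two hours, constant within each hour.
minute <- data.frame(
  date = seq(as.POSIXct("2017-01-01 00:00", tz = "GMT"),
             by = "min", length.out = 120),
  O3   = rep(c(10, 30), each = 60)
)

# Label every minute with the start of the hour it belongs to.
minute$hour <- as.POSIXct(trunc(minute$date, "hours"))

# Hourly means (base R stand-in for the averaging step).
hourly <- aggregate(O3 ~ hour, data = minute, FUN = mean)

# Attach the hourly mean to each minute row so both series share one time axis.
minute$O3_hour <- hourly$O3[match(minute$hour, hourly$hour)]

# Overlay the two series on one axis when run interactively.
if (interactive()) {
  plot(minute$date, minute$O3, type = "l", xlab = "date", ylab = "O3")
  lines(minute$date, minute$O3_hour, col = "red", type = "s")
}
```

This overlays both series on a single axis rather than using a secondary axis. Whether timePlot offers a true secondary axis is not confirmed anywhere in this thread.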

Jaip2018 commented 6 years ago

Dear Sateesh, in the previous file there were some O3 values > 1000, so I removed them. Now, after running the same code in R, there is an error at the O3 > 0 step.

masabhathini commented 6 years ago

Dear Jai,

You can apply a condition like this.

library("openair")
sat <- read.csv(file="O3_TS.csv.TXT",header=TRUE,sep=",")
sat$date <- as.POSIXct(strptime(sat$Date, format="%m/%d/%Y %H:%M",tz="GMT"))
sat <- subset(sat,O3 > 0 & O3 <1000)
ss <- timeAverage(sat,avg.time="hour")
write.csv(ss,file="hourAveO3_TS.csv.TXT",row.names=FALSE)
quit()

regards,
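To make the effect of that range condition concrete, the sketch below applies it to a small invented vector (the values are hypothetical). It also shows an alternative that is not from the thread: setting implausible values to NA instead of dropping the rows, which keeps the time stamps in place.

```r
# Hypothetical O3 readings, including negative, zero, missing and very large values.
obs <- data.frame(O3 = c(-5, 0, 42.5, NA, 999.9, 1500))

# The condition from the reply above: subset() drops rows where the test
# is FALSE and also rows where it is NA.
kept <- subset(obs, O3 > 0 & O3 < 1000)
print(kept$O3)

# Alternative (an assumption, not the thread's method): keep every row
# but blank out the implausible values.
flagged <- obs
flagged$O3[which(flagged$O3 <= 0 | flagged$O3 >= 1000)] <- NA
print(sum(is.na(flagged$O3)))
```

The comparison only works on a numeric column. If O3 was read in as text, for example after hand-editing the file, it would need converting with as.numeric first. That is one possible cause of the error reported above, not a confirmed one.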

Jaip2018 commented 6 years ago

Thanks a lot.

Jaip2018 commented 5 years ago

Hi David and Sateesh, I am trying to plot trajCluster with type "season" as follows:

trajCluster(traj, method = "Euclid", n.cluster = 4, plot = TRUE, type = "season",
  cols = c("yellow", "green", "blue", "red"), split.after = FALSE,
  map.fill = FALSE, map.cols = "transparent", map.alpha = 1,
  projection = "mercator", parameters = NULL, orientation = c(90,0,0),
  by.type = TRUE, origin = TRUE, lwd=4, grid.col="grey", grid.alpha=0.4,
  font.label= c(12, "bold", "red"), par.settings=list(fontsize=list(text=16)))

I am not getting the same plot as the one you plotted above. Please let me know what I should do.

Ayanda91 commented 4 years ago

Dear Sateesh,

I am struggling with the openair package in RStudio. I am working with SO2 and PM10 data and would like to generate polar plots, time plots, pollution roses, wind roses, etc. Whenever I input my data sets I cannot generate any graphs, and I get various errors such as "some missing dates on lines". Could you kindly assist? I have attached the data sets that I have been using.

Thank you.

Mamelodi 2011-2012 StationsReport.xlsx Mamelodi 2012-2013 StationsReport.xlsx Mamelodi 2013-2014 StationsReport.xlsx Mamelodi 2014-2015 StationsReport.xlsx Mamelodi 2009-2010 StationsReport.xlsx Mamelodi 2015-2016 StationsReport.xlsx Mamelodi 2016-2017 StationReport.xlsx Mamelodi 2017-2018 StationsReport.xlsx Mamelodi 2018-2019 StationsReport.xlsx Mamelodi 2010-2011 StationsReport.xlsx

masabhathini commented 4 years ago

Dear Ayanda,

Your data files currently contain some unwanted rows. I have noticed that rows 1, 2 and 4 and the last few rows have to be removed. My advice is to convert your data files into csv files and remove those rows. Here is my code (it may be useful):

library(openair)
sat_df = data.frame()
listing = list.files('./',pattern='Mamelodi')
header = c('DateTime','SO2','NO2','NO','NOX','O3','CO','PM10','Benzene','Toluen','o-Xylene','WS','WS StdDev','WD','WD StdDev','Amb Temp','Rel Hum','Solar Rad','Amb Pressure','Rain','Int Temp')
for (filename in listing) {
  print(filename)
  sat <- read.csv(filename,header=F,sep=',',na.strings='----')
  df <- data.frame(sat)
  print(dim(df))
  sat_df <- rbind(sat_df,df)
}
names(sat_df)=header
sat_df$date <- as.POSIXct(sat_df$DateTime, format="%d/%m/%Y %H:%M")
summaryPlot(sat_df)

[image]

regards, M. Sateesh
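The unwanted leading rows can also be skipped at read time instead of being deleted by hand. The sketch below is a minimal, self-contained illustration under stated assumptions: it writes two tiny invented files to a temporary folder, with a layout that only imitates the one described above (four non-data rows, then data, with "----" marking missing values), and uses three columns instead of the full header. The file names and the helper read_one are hypothetical. Trailing summary rows, if any, would still need separate handling.

```r
# Build two small hypothetical station files in a temporary folder.
tmp <- tempfile("stations")
dir.create(tmp)
sample_lines <- c(
  "Station Name: Example, Type: Example",  # row 1: metadata
  "",                                       # row 2: unused
  "DateTime,SO2,PM10",                      # row 3: column names
  "dd/mm/yyyy,ppb,ug/m3",                   # row 4: units
  "01/01/2012 00:00,3.1,40",
  "01/01/2012 01:00,----,38"
)
writeLines(sample_lines, file.path(tmp, "Mamelodi_a.csv"))
writeLines(sample_lines, file.path(tmp, "Mamelodi_b.csv"))

# Read one file: skip the four non-data rows and supply the names directly.
read_one <- function(path) {
  read.csv(path, header = FALSE, skip = 4, na.strings = "----",
           col.names = c("DateTime", "SO2", "PM10"),
           stringsAsFactors = FALSE)
}

# Read every matching file and stack the results.
files <- list.files(tmp, pattern = "Mamelodi", full.names = TRUE)
all_data <- do.call(rbind, lapply(files, read_one))

# openair functions expect a date-time column called "date".
all_data$date <- as.POSIXct(all_data$DateTime,
                            format = "%d/%m/%Y %H:%M", tz = "GMT")
str(all_data)
```

Collecting the pieces with lapply and binding them once avoids growing the data frame inside a loop, although for a handful of files either approach is fine.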

Ayanda91 commented 4 years ago

Good Day Sateesh,

Thank you for your assistance.

Kind regards.

Ayanda91 commented 4 years ago

Good Day Sateesh,

Is it possible for me to contact you privately? I am encountering a string of issues and I believe you might be able to assist.

Thanks.

jack-davison commented 1 year ago

Closing stale issue, with note that it is possible to read in data from Excel spreadsheets without going via csv files!

library(readxl)
library(dplyr)
library(tidyr)

path_to_file <- "Mamelodi.2011-2012.StationsReport.xlsx"

# get meta
meta <- 
  read_excel(path_to_file, n_max = 1, col_names = "meta") %>%
  mutate(meta = stringr::str_remove_all(meta, c("Station Name: |Type: |TimeBase: |Yearly: "))) %>%
  separate(meta, sep = ", ", into = c("station_name", "type", "timebase", "yearly"))

# get column headers
headers <- names(read_excel(path_to_file, skip = 2, n_max = 1))

# import data
data <-
  read_excel(
    path_to_file,
    skip = 4,
    na = c("----"),
    col_names = headers
  ) %>%
  rename(date = DateTime) %>%
  mutate(date = lubridate::dmy_hm(date)) %>%
  mutate(meta)

More assistance for some of these more R/dplyr/tidyr-type approaches may be found in the Posit Community Forum or in R for Data Science.