InseadDataAnalytics / INSEADAnalytics


problem with data quality when converting to time series #87

Open jerepow opened 6 years ago

jerepow commented 6 years ago

Hi Guys,

Have a data read issue that I can't seem to solve:

read in file - this works just fine to show the table as I'd like it to

users <- read.csv(file="c:/Users/jerem/Documents/Uni/P3/DS for Bus/Yahoos Acquisition of Tumblr Data.csv", header=TRUE, sep=",")

create the time series - this produces the plot in the picture, which does not reflect the right data

tumblr_ts <- ts(tumblr.data$People,start=c(2010,4), frequency=12)

Any ideas? I've tried removing everything except the 'People' column from the .csv, and selecting the column using square brackets.

image

VarunKShetty commented 6 years ago

Hey @jerepow, can you share the whole script file and point to the line numbers that are causing the problem? I can see the first chunk of code in your comment reads a csv and assigns it to "users", but tumblr_ts is built from a different data frame called "tumblr.data", and I can't see how that was generated.

jerepow commented 6 years ago

Thanks @VarunKShetty, tumblr.data <- users. No manipulation steps were made in between.

VarunKShetty commented 6 years ago

Hi @jerepow, I am afraid I am still unable to wrap my head around the issue. Could you elaborate, and also share your full script with me? -VK

arthurcurrie commented 6 years ago

Jeremy, I have the exact same issue. One thing I noticed is that the 'observed' values are just the index numbers of the data points (1-38). I wonder if it's somehow picking up the order as the value, instead of the People value?
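One possible explanation for seeing 1-38, offered as a sketch rather than a confirmed diagnosis: if the People values are stored in the CSV with thousands separators (for example "1,200"), read.csv cannot parse them as numbers. In R versions before 4.0 such a column becomes a factor by default, and a factor's underlying integer codes are just the positions of its levels. The data below is made up for illustration, and stringsAsFactors = TRUE is set explicitly to mimic the older default.

```r
# Hypothetical data: a "People" column exported with thousands separators.
csv_text <- 'Month,People
1,"1,200"
2,"950"
3,"2,300"'

# stringsAsFactors = TRUE mimics the default of R versions before 4.0.
users <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = TRUE)

# The column is a factor, not a numeric vector.
print(class(users$People))

# The integer codes are level positions, not the actual counts.
print(as.integer(users$People))
```

Running str(users) right after the import is a quick way to check whether a column came in as numeric, character, or factor.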

arthurcurrie commented 6 years ago

Is there some specific way that we need to transfer the Excel data into CSV format? I've just been copy-pasting and saving as CSV.

arthurcurrie commented 6 years ago

ok i found a workaround (though god knows why it works)

rather than saving as CSV in excel, open up the electric rates csv data file and overwrite it with your data. Save that, then import it, and it looks fine

marionroger commented 6 years ago

Hi @jerepow and @arthurcurrie, we had the same issue within our group. It's coming from the format of the numbers in the file. We got rid of it by copying the data and pasting it as values into a new file, which removes the weird comma format. Hope this helps!
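If re-exporting from Excel is inconvenient, the commas could also be stripped on the R side. This is a minimal sketch, assuming the only problem is a comma used as a thousands separator in the People column; the sample data is hypothetical, and the ts() arguments are the ones from the original post.

```r
# Hypothetical data: a "People" column exported with thousands separators.
csv_text <- 'Month,People
1,"1,200"
2,"950"
3,"2,300"'

# Read the column as plain text so nothing is turned into a factor.
users <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

# Drop the commas, then convert the cleaned strings to numbers.
users$People <- as.numeric(gsub(",", "", users$People, fixed = TRUE))

# Build the monthly series starting in April 2010, as in the original post.
tumblr_ts <- ts(users$People, start = c(2010, 4), frequency = 12)
print(tumblr_ts)
```

This would not help if the file uses commas as decimal marks; in that case the dec argument of read.csv is the thing to look at.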

carlocelis commented 6 years ago

I also experienced this. Adding to Marion's comment: in Excel, change the number format from "Number" to "General" before saving, to remove the thousands-separator commas.

image