Open idontgetoutmuch opened 6 years ago
I have downloaded the historical dataset for Old Faithful: http://www.geysertimes.org/archive/geysers/Old_Faithful_eruptions.tsv.gz. I am struggling to understand what some of the columns mean.
In particular, looking at the first row,
eruptionID geyser eruption_time_epoch has_seconds exact ns ie E A wc ini maj min q duration entrant observer eruption_comment time_updated time_entered associated_primaryID other_comments Old_Faithful_Preplay_Time_VEC Old_Faithful_Height_VEC

23132 Old Faithful 10506540 0 1 0 0 0 0 0 0 1 0 0 4min BoekelUpload OFVCL-EV 1335129843 1335129843 23132 NULL NULL NULL
did the dataset really begin at (eruption_time_epoch):
*Main> epochToUTC 10506540
1970-05-02 14:29:00 UTC
and if so, what do time_updated and time_entered mean?
*Main> epochToUTC 1335129843
2012-04-22 21:24:03 UTC
Perhaps the data was collected in 1970 but only added to your excellent site in 2012?
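The definition of epochToUTC is not shown here. For anyone wanting to reproduce the conversions, a minimal sketch of such a helper follows, assuming the epoch columns are plain Unix timestamps (seconds since 1970-01-01 00:00:00 UTC) and using the standard `time` package. The body is an assumption about how the helper could be written, not necessarily the definition used in the session above.

```haskell
module Main where

import Data.Time.Clock (UTCTime)
import Data.Time.Clock.POSIX (posixSecondsToUTCTime)

-- | Interpret an integer as seconds since 1970-01-01 00:00:00 UTC.
-- ASSUMPTION: the dataset's epoch columns are Unix timestamps in seconds.
epochToUTC :: Integer -> UTCTime
epochToUTC = posixSecondsToUTCTime . fromInteger

main :: IO ()
main =
  -- The four epoch values quoted in this issue.
  mapM_ (print . epochToUTC)
    [10506540, 1335129843, 1310155500, 1352243080]
```

Loading this in GHCi gives the same `*Main> epochToUTC ...` interaction as quoted in this issue.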
By row 86360 the consistency(?) seems to have improved
86360 Old Faithful 1310155500 0 1 0 0 0 0 0 0 0 1 0 1m46s BoekelUpload OFVCL-EV (160+ft) 1352243080 1352243080 86360 NULL NULL NULL
So the eruption_time_epoch is
*Main> epochToUTC 1310155500
2011-07-08 20:05:00 UTC
and the time_updated and time_entered are
*Main> epochToUTC 1352243080
2012-11-06 23:04:40 UTC
Also, the number of missing entries for duration seems to have increased over time. Is there any reason for this?
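One way to make the observation about missing durations concrete is to count, per calendar year, how many rows lack a duration. The sketch below is only illustrative. It assumes a missing duration appears as an empty field or the literal NULL, which is a guess about the file's encoding. The names `Row`, `isMissing`, `yearOf` and `missingByYear` are hypothetical, and the third sample row is made up to exercise the missing case.

```haskell
module Main where

import qualified Data.Map.Strict as Map
import Data.Time.Calendar (toGregorian)
import Data.Time.Clock (utctDay)
import Data.Time.Clock.POSIX (posixSecondsToUTCTime)

-- | (eruption_time_epoch, duration field as text) for one row.
type Row = (Integer, String)

-- ASSUMPTION: a missing duration is an empty field or the literal "NULL".
isMissing :: String -> Bool
isMissing s = null s || s == "NULL"

-- | Calendar year (UTC) of a Unix timestamp given in seconds.
yearOf :: Integer -> Integer
yearOf t = y
  where
    (y, _, _) = toGregorian (utctDay (posixSecondsToUTCTime (fromInteger t)))

-- | Per year: (rows with a missing duration, total rows).
missingByYear :: [Row] -> Map.Map Integer (Int, Int)
missingByYear rows =
  Map.fromListWith plus
    [ (yearOf t, (if isMissing dur then 1 else 0, 1)) | (t, dur) <- rows ]
  where
    plus (m1, n1) (m2, n2) = (m1 + m2, n1 + n2)

main :: IO ()
main = print (Map.toList (missingByYear sample))
  where
    -- First two rows use values quoted in this issue; the last is invented.
    sample =
      [ (10506540, "4min")
      , (1310155500, "1m46s")
      , (1310160000, "")
      ]
```

Dividing the first component by the second for each year gives the missing fraction, which can then be compared across years.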
I am trying to validate the dataset that is available in the R programming language (https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/faithful.html), which I am beginning to suspect is not representative of Old Faithful's actual behaviour.
In case you are interested in what the R dataset looks like: the x-axis is duration and the y-axis is the gap between eruptions.