Open jjturner opened 2 years ago
I think something went off the rails a little bit with 1.1. I might have to revert.
I agree with the above counts (I also get 440510) but think this is more than just a row count discrepancy.
If you follow links in the pdf: Ring Dust | Calculating Cassini's Speed and then page down 3 pages you will see some SQL:
select time_stamp,
  x_velocity,
  y_velocity,
  z_velocity,
  sqrt(
    (x_velocity * x_velocity) +
    (y_velocity * y_velocity) +
    (z_velocity * z_velocity)
  )::numeric(10,2) as v_kms
from cda.impacts
where x_velocity <> -99.99;
This produces the data shown on the next page of the pdf, where the first two rows have the following timestamps:
2005-04-04 18:12:57.6-07
2005-04-04 20:32:38.4-07
I don't believe I have this data in my file. I've downloaded the data several times by following the Archives for the Cassini Mission link at the red4 archive both on Windows and Mac. And I've unzipped it with several utilities.
The cda.csv file has 440510 data rows, and that is the count that ends up in my import.cda and cda.impacts tables.
By my calculation, a timestamp with a date of 2005-04-04 should have an impact_event_time in cda.csv that begins with 2005-094, but there is no such text in cda.csv.
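For anyone who wants to repeat this check on their own download, here is a minimal Python sketch of the day-of-year reasoning above. It assumes that cda.csv has a header row containing a column named impact_event_time and that its values start with a YYYY-DDD prefix. The helper names doy_prefix and count_rows_with_prefix are hypothetical, made up for illustration.

```python
import csv
from datetime import date


def doy_prefix(d: date) -> str:
    """Return the 'YYYY-DDD' prefix for a calendar date (DDD = zero-padded day of year)."""
    return d.strftime("%Y-%j")


def count_rows_with_prefix(path: str, prefix: str,
                           column: str = "impact_event_time") -> int:
    """Count CSV rows whose `column` value starts with `prefix`.

    Assumes the file has a header row that names `column` (an assumption,
    not something confirmed in this thread).
    """
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        # row.get() may return None for short rows, so fall back to "".
        return sum(
            1 for row in reader
            if (row.get(column) or "").startswith(prefix)
        )
```

Calling doy_prefix(date(2005, 4, 4)) gives 2005-094, which agrees with the calculation above, and count_rows_with_prefix("cda.csv", "2005-094") then reports how many rows carry that prefix. One caveat: the pdf timestamps are shown with a -07 offset, so if the raw file stores times in UTC (an assumption), 2005-04-04 18:12:57.6-07 would fall on day 095. Checking both prefixes is cheap.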
The earliest data I can find in my downloaded cda.csv is for 2005-01-01, and if I run the following SQL:
with t1 as (
  select time_stamp,
    x_velocity, y_velocity, z_velocity,
    sqrt(
      (x_velocity * x_velocity) +
      (y_velocity * y_velocity) +
      (z_velocity * z_velocity)
    )::numeric(10, 2) as v_kms
  from cda.impacts
  where x_velocity <> -99.99
)
select * from t1
order by time_stamp;
I get data that exactly matches that shown in closed issue #43.
I wonder if someone else could check their download of cda.csv and confirm/deny the presence of data for 2005-04-04?
I would like to be able to get data that matches what is shown in the pdf if I am to carry on with the rest of the tutorial.
I accessed my own data archives and can confirm that I have the same count as both of you for the CDA csv file. I remember that when I was preparing the downloads I was worried about file sizes, so I was going to trim columns and records that weren't needed (the CDA data is gigantic), which evidently is what went into the first release. The second, however, appears to have more records in it.
I'm still trying to figure out what's going on and I will! I normally leave myself exhaustive notes about the choices I made but I can't seem to locate anything for the CDA - mostly because I use the INMS data for the rest of the book.
To be clear: the choice was gigs and gigs of CDA data that we then pare down, or me just clipping and dropping what we need... not an easy choice and now we can see why :).
Stay tuned...
Thanks @robconery appreciate you looking at this.
Yeah I can see that the CDA extract process changed over time, which is a good thing! I don't really want to be pulling down all that RAW data :-)
nice one.
cda.csv initially downloaded when obtaining the book and result of my COPY command:
COPY operation as illustrated in the book: