Open jemrobinson opened 2 years ago
The following (from Slack) might explain why the current number of records per-day is so high.
I’ve noticed that the latest predictions show non-zero sea ice (sic_mean > 0) in every cell of the 432 x 432 grid for every date and leadtime. This feels incorrect, as I’m pretty sure that some of that space is land - can you confirm @
James Byrne
I've not yet been applying the land mask to the outputs, which I do need to do. The predictions in the south are vaguely sensible, the north might be very ropey! Good spot though, I'll sort that out tomorrow! :wink:
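As a rough illustration of why an unmasked grid inflates the record count, here is a minimal sketch of dropping land cells before records are written. The function name `ocean_records`, the row-major `cell_id` numbering and the toy 3 x 3 grid are all assumptions made for this example; the real pipeline and its land mask may look quite different.

```python
from typing import Iterator


def ocean_records(
    sic_mean: list[list[float]],
    land_mask: list[list[bool]],
) -> Iterator[tuple[int, float]]:
    """Yield (cell_id, sic_mean) for every cell that is not land."""
    n_cols = len(sic_mean[0])
    for row, (values, mask_row) in enumerate(zip(sic_mean, land_mask)):
        for col, (value, is_land) in enumerate(zip(values, mask_row)):
            if not is_land:
                # Hypothetical cell numbering: row-major index into the grid.
                yield row * n_cols + col, value


# Toy 3 x 3 grid standing in for the real 432 x 432 one.
sic = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
land = [[True, False, False], [True, True, False], [False, False, False]]

records = list(ocean_records(sic, land))
print(f"{len(records)} of {len(sic) * len(sic[0])} cells would be stored")
```

Every cell flagged as land is skipped, so the number of records per date and leadtime falls by the number of land cells in the mask.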
Recent files show `8286021` records per day for the northern hemisphere and `3094203` for the southern. It looks like there's still an issue with the masking for the northern hemisphere though (see below) so these numbers may come down further.
Since `2022-02-16` the sizes are:
hemisphere | n_records | est size (MB) |
---|---|---|
north | 9070011 | 741.1 |
south | 14261829 | 1165.5 |
Records
In the forecast tables we expect a single record to take:
- `forecast_id` (`serial4`) => 4 bytes
- `data_forecast_generated` (`date`) => 4 bytes
- `data_forecast_for` (`date`) => 4 bytes
- `cell_id` (`int4`) => 4 bytes
- `sea_ice_concentration_mean` (`float4`) => 4 bytes
- `sea_ice_concentration_stddev` (`float4`) => 4 bytes

Total: 24 bytes per record for data

Disk size
However, from recent measurements:
so 100000 records take 8568832 bytes => each record takes 85.68 bytes
Summary
There are around 23M records each day for the northern and southern hemispheres combined. At 85.68 bytes per record this means: 23,000,000 * 85.68 / 1024 / 1024 / 1024 ≈ 1.84 GB per day.
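The arithmetic above can be reproduced with a short script. This is only a sketch: the constant and function names are made up for illustration, the record counts are the ones from the table, and the per-record size is the measured figure quoted above rather than anything queried from the database.

```python
import struct

# Column layout from the list above: serial4, date, date, int4, float4, float4.
# struct's "i" and "f" are both 4 bytes, matching the widths quoted there.
PAYLOAD_BYTES = struct.calcsize("<4i2f")

# Measurement quoted above: 100000 records occupy 8568832 bytes on disk.
BYTES_PER_RECORD = 8568832 / 100000

# Daily record counts from the table (since 2022-02-16).
RECORDS_PER_DAY = {"north": 9070011, "south": 14261829}

MIB = 1024 ** 2
GIB = 1024 ** 3


def daily_bytes(n_records: int) -> float:
    """Estimated on-disk size in bytes for one day's records."""
    return n_records * BYTES_PER_RECORD


total_records = sum(RECORDS_PER_DAY.values())
for hemisphere, n_records in RECORDS_PER_DAY.items():
    print(f"{hemisphere}: {daily_bytes(n_records) / MIB:.1f} MiB per day")
print(f"total: {total_records} records, "
      f"{daily_bytes(total_records) / GIB:.2f} GiB per day")
print(f"on-disk size is {BYTES_PER_RECORD / PAYLOAD_BYTES:.2f}x the raw payload")
```

Note that dividing by 1024 three times gives binary units (GiB), which is what the "GB" figure above corresponds to.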