Update README dataset description

I've reorganized the README to introduce the "Analysis Ready" and "Cloud Optimized" datasets separately, with the expectation that users will be most interested in the former.

I've also updated all dataset descriptions with size and chunking information, generated with the following snippet:

import xarray_beam
import math

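def get_size(x):
  """Scales a byte count to decimal units, e.g. 2.5e9 -> (2.5, 'GB')."""
  for threshold, units in [
      (1e6, 'MB'),
      (1e9, 'GB'),
      (1e12, 'TB'),
      (1e15, 'PB'),
  ]:
    # Use the first unit for which the scaled value stays below 1000.
    if x < threshold * 1000:
      return x / threshold, units
  raise RuntimeError('unhandled size')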

for path in [
    'gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3',
    'gs://gcp-public-data-arco-era5/ar/model-level-1h-0p25deg.zarr-v1',
    'gs://gcp-public-data-arco-era5/co/model-level-wind.zarr-v2',
    'gs://gcp-public-data-arco-era5/co/model-level-moisture.zarr-v2',
    'gs://gcp-public-data-arco-era5/co/single-level-surface.zarr-v2',
    'gs://gcp-public-data-arco-era5/co/single-level-reanalysis.zarr-v2',
    'gs://gcp-public-data-arco-era5/co/single-level-forecast.zarr-v2',
]:
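  # open_zarr returns the lazily loaded dataset and the Zarr store's chunks.
  ds, chunks = xarray_beam.open_zarr(
      path, storage_options=dict(token='anon')
  )
  print()
  print(path)
  # nbytes is the uncompressed in-memory size, not the size on disk.
  size, units = get_size(ds.sel(time=slice('1940', None)).nbytes)
  print(f'Total size (1940-present): {size:.3g} {units}')
  print('Chunks:', chunks)
  # 4 bytes per value, which assumes float32 data.
  size, units = get_size(4 * math.prod(chunks.values()))
  print(f'Chunk size: {size:.3g} {units}')
  print(f'Last time: {ds.indexes["time"][-1]}')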

This currently outputs:

gs://gcp-public-data-arco-era5/ar/full_37-1h-0p25deg-chunk-1.zarr-v3
Total size (1940-present): 2.05 PB
Chunks: {'time': 1, 'latitude': 721, 'longitude': 1440, 'level': 37}
Chunk size: 154 MB
Last time: 2024-03-31 23:00:00

gs://gcp-public-data-arco-era5/ar/model-level-1h-0p25deg.zarr-v1
Total size (1940-present): 5.88 PB
Chunks: {'time': 1, 'hybrid': 18, 'latitude': 721, 'longitude': 1440}
Chunk size: 74.8 MB
Last time: 2024-03-31 23:00:00

gs://gcp-public-data-arco-era5/co/model-level-wind.zarr-v2
Total size (1940-present): 664 TB
Chunks: {'time': 1, 'hybrid': 1, 'values': 410240}
Chunk size: 1.64 MB
Last time: 2024-03-31 23:00:00

gs://gcp-public-data-arco-era5/co/model-level-moisture.zarr-v2
Total size (1940-present): 1.54 PB
Chunks: {'time': 1, 'hybrid': 1, 'values': 542080}
Chunk size: 2.17 MB
Last time: 2024-03-31 23:00:00

gs://gcp-public-data-arco-era5/co/single-level-surface.zarr-v2
Total size (1940-present): 2.42 TB
Chunks: {'time': 1, 'values': 410240}
Chunk size: 1.64 MB
Last time: 2024-03-31 23:00:00

gs://gcp-public-data-arco-era5/co/single-level-reanalysis.zarr-v2
Total size (1940-present): 60.9 TB
Chunks: {'time': 1, 'values': 542080}
Chunk size: 2.17 MB
Last time: 2024-03-31 23:00:00

gs://gcp-public-data-arco-era5/co/single-level-forecast.zarr-v2
Total size (1940-present): 53.2 TB
Chunks: {'time': 1, 'step': 1, 'values': 542080}
Chunk size: 2.17 MB
Last time: 2024-03-31 18:00:00
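
As a sanity check, the reported chunk sizes can be reproduced offline from the chunk shapes alone. The following is a minimal sketch, not part of the README change: `chunk_size_mb` and `BYTES_PER_VALUE` are hypothetical names, and the 4-byte (float32) value size is the same assumption the snippet above makes.

```python
import math

# Assumption: every value is stored as float32, i.e. 4 bytes.
BYTES_PER_VALUE = 4


def chunk_size_mb(chunks):
  """Returns the uncompressed size of one chunk in decimal megabytes."""
  return BYTES_PER_VALUE * math.prod(chunks.values()) / 1e6


# Chunk shapes copied from the output above.
examples = {
    'ar/full_37-1h-0p25deg-chunk-1.zarr-v3': {
        'time': 1, 'latitude': 721, 'longitude': 1440, 'level': 37,
    },
    'ar/model-level-1h-0p25deg.zarr-v1': {
        'time': 1, 'hybrid': 18, 'latitude': 721, 'longitude': 1440,
    },
    'co/model-level-wind.zarr-v2': {
        'time': 1, 'hybrid': 1, 'values': 410240,
    },
    'co/model-level-moisture.zarr-v2': {
        'time': 1, 'hybrid': 1, 'values': 542080,
    },
}

for name, chunks in examples.items():
  print(f'{name}: {chunk_size_mb(chunks):.3g} MB')
```

These values are uncompressed sizes; the bytes actually stored per chunk depend on the compressor configured for each Zarr array.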
